I've been wondering these same thoughts for years. I don't do much work in the neural network subfield, but have done a lot with computer vision, and always found myself wanting more robust physical estimation techniques that didn't require external data.