> "A much bigger component is our innate (evolutionarily acquired) understanding of basic mechanics, simple agent theory, and object recognition. ... they are interpreting the image at a slightly higher level of abstraction by applying some assumptions and heuristics that evolution has "found"."
Of course, and all this is exactly what self-driving AIs are attempting to implement. Things like object recognition and understanding basic physics are already well-solved problems. Higher-level problem-solving and reasoning about / predicting behaviour of the objects you can see is harder, but (presumably) AI will get there some day.
Putting all of these together amounts to building AGI. While I do believe that we will have that one day, I have a very hard time imagining it as the quickest path to self-driving.
Basically, my contention is that vision-only is being touted as the more focused path to self-driving, when in fact vision-only clearly requires at least a big portion of an AGI. I think it's pretty clear that, for now, this makes it an unrealistic path to self-driving, while other paths that use more specialized sensors seem more likely to bear fruit in the near term.