The single largest difference is simply the size of the networks. We are tossing billions of times more processing power at the same problems, and you often see progress by simply changing some constants.
PS: Of course there are also software / algorithm changes, but often the same problem with 1,000,000,000x the processing power just works.