While the 100× is, I think, rather hyperbolic, there is a real and large efficiency difference. It's still economically viable to run the older GPUs because the supply of newer ones is insufficient to meet the demand for compute, so providers can charge enough to cover costs for the old ones and a premium (relative to operating costs) for the newer ones.
It would be economically unviable to run the older ones if the supply of newer ones were unconstrained, but that’s not the world we live in.
Going by the stats on Wikipedia, T4 and B300 both do about one teraflop of half-precision math per watt. Where are the efficiency gains?
Edit: It looks like they replaced INT8 and INT4 with FP8 and FP4, with the same speedups of 2x and 4x relative to FP16. That's an improvement but not that big of an improvement.
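To put rough numbers on that, here is a minimal sketch. It assumes the ~1 TFLOPS/W FP16 figure and the 2x/4x speedups quoted above. Those are ballpark values from this thread, not measurements, and the function name is made up for illustration.

```python
# Approximate FP16 efficiency quoted in the thread (an assumption, not a measurement).
FP16_TFLOPS_PER_WATT = 1.0

# Speedups relative to FP16 as stated above: 8-bit is 2x, 4-bit is 4x.
SPEEDUP_VS_FP16 = {"FP16": 1, "FP8": 2, "FP4": 4}


def tflops_per_watt(precision: str, baseline: float = FP16_TFLOPS_PER_WATT) -> float:
    """Scale the FP16 per-watt baseline by the speedup for the given precision."""
    return baseline * SPEEDUP_VS_FP16[precision]


for precision in SPEEDUP_VS_FP16:
    print(f"{precision}: ~{tflops_per_watt(precision):.0f} TFLOPS/W")
```

Under these assumptions the best case is about 4x per watt from dropping to 4-bit, which is well short of 100x.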
As long as you have customers willing to pay more than it costs you, you are fine. And with AWS there seem to be plenty of those. So the question isn't whether this is the most efficient way, but whether someone will pay a price above what new hardware could attain.
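The break-even logic can be sketched like this. Every number below is hypothetical and only there to show the comparison; the function and parameter names are made up, and real pricing involves more than power and a flat overhead.

```python
def is_viable(price_per_hour: float,
              power_kw: float,
              electricity_per_kwh: float,
              other_costs_per_hour: float = 0.0) -> bool:
    """Return True if the hourly rental price exceeds the hourly operating cost."""
    operating_cost = power_kw * electricity_per_kwh + other_costs_per_hour
    return price_per_hour > operating_cost


# Hypothetical older GPU: 0.07 kW draw, $0.12/kWh, $0.10/h for everything else.
print(is_viable(price_per_hour=0.50, power_kw=0.07,
                electricity_per_kwh=0.12, other_costs_per_hour=0.10))
```

The old card stays worth running for as long as someone pays more than its operating cost, regardless of how much more efficient newer hardware is.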
You can get 100x the output with the same energy use.