How can it be economically viable to still run them? You can get 100x the output...

dragonwriter · 2026-05-25T02:47:35 1779677255

While the 100× is, I think, rather hyperbolic, there is a real and large efficiency difference. It's still economically viable to run them because the supply of newer GPUs is insufficient to meet the demand for compute, so providers can charge enough to cover costs for the old ones and a premium (relative to operating costs) for the newer ones.

It would be economically unviable to run the older ones if the supply of newer ones were unconstrained, but that’s not the world we live in.
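A rough sketch of that argument, assuming the hardware cost is already sunk so only operating cost matters. Every number and name here (`Gpu`, `viable`, the prices, the throughputs) is a made-up illustration, not real AWS or NVIDIA data:

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    throughput: float      # hypothetical compute units delivered per hour
    operating_cost: float  # hypothetical $ per hour (power, cooling, space)


def viable(gpu: Gpu, price_per_unit: float) -> bool:
    """A card is worth running if hourly revenue exceeds hourly operating cost."""
    return price_per_unit * gpu.throughput > gpu.operating_cost


# Illustrative cards: the new one is 10x faster but only 5x costlier to run.
old = Gpu("old", throughput=1.0, operating_cost=0.10)
new = Gpu("new", throughput=10.0, operating_cost=0.50)

# Constrained supply: demand bids the price of a compute unit above
# the old card's operating cost per unit (0.10).
constrained_price = 0.15

# Unconstrained supply: competition pushes the price toward the new
# card's operating cost per unit (0.05), below what the old card needs.
unconstrained_price = 0.06

for label, price in [("constrained", constrained_price),
                     ("unconstrained", unconstrained_price)]:
    print(label, {gpu.name: viable(gpu, price) for gpu in (old, new)})
```

With these invented figures the old card pays for itself only while supply is constrained; once the price of compute falls toward the new card's cost per unit, only the new card stays viable.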

Dylan16807 · 2026-05-25T03:21:46 1779679306

Going by the stats on Wikipedia, T4 and B300 both do about one teraflop of half-precision math per watt? Where are the efficiency gains?

Edit: It looks like they replaced INT8 and INT4 with FP8 and FP4, with the same speedups of 2x and 4x relative to FP16. That's an improvement but not that big of an improvement.
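The arithmetic behind that comparison can be sketched as below. The T4 figures (65 FP16 tensor TFLOPS, 70 W) are assumed from NVIDIA's published spec sheet and should be checked against it; no B300 numbers are filled in, so substitute whatever the current spec lists:

```python
def tflops_per_watt(tflops: float, watts: float) -> float:
    """Peak throughput divided by board power."""
    return tflops / watts


# Assumed T4 figures; verify against the vendor spec sheet.
T4_FP16_TFLOPS = 65.0
T4_WATTS = 70.0

t4_fp16 = tflops_per_watt(T4_FP16_TFLOPS, T4_WATTS)

# Per the comment above, 8-bit and 4-bit formats run at 2x and 4x the
# FP16 rate, so efficiency per watt scales by the same factors.
scaled = {fmt: t4_fp16 * mult
          for fmt, mult in {"16-bit": 1, "8-bit": 2, "4-bit": 4}.items()}

for fmt, value in scaled.items():
    print(f"{fmt}: {value:.2f} tera-ops per watt")
```

Running the same function on a newer card's FP16 rating and power draw gives the like-for-like number being asked about.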

Ekaros · 2026-05-25T07:59:45 1779695985

As long as you have customers who are willing to pay more than it costs you, you are fine. And with AWS, there seem to be plenty of those. So the question isn't whether this is the most efficient way, but whether someone will pay a price above what new hardware could attain.

Marsymars · 2026-05-25T00:27:56 1779668876

Presumably people using AWS are paying more than they cost to run, and AWS has finite bandwidth to upgrade things due to personnel, etc.

jmalicki · 2026-05-25T00:08:14 1779667694

Good question!

Maybe the capabilities of newer GPUs allow AWS to charge higher margins for them? I don't actually know.

HDBaseT · 2026-05-25T02:14:10 1779675250

There has not been a "100x" improvement in efficiency in the past 6-8 years.