> but if we scale to twice the processing power, then start accepting twice the ... | Hacker News

Hacker Timesnew | past | comments | ask | show | jobs | submit

kaashif on June 18, 2024 | parent | context | favorite | on: What happens to latency if service time is cut in ...

> but if we scale to twice the processing power, then start accepting twice the request rate – we will actually be serving each request in half the time we originally did.

People usually add processing power by adding more parallelism - more machines, VMs, pods, whatever. In this case, the "blurted out" answer is correct.

If I take one second to serve a request on a machine then I add another machine and start serving twice the requests, the first machine doesn't get faster.

Maybe what you're saying is true if you make your CPUs twice as fast, but that's not usually possible on a whim.

kqr on June 18, 2024 [–]

It can still be true when scaling horizontally, depending on utilisation levels and other system characteristics, as well as the economics of errors.

Statistically speaking, I more often find the blurted out answer to be further from truth than 1 + rho.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact