This is a very cool page. I love little simulations like this for building intui...

RMarcus · on May 20, 2024

This is my post from 2018 (I didn't submit it to HN), and it could definitely use a "here's what practical systems do" update! I'll put it on the TODO list...

Your point about systems dealing with a relatively small number of large objects vs. small objects also makes sense: this is essentially the "cost" of an overflow (4kb spills once in a blue moon? Oh well, handle that as a special case. 4TB spills once in a blue moon? The system might crash). This is more obvious, as you also point out, in load balancing.

One aspect I found very counter-intuitive: before this investigation, I would've guessed that having a large number of large bins makes overflow increasingly unlikely. This is only partially true: more bins is obviously good, but larger bins are actually more sensitive to changes in load factor!

Overall, I think you are right that this is not really a concern in modern systems today. Compared to Dynamo, I still think Vimeo's solution (linked at the bottom of the post) is both intuitive and low-complexity. But regardless, more of an interesting mathematical diversion than a practical systems concern these days.

anonymousDan · on May 20, 2024

Yeah there's a nice load balancing paper called the power of two choices that shows sending two requests gives surprisingly good results.