HN2new | past | comments | ask | show | jobs | submitlogin

> 1) The linearly-probed robin hood variant is just a sorted array. That's it. It's just a sorted array. I haven't seen an article explain it so succinctly though.

It's not. Importantly, it leaves spaces so that you avoid what would be ~sqrt(n) probe lengths had you tightly packed everything. Also, it is a bit out of order as you wrap around the limits of the array (less of a big deal).

The sortedness property is really useful, btw. A RHH lets you get the perf of a good hash map for random access, and the throughput of a sorted list for merges / bulk processing.

> 3) There's a simple 40 year old bitwise trick by Knuth which allows one to combine the probe distance and the hash into a single value. This saves up to 4 bytes per table entry, but I haven't seen anyone use it.

When the values are sorted (modulo empties and wrapping) you don't need to store the probe length. You just maintain the invariant that things are sorted. Maybe I'm missing some detail of your implementation, but mine sure don't bother storing it.



>When the values are sorted (modulo empties and wrapping) you don't need to store the probe length. You just maintain the invariant that things are sorted. Maybe I'm missing some detail of your implementation, but mine sure don't bother storing it.

Storing the probe count allows the lookup function to have 2 branches per loop as opposed to 3. The probe count also contains enough information to distinguish occupied buckets from unoccupied ones. I'm convinced that combining the probe count with the hash is objectively the best way to implement such tables.


Can I convince you to downgrade to being subjectively convinced?

Storing hashes or probes wasn't faster for the use cases I evaluated. I can see reasons why you might like it, for your uses and perhaps "generally", but not in my case at least (small keys where storage bloat was worse than `fnv` recompute, and values which had invalid states which can be used to represent empties)

Edit: I'm re-reading my first response and it reads like "you never need/want the probe length", and I totally retract that position; it wasn't intended.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: