Well, HT (hyper-threading) is generally crap for latency work. In most cases, if you go up to a random server, it is going to have HT enabled, so the CPU count the OS reports divided by 2 will get you a ballpark number of physical cores. You really should go into the BIOS and turn HT off if you care about latency at all. Then you can just run one thread per core, or cores - 2 threads.
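The arithmetic above, as a rough sketch. The function name `worker_thread_count` and the `reserved_for_os` parameter are made up for illustration, and it assumes HT exposes exactly two logical CPUs per physical core:

```python
import os


def worker_thread_count(logical_cpus, ht_enabled, reserved_for_os=2):
    """Ballpark number of busy-spin worker threads for one machine."""
    # Assumption: with HT on, each physical core shows up as two logical CPUs.
    physical_cores = logical_cpus // 2 if ht_enabled else logical_cpus
    # Leave a couple of cores for the OS, but always keep at least one worker.
    return max(1, physical_cores - reserved_for_os)


if __name__ == "__main__":
    logical = os.cpu_count() or 1
    print("HT on: ", worker_thread_count(logical, ht_enabled=True))
    print("HT off:", worker_thread_count(logical, ht_enabled=False))
```

Pass `reserved_for_os=0` if you want the plain one-thread-per-core number.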
So once you are there, the next step is to busy spin at a Thread Per Core (TPC). If you have 10 cores, you run 10 threads (or more realistically 8-9, to leave the OS a core or two to muck with), pin one to each core, and busy spin them at 100% doing work. You never let the Linux scheduler touch them.
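Here is roughly what pin-and-spin looks like, as a Linux-only sketch and not something you would ship. It uses one process per core instead of one thread per core, because Python threads would fight over the GIL; in C, C++, Java or Rust you would pin real threads. `spin_on_core` and `run_worker_per_core` are hypothetical names, and the counter increment stands in for real work. Note that affinity only keeps the worker on its core; keeping everything else off that core takes extra setup (e.g. the `isolcpus` boot parameter or cpusets), which is not shown.

```python
import multiprocessing as mp
import os
import time


def spin_on_core(cpu, seconds, results):
    """Pin the calling process to one CPU, then busy-spin without sleeping."""
    os.sched_setaffinity(0, {cpu})  # Linux-only; pid 0 means "the caller"
    deadline = time.perf_counter() + seconds
    iterations = 0
    while time.perf_counter() < deadline:
        iterations += 1  # placeholder for polling a queue and doing real work
    results.put((cpu, sorted(os.sched_getaffinity(0)), iterations))


def run_worker_per_core(reserved_for_os=1, seconds=0.2):
    """Start one pinned, spinning worker per CPU, leaving some CPUs for the OS."""
    allowed = sorted(os.sched_getaffinity(0))
    # Skip the first few CPUs for the OS; fall back to one CPU on tiny machines.
    worker_cpus = allowed[reserved_for_os:] or allowed[-1:]
    ctx = mp.get_context("fork")
    results = ctx.Queue()
    workers = [
        ctx.Process(target=spin_on_core, args=(cpu, seconds, results))
        for cpu in worker_cpus
    ]
    for w in workers:
        w.start()
    # Drain the queue before joining so no worker blocks on a full pipe.
    collected = [results.get(timeout=seconds + 10) for _ in workers]
    for w in workers:
        w.join()
    return sorted(collected)


if __name__ == "__main__":
    for cpu, affinity, iterations in run_worker_per_core():
        print(f"cpu {cpu}: affinity={affinity} iterations={iterations}")
```

A real worker would spin forever, polling its input queue instead of checking a deadline.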
If this stuff interests you, there are some cool papers by LMAX (the Disruptor) and Gil Tene that talk about it.
The amount of work you can do with a single 10-20 core CPU is amazing.