
Forgive my ignorance, but aren't they already on huggingface?

I assumed turboquant optimizations were already everywhere: in llama.cpp, or in the quantization machinery of unsloth and the like.



I forked it to also add rotorquant. This is a specific optimization that uses Clifford rotors instead of a static, compile-time random permutation to store the activations. It reduces both the storage space and the parameter count.
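For anyone unfamiliar with rotors, here is a minimal sketch of the general idea, not the fork's actual code. It assumes 3D blocks, where a Clifford rotor is equivalent to a unit quaternion, so a rotation costs 4 stored numbers instead of the 9 of a dense 3x3 matrix. All function names are hypothetical, and the int8-style quantizer is a stand-in chosen only to show the rotate, quantize, un-rotate round trip.

```python
import math

def rotor_from_axis_angle(axis, angle):
    """Unit rotor (w, x, y, z) rotating by `angle` radians about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    ux, uy, uz = (a / n for a in axis)
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), ux * s, uy * s, uz * s)

def reverse(r):
    """Reverse of a rotor; for a unit rotor this is also its inverse."""
    w, x, y, z = r
    return (w, -x, -y, -z)

def mul(a, b):
    """Product of two rotors, written as the Hamilton quaternion product."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def apply_rotor(r, v):
    """Rotate a 3-vector with the sandwich product r * v * reverse(r)."""
    _, x, y, z = mul(mul(r, (0.0, *v)), reverse(r))
    return (x, y, z)

def quantize(v, scale):
    """Stand-in quantizer (assumption): round to integers clipped to [-127, 127]."""
    return [max(-127, min(127, round(x / scale))) for x in v]

def dequantize(q, scale):
    return [x * scale for x in q]

def roundtrip(v, r, scale):
    """Rotate, quantize, dequantize, then undo the rotation."""
    q = quantize(apply_rotor(r, v), scale)
    return apply_rotor(reverse(r), dequantize(q, scale))

# Example: one 3-element block of activations.
rotor = rotor_from_axis_angle((1.0, 2.0, 3.0), 0.9)
block = (0.3, -1.2, 0.7)
restored = roundtrip(block, rotor, scale=0.02)
max_err = max(abs(a - b) for a, b in zip(block, restored))
```

Because the rotation preserves length, the reconstruction error stays bounded by the quantization step; only the 4 rotor components and the quantized integers need to be stored.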



