
Similar architectures have been available for plenty of time! 256 bits at once with multiple execution units is a lot of compute power, and it has been the standard for a decade. To say nothing of SSE.


Similar is not the same.

SSE and AVX instructions are optimised primarily for 3D graphics, such as multiplying a vector of 4 floating-point numbers by a 4x4 matrix. There are a handful of additional instructions optimised for doing things to pixels... and that's about it.

AVX-512 is designed to work more like what a GPU does internally, and provides a much richer set of instructions. It enables fine-grained masking and shuffles, without which many simple types of code are either impossible to vectorise, or much more complex... and slower. This is why auto-vectorisation with SSE and AVX is only enabled for some simple loops, and provides marginal benefits outside of those scenarios.
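To make the masking point concrete, here is a rough sketch in portable C. It does not use real AVX-512 intrinsics; it only models, lane by lane, what a per-lane mask does for a loop with a branch in its body. The function names, the 16-lane width (16 x 32-bit floats in 512 bits) and the threshold example are all my own assumptions for illustration, not what any particular compiler emits.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 16 /* assumed width: 16 x 32-bit floats = 512 bits */

/* The plain loop: the branch means neighbouring elements may need
   different treatment, which is awkward without per-lane masking. */
void add_if_greater_scalar(float *y, const float *x, size_t n, float threshold) {
    for (size_t i = 0; i < n; i++) {
        if (x[i] > threshold)
            y[i] += x[i];
    }
}

/* Scalar model of masked execution (hypothetical, for illustration).
   Each outer iteration stands in for one vector step:
   1. a compare produces one mask bit per lane,
   2. the add is applied only to lanes whose bit is set.
   The final partial block is handled by simply leaving the
   out-of-range lanes' bits clear, so no separate tail loop is needed. */
void add_if_greater_masked(float *y, const float *x, size_t n, float threshold) {
    for (size_t base = 0; base < n; base += LANES) {
        size_t live = (n - base < LANES) ? (n - base) : LANES;

        /* Step 1: build the mask from the comparison. */
        uint16_t mask = 0;
        for (size_t lane = 0; lane < live; lane++) {
            if (x[base + lane] > threshold)
                mask |= (uint16_t)(1u << lane);
        }

        /* Step 2: masked add; lanes with a clear bit keep their old value. */
        for (size_t lane = 0; lane < live; lane++) {
            if (mask & (1u << lane))
                y[base + lane] += x[base + lane];
        }
    }
}
```

Both functions should produce the same result for any input. The difference is that the second one is shaped like straight-line vector code: the inner loops have a fixed trip count per block and the branch has turned into data (the mask), which is roughly the form a vectoriser wants to reach.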



