
SSE, NEON and AVX are fundamental for anything reducible to matrix and vector arithmetic (e.g., signal processing and ML inference on the CPU).

> It is hard to find simple enough code to vectorize at all.

It's hard to find code simple enough for the compiler to efficiently auto-vectorize.

Anything that reduces to GEMM parallelizes extremely well in practice, and there are many excellent libraries with SIMD support (MKL, BLAS, ATLAS, Eigen, etc.). However, these libraries rely on kernels carefully written by experts and benchmarked extensively over decades. They're not the output of running naively written code through a super smart compiler.
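To make the gap concrete, here is a minimal sketch in portable C (my own illustration, not taken from any of the libraries above; the names `dot_naive` and `dot_lanes` are made up). Under strict floating-point semantics a compiler generally may not reorder the additions in the naive loop, so GCC and Clang typically leave that reduction scalar unless you pass something like `-ffast-math`. The second version does the reordering by hand, keeping four independent partial sums that map naturally onto the lanes of a 128-bit register. Real GEMM kernels go much further (blocking, prefetching, explicit intrinsics), so treat this only as the first step.

```c
#include <assert.h>
#include <stddef.h>

/* Naive reduction: every addition depends on the previous one, and strict
   floating-point rules do not let the compiler reorder them. */
float dot_naive(const float *a, const float *b, size_t n) {
    assert(a != NULL && b != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hand-restructured reduction: four independent partial sums, one per
   lane of a hypothetical 4-wide float vector. */
float dot_lanes(const float *a, const float *b, size_t n) {
    assert(a != NULL && b != NULL);
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (size_t lane = 0; lane < 4; lane++)
            acc[lane] += a[i + lane] * b[i + lane];
    /* Combine the lanes once, after the main loop. */
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    /* Scalar tail for the elements that do not fill a whole vector. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```

Note that the two functions add in a different order, so on general inputs they can differ in the last few bits. That difference is exactly why the compiler won't make the transformation for you without being told it's allowed.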

All of this is extremely relevant to what you bought your PC or phone for. None of it lives in the kernel, though, which may be why Linus seems unaware of how pervasive and useful these instruction sets are.


