The M5 has 16 dedicated ‘Neural Engine’ cores *and* a ‘Neural accelerator’ in ea...

zozbot234 · 2026-06-06T23:39:35 1780789175

When it comes to the very largest models, the ANE seems to be only marginally useful for prefill. The M5 Neural Accelerators (NAX) help a lot, but at a real cost in power and thermals.

robotresearcher · 2026-06-06T23:42:29 1780789349

Yep, but Apple products don’t spend most of their time running huge models. They are running lots of little ones all the time, using hardware designed for that.

zozbot234 · 2026-06-06T23:49:18 1780789758

It seems that you're agreeing with what I wrote above. They ship a general-purpose stock system and tailor their compute offering towards that. Accelerating 'lots of little models' fits naturally into what they offer, in a way that a more compute-intensive design might not.

robotresearcher · 2026-06-07T00:48:51 1780793331

Yep, I misunderstood your point. Thanks for your patience. In my defense, the 'general-purpose system' has a lot of model-inference-specific hardware, just not LLM-specific hardware.

If there's an M5 Ultra, it'll be interesting to see what they've optimized it for.