Hacker Timesnew | past | comments | ask | show | jobs | submitlogin

The M5 has 16 dedicated ‘Neural Engine’ cores and a ‘Neural accelerator’ in each of its conventional GPU cores. It’s been pretty special-purpose juiced for inference.
 help



When it comes to the very largest models the ANE seems to be only marginally useful for prefill. The M5 Neural Accelerators (NAX) help a lot but at a real cost wrt. power and thermals.

Yep, but Apple products don’t spend most of their time running huge models. They are running lots of little ones all the time, using hardware designed for that.

It seems that you're agreeing with what I wrote above. They ship a general-purpose stock system and tailor their compute offering towards that. Accelerating 'lots of little models' fits naturally into what they offer, in a way that a more compute-intensive design might not.

Yep, I misunderstood your point. Thanks for your patience. In my defense, the 'general purpose system' has a lot of model-inference-specific hardware. But not LLM-specific hardware.

If there's an M5 Ultra it'll be interesting to see what they've optimized it for.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: