
The DS4 folks are unofficially testing ways to run the model, at reduced performance, on lower-RAM machines. Similar efforts are going on with llama.cpp. The results are a bit of a challenge: prefill time tends to explode, which is a limitation if you care about agentic workflows.