This depends a bit on your cost sensitivity and which model families you need supported, but Baseten and Fireworks have been my go-to.

Currently Baseten has ~610 ms TTFT (time to first token) and ~82 tok/s for Kimi K2.6, which is roughly 2x the throughput of GPT-5.4 (per their OpenRouter stats). GLM 5 is slightly slower on both metrics, but still strong.
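
To put those two numbers in context, here is a rough back-of-the-envelope sketch of how TTFT and throughput combine into total response time. It assumes steady decoding at the quoted rate once the first token arrives and ignores network jitter and queueing; the function name and the 500-token response length are my own illustrative choices, not anything from the providers.

```python
def estimated_response_seconds(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Approximate wall-clock time for a streamed completion.

    Simplifying assumption: total time = time to first token
    + output_tokens / steady-state throughput.
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


# Figures quoted above for Kimi K2.6 on Baseten: ~610 ms TTFT, ~82 tok/s.
# The 500-token response length is a hypothetical example.
kimi_estimate = estimated_response_seconds(0.610, 82, 500)
print(f"{kimi_estimate:.2f} s")  # roughly 6.7 s
```

For long responses the throughput term dominates, so a 2x difference in tok/s matters much more than a few hundred milliseconds of TTFT; for short, chatty responses it's the other way around.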


