
@thunderbird120 asked a Stability employee, who said that the plan is to keep training the models up to 1.5T tokens. So I don't know where you read this.


That may be, but the weights you can download today were trained on 800B tokens.


I think they are “checkpoint” models in this case.

Will be fun to compare when completed!


Are not all models checkpoints? I think you may be interpreting it too colloquially.


Yes, of course. That's why they say "will be trained" on the GH repo.


https://github.com/Stability-AI/StableLM#stablelm-alpha shows that the 3B and 7B models were trained on 800B tokens.



