@thunderbird120 asked a Stability employee and say that the plan is going to kee... | Hacker News


GaggiX on April 19, 2023 | on: StableLM: A new open-source language model

@thunderbird120 asked a Stability employee, who said the plan is to keep training the models up to 1.5T tokens. So I don't know where you read this.

Taek on April 19, 2023

That may be, but the weights you can download today were trained on 800B tokens.

sroussey on April 19, 2023

I think they are “checkpoint” models in this case.

Will be fun to compare when completed!

oehtXRwMkIs on April 20, 2023

Aren't all models checkpoints? I think you may be interpreting it too colloquially.

GaggiX on April 19, 2023

Yes, of course. That's why they say "will be trained" on the GH repo.

nickthegreek on April 19, 2023

https://github.com/Stability-AI/StableLM#stablelm-alpha shows that the 3B and 7B models were trained on 800B tokens.
