Hacker Times
GaggiX on April 19, 2023 | on: StableLM: A new open-source language model
@thunderbird120 asked a Stability employee, who said that the plan is to keep training the models up to 1.5T tokens. So I don't know where you read this.
Taek
on April 19, 2023
That may be, but the weights you can download today were trained on 800B tokens.
sroussey
on April 19, 2023
I think they are “checkpoint” models in this case.
Will be fun to compare when completed!
oehtXRwMkIs
on April 20, 2023
Are not all models checkpoints? I think you may be interpreting it too colloquially.
GaggiX
on April 19, 2023
Yes, of course. That's why they say "will be trained" in the GH repo.
nickthegreek
on April 19, 2023
https://github.com/Stability-AI/StableLM#stablelm-alpha
shows that the 3B and 7B models were trained on 800B tokens.