
GaggiX on April 19, 2023 | on: StableLM: A new open-source language model

The models: https://huggingface.co/stabilityai/stablelm-base-alpha-3b and https://huggingface.co/stabilityai/stablelm-base-alpha-7b

There are also tuned versions of these models: https://huggingface.co/stabilityai/stablelm-tuned-alpha-3b and https://huggingface.co/stabilityai/stablelm-tuned-alpha-7b. These versions are fine-tuned on various chat and instruction-following datasets.

The GitHub repo mentions that the models will be trained on 1.5T tokens, which is pretty huge in my opinion. The alpha models are trained on 800B tokens. The context length is 4096 tokens.

bhouston on April 19, 2023

These models are huge. I assume they are not quantized down to 4 bits yet.

brucethemoose2 on April 19, 2023

Quantized versions will pop up on Hugging Face very soon, if they aren't already there. It takes basically no time, much less than something like an Alpaca finetune.
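
For a rough sense of why 4-bit quantization matters here, a back-of-the-envelope sketch of weight memory at different precisions. The parameter counts are assumptions read off the model names ("3b" as 3e9, "7b" as 7e9); the real counts may differ, and this ignores activations, the KV cache, and any per-group scale overhead a real quantization format adds. The helper name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Nominal sizes taken from the model names (an assumption, not measured).
MODELS = {
    "stablelm-base-alpha-3b": 3e9,
    "stablelm-base-alpha-7b": 7e9,
}

if __name__ == "__main__":
    for name, n_params in MODELS.items():
        for bits in (32, 16, 8, 4):
            gib = weight_memory_gib(n_params, bits)
            print(f"{name} @ {bits}-bit: ~{gib:.1f} GiB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint by 4x, which is roughly the difference between needing a high-end GPU and fitting on a consumer card.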
