
EDIT: probably not relevant, after re-re-reading the comment in question.

Presumably littlestymaar is talking about all the LLM-generated output that's publicly available on the Internet (of varying quality, but in significant quantity) and ready to be scraped.


No, I'm talking about generating data for the purpose of training. The latest HF paper on the subject offers a great intro to the technique: https://huggingface.co/spaces/HuggingFaceFW/finephrase


