
It reminds me of how LLM hallucination is attributed to "I don't know" being underrepresented in training data, and to guessing being a better strategy on evaluations than admitting not knowing.

Different reward function, but the same behaviour emerges.




We'll see that improve as people move on to synthetic training data, something that is possible now that we have sufficiently smart LLMs to create enough of it.

The idea is that you generate fake LLM transcripts from your classical training data. E.g. look at some training data and generate Q&A transcripts from it. Or generate random questions, RAG against your whole dataset to look for relevant material, and if there is nothing there, train an "I don't know." reply.
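Roughly, that loop could look like the sketch below. This is only an illustration under my own assumptions: `retrieve` is a toy word-overlap lookup standing in for a real RAG step, `answer_fn` is a hypothetical callback where an LLM would write the grounded answer, and the 0.5 threshold and stopword list are made up.

```python
import re

IDK_REPLY = "I don't know."

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "is", "a", "an", "of", "in", "at", "what", "where",
             "who", "when", "how", "does", "do"}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, corpus, min_overlap=0.5):
    """Toy stand-in for RAG: return the passage that covers the largest
    fraction of the question's content words, or None if no passage
    reaches min_overlap."""
    q_words = tokenize(question) - STOPWORDS
    if not q_words:
        return None
    best, best_score = None, 0.0
    for passage in corpus:
        score = len(q_words & tokenize(passage)) / len(q_words)
        if score > best_score:
            best, best_score = passage, score
    return best if best_score >= min_overlap else None


def build_transcripts(questions, corpus, answer_fn):
    """Turn questions into training transcripts. Questions with no
    supporting passage get the "I don't know." reply."""
    transcripts = []
    for question in questions:
        passage = retrieve(question, corpus)
        if passage is None:
            reply = IDK_REPLY
        else:
            # answer_fn is hypothetical: in practice an LLM call that
            # answers the question using the retrieved passage.
            reply = answer_fn(question, passage)
        transcripts.append({"question": question, "answer": reply})
    return transcripts
```

You would call it with something like `build_transcripts(questions, corpus, answer_fn=my_llm_answer)`, where `my_llm_answer` is whatever model wrapper you have. The point is just that the "I don't know." examples come from retrieval misses, so their share of the training set is something you control.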

A moderately sized LLM that operates some tools behind the scenes to access more information, run tests, and correct its own errors can write transcripts simulating a much larger and smarter LLM.



