HN2new | past | comments | ask | show | jobs | submitlogin

The difference is the impact of contaminated datasets. Exam boards tend to reuse questions, either verbatim or slightly modified. This is not such a problem for assessing humans, because it is easier for a human to learn the material than to learn 25 years of prior exams. Clearly that is not the case for current LLMs.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: