
Did you know the latest o1 model can count r's in strawberries? It just needed 5 minutes to think about it, that's all.


Could it be that since the question became a big benchmark, it (along with the correct answer) slipped into the training data?


Ask it "What is the length of the response to this question?"


This worked after 10 seconds:

>The length of this response is 45 characters.


That may need 10 minutes to think!


It needs 4 seconds to answer correctly.



