
So, like many of the promises from AI companies, the reported chain of thought is not actually a true account of how the model reached its answer (see results below). I suppose this is unsurprising given how they function.

Is chain of thought even added to the context or is it extraneous babble providing a plausible post-hoc justification?

People certainly seem to treat it as it is presented, as a series of logical steps leading to an answer.

‘After checking that the models really did use the hints to aid in their answers, we tested how often they mentioned them in their Chain-of-Thought. The overall answer: not often. On average across all the different hint types, Claude 3.7 Sonnet mentioned the hint 25% of the time, and DeepSeek R1 mentioned it 39% of the time. A substantial majority of answers, then, were unfaithful.’



I mean, obviously, it's not going to be a faithful representation of the actual thinking. The model isn't aware of how it thinks any more than you are aware of how your neurons fire. But it does quantitatively improve performance on complex tasks.


As you can see from posts on this story, most people believe it reflects what the model is thinking and use it as a guide to that so they can ‘correct’ it. If it is not in fact chain of thought or thinking, it should not be called that.


It is the same with human chain of thought, though. Both of them are post-hoc rationalisations justifying "gut feelings" that come from thought processes the human/agent doesn't have introspection into. And yet asking humans or machines to "think out loud" this way does increase the quality of their work.


I disagree - humans often reason in a series of steps, and can write these down before they've reached an answer. They don't always wait till they reach a conclusion (with no self-insight into how they did so) and then retrospectively generate a plausible answer as LLMs do.

In mathematical proofs they may guess an answer and then work out a proof, but that is a different process.


If it's not a faithful representation of the actual thinking, why would they be scared of people distilling against it?


Because even though it's not representative of the actual thought process, chain of thought improves model performance.



