
All this sounds to me like mathematicians spooking themselves with stories of how ChatGPT solved a problem, when it's mathematicians solving a problem using ChatGPT as a tool. E.g. from the twitter thread by Timothy Gowers:

>> All I did was say things like, "Yes, it would be great if you could explore that idea and see whether you can get it to work," or "Could you rewrite that argument as a LaTeX file in the style of a standard mathematical preprint?"

Yeah, so all he did was take the horse to the water and make the horse drink. The collaboration with the other two mathematicians wasn't a trivial part of the problem solving either: every time Timothy Gowers figured ChatGPT had gone somewhere with its problem-solving, he stopped, asked it to render the answer in LaTeX, and sent the answer off to be verified by the other two.

The reason for that is not to be underestimated: ChatGPT can produce answers to questions for as long as you keep asking, but it has no capability to determine whether an answer is correct or not. That's why it needs a human with domain expertise to evaluate those answers, and to discard the wrong ones along the way. The process that's described here surely glosses over many false starts, back-and-forths, and instances of "you're absolutely right, here's a new version of that" that are a common experience when using LLMs for problem-solving tasks.

The existential questions that the article poses about mathematics then are easily answered by taking all of the above into account. If LLMs are a useful tool for mathematicians, then nothing changes. Mathematicians of all levels can still do their job and perhaps do it faster or better with the new tool.

If you can sic ChatGPT on a mathematics problem and it can solve it without your input, that's a different matter but that's not what's happening.



Sorry, just so I fully understand your comment - your claim is that asking it to “explore that idea further” and “write the paper in latex” constitutes “taking the horse to the water and making the horse drink”?

thank you for the morning laugh



>If you can sic ChatGPT on a mathematics problem and it can solve it without your input, that's a different matter but that's not what's happening.

I mean that has happened so yeah ?

https://www.scientificamerican.com/article/amateur-armed-wit...

Actual GPT transcript. Zero such input https://chatgpt.com/share/69dd1c83-b164-8385-bf2e-8533e9baba...

And maybe the other guy wasn't the most polite about it but his point is very valid. Replace chatgpt with a human in both of these stories and nobody would say that Timothy 'took the horse and made it drink'. The 'Horse' would be the first and likely only author, so this just sounds like denial.

That there are multiple of these stories in the last few months by the latest set of models (there are even more than these 2) should provoke this sort of consideration and discussion.


These are different cases, yes? The person in the SA article you link is described as an "amateur", but Timothy Gowers is not an amateur and he is much more capable of guiding an LLM with domain expertise than an amateur.

Then there's the kind of problem we're talking about. The "amateur" in the SA article solved one of the Erdős problems, and Gowers himself seems to think that, on its own, is not a cause for concern. He distinguishes his own result from that kind of earlier result at the start of his article:

>> The background is that, as has been widely reported, LLMs are now capable of solving research-level problems, and have managed to solve several of the Erdős problems listed on Thomas Bloom’s wonderful website. Initially it was possible to laugh this off: many of the “solutions” consisted in the LLM noticing that the problem had an answer sitting there in the literature already, or could be very easily deduced from known results.

So we have, on the one hand, an "amateur" who "vibe-solved" an Erdős problem that may or may not already have had a solution lurking in the wings; and, on the other hand, an expert who solved a harder problem by interactive use rather than vibe-solving. There's no reason to believe that we can "Replace chatgpt with a human in both of these stories" as you say.

And btw there's scholarship that indicates vibe-solving is not yet ready to replace mathematicians like Timothy Gowers:

>> First Proof

>> To assess the ability of current AI systems to correctly answer research-level mathematics questions, we share a set of ten math questions which have arisen naturally in the research process of the authors. The questions had not been shared publicly until now; the answers are known to the authors of the questions but will remain encrypted for a short time.

https://arxiv.org/abs/2602.05192

See Appendix A for initial results.


Yes these are different instances.

My first point is that I think you are overrating 'interactive use' a bit here. As Timothy already explains in the article, were it a human he 'guided' in a similar way, he would not get credit for those achievements by any stretch of the imagination. And I think that's an important part of realizing why these sorts of people are beginning to discuss these things.

Second. I didn't say anything about models being ready to replace mathematics wholesale. But should people really wait until that happens before discussing it? I know it's human nature to wait until the problem or situation is upon you but I don't think that would be prudent or wise. And even just for the sake of curiosity, it would be boring.

I think the matter of fact here is that in the last few months with the last few models, capabilities in this area have jumped to a very meaningful degree. It would be stranger if no one was talking about it.



