Despite this coming from an independent expert and not from OpenAI, we need to be honest that this is more like a marketing campaign than open science. I assume the progress really is valid, the experts are indeed impressed, and we have the accurate time that it takes to produce the results. What we don't have is detail about true cost, or a CoT trace, or anything like that.
The implication is: we're ready to let everyone go wild with this very soon. OK, go wild with what exactly? How do we know that influential VIP users who might make very friendly blog posts aren't getting allocated exclusive access to a billion dollars' worth of hardware when they ask questions? I mean, literally giving a certain group of people temporary privileged access to like 90% of all available compute would be a completely reasonable business decision for OpenAI.
Would a reveal like that change how we think about the result? What if half that amount of cash/compute could enable some completely non-AI approach of numerical brute forcing that settles the question even if it didn't write the paper?
My other question is always whether the latest is purely using giant models or if we're now deeply into harnesses that use MCTS and such. Understandable to keep that a trade secret I guess. But IMHO we should at least get the CoT trace as a proxy for true cost, or else maybe we're just getting played to do the hype for corporate.
We know this because most of the Erdos proofs made by AI have been done by amateurs prompting GPT 5.* Pro, not by familiar names that OpenAI is sneaking additional compute to behind the scenes (which is too conspiratorial of an explanation for my liking regardless).
> We know this because most of the Erdos proofs made by AI have been done by amateurs prompting GPT 5.* Pro
Where's that? The stuff I've seen is from celebrities. Were those problems as hard as this one, or the ones that Tao posts about? Regardless, what's the argument against more transparency here, just to settle this kind of thing?
> which is too conspiratorial of an explanation for my liking regardless
OpenAI is not, in fact, open. Why do they deserve the benefit of the doubt?
Regardless, special treatment for special customers isn't a conspiracy, it's SOP literally everywhere, especially if you're helping to beta test. Anyone who's ever interacted with any technical account manager has seen waived quotas, free resource allocations, etc. The quid pro quo is obviously that your cheap early access means you get to give talks at a conference (or make a blog post that a lot of people read and talk about).
> The duo had jump-started the AI-for-Erdős craze late last year by prompting a free version of ChatGPT with open problems chosen at random from the Erdős problems website. (An AI researcher subsequently gifted them each a ChatGPT Pro subscription to encourage their “vibe mathing.”)
Wonder who the AI researcher worked for? Is a "craze" something which a for-profit company would want to encourage? Maybe they'd think the publicity would help keep people talking about their company and product as we are now doing?
> “There was kind of a standard sequence of moves that everyone who worked on the problem previously started by doing,” Tao says. The LLM took an entirely different route, using a formula that was well known in related parts of math, but which no one had thought to apply to this type of question.
Yep, I do remember this now. Everyone was yelling that this was definitely a sign of ground-breaking and creative work, citing the expert. What the expert actually said suggests that the solution was available in training data! That also suggests the math in TFA is harder in comparison, answering my other question.
Pleasure as always, HN. Thanks for voting me to the bottom of the thread for this.
First, you ask for evidence of someone who isn't a VIP doing a similarly difficult problem using an LLM, to show that it isn't just VIPs being given special models. And then, when I provide that example, you say it doesn't count because the whole craze was started by researchers working for AI anyway.
Furthermore, you start out stating that the access being given to these VIPs is to insanely massive, impractical models no one will ever have access to, but then you point to them getting a free ChatGPT Pro sub as evidence of your point.
Finally, you look at the fact that the AI solved the problem by applying a technique from a totally different field of mathematics, one that no one in sixty years had thought to apply to that situation, and you claim that this isn't novel enough to "count" as being in the same ballpark of difficulty as solving other, easier Erdos problems, or this new problem. Your reasoning is that the technique already existed, so the problem isn't hard enough to be comparable to all the other stuff that's been done.
If you are upset that you are being down-voted, I think you should do some introspection. It seems like it would be impossible to convince you, no matter how many non-VIPs solved difficult open math problems, as long as it wasn't literally the exact same level of difficulty or type of problem.
I'm grateful for the reference as context, but it simply does not settle the issue of transparency, and it actually reinforced the question. That's not controversial, nor is the statement that OpenAI avoids transparency, wants good publicity, and will go to great lengths on both, like all corporations. It's also completely normal to find progress on Erdos problems fascinating and inspirational while still aiming for clarity on what the scope of the achievement really is (invention, composition, literature search), given the limited information available.
What I raised is a real question: the Erdos folks are not affiliated with AI companies, but is the AI company affiliated with them? Actually knowing the user/org accounts involved is optional, because they could just divert resources to anyone prompting near the problem. Anecdotally, I've noticed what appears to be token discounts based on topic, for example more generosity for AI-related research than for random stuff, but it's hard to know for sure. Wouldn't you promote interactions you could profitably train on?
So again, allocating resources to Erdos problems one way or another is just a clearly smart business decision for something people are talking about and which has become an unofficial competition among vendors, not a big scandalous accusation. Something like a reasoning trace is the only way to settle it. This isn't conspiracy or nitpicking, because this is the topic itself: the AI usage is more a matter of public interest than the actual problem solution. What's the argument against more transparency?