The OpenAI press release and their paper are extremely vague about the actual model used, how many times it was run, how much it cost, etc. It’s not worth arguing about unless and until they release more information.
You joking? They are the ones that did it!
The guy I quoted confidently wrote that 'they used a specialised model' - he provided no source and yet you swallowed that whole.
I provide the only source you can trust and you question it.
If you don't believe OpenAI, why even believe that a model did this? Maybe a mathematician did it and they just took credit for it - in that case, this whole article by Ars is based on a press release.
Conspiracy theorists are truly nut cases...
This entire article is about OpenAI using their model to disprove a problem. Any documentation on methodology is THEIR documentation.
Because that has been well-documented.
For this, all we have is the word of the company that wants to sell people on their general LLM.
Yes but the one thing they do share is that it is not a specialised model. This is the one claim I'm pushing back against.
The OpenAI press release and their paper are extremely vague about the actual model used, how many times it was run, how much it cost, etc. It’s not worth arguing about unless and until they release more information.
The thing is that there are a lot of interesting anti-AI (and anti-Big-AI) positions to take and I hold quite a few of them. These religious fanatics will preempt any such conversation with dull inanities like 'LLMs can't think or reason or understand'.
There's a lot of people on this site that hold that AI cannot possibly be useful or good and hold it with the intensity of a religious belief. Therefore anything to the contrary must be a lie.
It's freaking impossible to have any kind of useful/interesting conversation about AI on the internet because you've got LinkedIn bros talking about how it's going to take away everyone's jobs and high-fiving each other over recreating serfdom and then you've also got people insisting it's still 2023 and AI systems cannot produce anything of value and use 5000000000000000 gallons of water every nanosecond.
If you ask me (and I know you won't) the religious fervor is entirely on the other side of that argument.
The thing is that there are a lot of interesting anti-AI (and anti-Big-AI) positions to take and I hold quite a few of them. These religious fanatics will preempt any such conversation with dull inanities like 'LLMs can't think or reason or understand'.
I'd dare say that if he were here, philosopher David Hume would argue we ourselves are nothing more than Markov chain generators, stacked on top of each other.
We really need to stop anthropomorphizing AI systems in language. They are not smart, cannot reason, cannot think, and cannot conceptualize anything. They are Markov chain generators stacked on top of each other until the Amazon burns down entirely.
The real question is if AI recommends pineapple pizza because pineapple pizza is the best pizza
Hmm...what kind of glue is best to hold the cheese to pizza? My local place uses urethane but a friend swears by the pizza he gets, they use cyanoacrylate. Epoxy though..? Too crunchy.
Whenever this example of a chatbot saying to stick cheese to pizza with glue comes up I wonder if it was drawing on something similar to a TV commercial I remember for a brand of frozen pizza in the '80s that went something like this:
Hmm...what kind of glue is best to hold the cheese to pizza? My local place uses urethane but a friend swears by the pizza he gets, they use cyanoacrylate. Epoxy though..? Too crunchy.
Economics calls this the law of diminishing marginal return. You can't keep using the same tactic over and over and expect similar gains in efficiency that you saw in the past. The AI industry is speed running this principle. First, they did hoover up the whole of digital human knowledge into their model, but it only got us to 2024-level bots. Models plateaued last year when they couldn't squeeze more from the training phase.
And in 6 months my baby will weigh 7.5 billion lbs.
I'm convinced there's value in AI (there sure was a few years ago, when we called it "machine learning" or "computer vision" or any of a dozen other names). I'm just not sure we can count on it continuing to grow in an exponential, or even a linear, fashion forever.
It's already consumed all the written work on the internet and every book published; we can't just magically double our training corpus. You can use the AI to build more data to train on, but I'm skeptical that that works as well, or that you can keep doing that forever.
Getting that "last 10%" has a funny habit of consuming exponentially more effort than the previous 90%. I'm not sure existing techniques can get us that next little bit, and I'm even less sure that there's a business model that will support continuing to chase improvements forever.
Straw man. I'm sure the current crop of LLMs are not conscious because I (broadly) know how they work, and nothing about that process seems like consciousness is likely to arise from it. LLMs are merely sophisticated data retrieval systems that do nothing until prompted by users, just like many other, simpler systems. If I am wrong, then I'd start wondering if querying a SQL database, decoding a jpeg file, or even recalling bits from storage also create and destroy consciousness.
Yeah, there are some deep psychological issues going on there, especially with the second group.
That second group seems certain about exactly what human consciousness is, because they’re so sure there’s no way AI could ever be conscious. I find that hilarious, since no one really knows where our thoughts come from.. and I don’t mean the shallow answer, “the Brain.”
For the last time, that stupid example comes from Google using an AI to summarize top results, which included a parody/comedy article talking about using glue on their pizza. There are plenty of examples of AI saying stupid shit, this is not one of them, ask even the dumbest local AI, it won't tell you to use glue.
Whenever this example of a chatbot saying to stick cheese to pizza with glue comes up I wonder if it was drawing on something similar to a TV commercial I remember for a brand of frozen pizza in the '80s that went something like this:
"Do you like the cheese on that pizza?"
"Yep."
"What if I told you it wasn't cheese?"
"Not cheese?"
"It's casein, the main ingredient in some glues."
"GLUE?!?"
Of course casein is also the main ingredient in milk, but they didn't mention that in the ad. Anyway, I could totally see how an LLM with training data that included mentions of frozen pizzas using artificial cheese made from casein and the fact that casein is used in adhesives would connect the two in its output. Because it matches a pattern and, as the AI boosters like to tell us, that's all our human brains are doing too.
The statements "high quality code" and "rooms full of developers" are incompatible. If the latter were true, you do not know for certain that the former is true. If you know for certain that the former is true, the latter is false.
It's been ~3 years from the release of ChatGPT to the general public. This technology is just getting started.
Sitting here in front of my box running 10x frontier agents who are supervising another ~30 sub agents running slightly less capable frontier models. For $20/day. Cranking out high quality code at a rate that completely blows my mind. I used to employ rooms full of developers for hundreds of thousands of dollars per month to do 1/100th as much ( or less ).
The world is changing around us in profound and exciting ways. If you haven't taken the time to load a frontier model and work with it to solve ---- even a hobby problem ----- then you risk missing out.
My kid built a robot that plays chess (robot arm moving the pieces) using a chat interface and without writing a single line of code. Not one. She built a motion control system, computer vision stack, data collection workflow, training runbook, trained it, and is running local inference......and her interface is the Signal App. Her freaking agent throws emojis. It is bizarro and mind bending.
This article makes it clear that the intelligence will go deeper than that. Probably much deeper.
We tend to overestimate near term disruption and underestimate long term disruption. If AI is doing what it is doing today - less than 3 years from going mainstream - I can only imagine what 30 years will bring us.
My personal suspicion on this is LLMs are able to make good headway in math because:
Is there an article somewhere (or room for Ars) on why LLMs now are so much better at maths, while they could not figure out anything before? An "old" trick used to be to ask it to generate code and run it, but I'm somewhat sure this is not what happens behind the scenes when you just ask - but it still gets the answers (more) right. Is this just additional training on specific sources, or is there some "wrapper" layer that decides if a prompt calls for maths and hands it over to a more specialised system?
Curious about this as some time ago the whole point was that these systems can't count because they just predict text. But how is that prediction mechanism now so much better if there are no fundamental changes to how the models are trained and work?
Check the last paragraph:
The OpenAI press release and their paper are extremely vague about the actual model used, how many times it was run, how much it cost, etc. It’s not worth arguing about unless and until they release more information.
I've never bothered to read anything in-depth about that particular failmeme, but I hardly see how an explanation in which perhaps the biggest information processing company on the planet would choose to put something as under-baked as what you describe on the webpage that is most central to their business model excuses it. Get annoyed at me for not knowing the details of how it happened all you want, but if anything what you said actually makes it worse.
For the last time, that stupid example comes from Google using an AI to summarize top results, which included a parody/comedy article talking about using glue on their pizza. There are plenty of examples of AI saying stupid shit, this is not one of them, ask even the dumbest local AI, it won't tell you to use glue.
Don't we already know that there are math problems to which it is impossible to directly compute a solution (i.e. problems which are undecidable)? Do I need to adjust my understanding of decidability, or are the LLM salesmen just obfuscating things with handwaving and "do it in an AI model"?
So AI companies have been working to develop LLM systems that can directly output a correct solution to any math problem.
Here's my take, as someone who's been enmeshed in software since the early 90s and could be considered a "naysayer" on the current batch of AI tools: we're not so much naysayers as we're cautious. We've seen this playbook so many times and know that it often doesn't end well in the short term, even if there are good long term benefits.
I'd say it's mostly a few middle aged people applying poorly understood philosophical and cognitive science concepts they learned in college in an attempt to satisfy their priors.
It's a bad look for our generation ('81 here) and it's a far cry from what this site used to represent (literally in the name). Damn shame... but it proves we need to get rid of the old people. I just thank my stars I haven't become one of the naysayers... yet.
We do. And we also know that there are a lot of problems that do have a computable answer.
Don't we already know that there are math problems to which it is impossible to directly compute a solution (i.e. problems which are undecidable)? Do I need to adjust my understanding of decidability, or are the LLM salesmen just obfuscating things with handwaving and "do it in an AI model"?
This is some good discussion, ty.
Economics calls this the law of diminishing marginal return. You can't keep using the same tactic over and over and expect similar gains in efficiency that you saw in the past. The AI industry is speed running this principle. First, they did hoover up the whole of digital human knowledge into their model, but it only got us to 2024-level bots. Models plateaued last year when they couldn't squeeze more from the training phase.
Then they added more agents and more context windows, increasing the compute spent per query. That got them to where they're at today: modestly better than last year, but at increased cost.
Extracting further improvements from the existing paradigm is going to get exponentially more expensive if they can't figure out a new avenue. They've already reached diminishing returns on training and recall phases of the process. I'm not sure where they're going to find more juice to squeeze.
Even if they can squeeze more efficiency from this tech stack, it's just going to be the same transformer technology, but better. It might help solve problems like this, that involve drawing associations across related (but known) fields of study by finding correlations humans haven't yet. It's still just making inferences. It's still backwards looking, unable to apply deductive logic to make forward looking predictions.
Straw man. I'm sure the current crop of LLMs are not conscious because I (broadly) know how they work, and nothing about that process seems like consciousness is likely to arise from it. LLMs are merely sophisticated data retrieval systems that do nothing until prompted by users, just like many other, simpler systems. If I am wrong, then I'd start wondering if querying a SQL database, decoding a jpeg file, or even recalling bits from storage also create and destroy consciousness.
That is fodder for some horror fiction, but I don't find it plausible.
Well..there's the problem with AI then! Not familiar with any joints that use good old fashioned Elmer's, that might be an option...
Yanno, I don't think the LLM specified which glue to use.
My suspicion is the "any" is doing a lot of lifting there and is marketing-speak.
Don't we already know that there are math problems to which it is impossible to directly compute a solution (i.e. problems which are undecidable)? Do I need to adjust my understanding of decidability, or are the LLM salesmen just obfuscating things with handwaving and "do it in an AI model"?
It was not a publicly available model. It was also not a math-specific model or a specialized harness (which Google has been using lately to prove lesser Erdős problems).
One question I have: Did OpenAI use one of the publicly-available models? Or is this an internal model? I figure the toolchain around it is custom, the way any of us would make one with the API, but I was wondering if the API calls themselves were to the standard LLM model they make available to customers, or if it's some kind of supercharged model with extra resources used for research problems like this.
Several years ago Google’s A.I. solved a major problem in biology: generalized protein folding. That was something that had been seemingly impossible to solve. I’m certain that we will see more of that happening.
It makes a lot of sense to me that even "non thinking" ML of whatever flavour could find a lot of previously unnoticed connections between the tree branches of knowledge - be this in Maths, Chemistry or Biology; as the article notes finding these connections can take rare overlapping knowledge areas. Once you have the compute ability you can (if interested) just throw more compute at randomish directions until one of them returns something interesting.
Whilst genuinely new advances might come earlier in Maths, I think new advances in Chemistry, Biology and similar are going to be slower because of the lab requirements. Well, if and until those go dark too - and at that point we're possibly getting into weird "magic" technology.
IMO, the biggest gains were on or before 2023. It's why GPT (released late 22) was such a shock to the public. People shat on "Will Smith Eating Spaghetti" (early 2023), but it demonstrated the basic techniques used in subsequent image models, just at greater scale.
This is some good discussion, ty.
I 100% agree that we are in diminishing returns land simply because if you look at the gains from 2023-present it's unbelievable how far things have come. There's simply not a lot left to squeeze out because in many applications where we're at now is "good enough".
The point on logic is an interesting one, I'm curious if there is anything in the literature about trying to build something that works on a more deductive basis. Might be interesting.
I've said it before and I'll say it again, modern AIs produce content that looks a lot more like "reasoning" than I see many humans come up with.
We really need to stop anthropomorphizing AI systems in language. They are not smart, cannot reason, cannot think, and cannot conceptualize anything. They are Markov chain generators stacked on top of each other until the Amazon burns down entirely.
It is so tiring reading these dumb comments where commentators always accuse some imaginary group of something, then continue talking about that imaginary group.
(...)
replacing secretaries, making paintings, configuring software or inventing cooking recipes which is what AI psychopaths are trying to push onto everyone, because they can't monetize a tool that destroys the planet and is useful only in scientific environment.
What source did you provide? All I see is that this was a general purpose model trained for this task. Many people believe that represents the best use of this technology: bespoke models specifically and exhaustively trained on curated data to perform well-considered and defined tasks.
You joking? They are the ones that did it!
The guy I quoted confidently wrote that 'they used a specialised model' - he provided no source and yet you swallowed that whole.
I provide the only source you can trust and you question it.
If you don't believe OpenAI, why even believe that a model did this? Maybe a mathematician did it and they just took credit for it - in that case, this whole article by Ars is based on a press release.
Conspiracy theorists are truly nut cases...
You didn't look up the thread?
What source did you provide? All I see is that this was a general purpose model trained for this task. Many people believe that represents the best use of this technology: bespoke models specifically and exhaustively trained on curated data to perform well-considered and defined tasks.
It's also important to note the model in question is not ChatGPT (what 99% of OpenAI users will have access to); it's a specialized math model that was trained solely on a corpus of university math texts and validated papers that are floating around out there.
It is important to note that a large part of the reason that a mathematical LLM like this or a protein LLM or any other science-based LLM works is because the data set has been scrupulously cleaned and QCd. For example, if someone had slipped π=3 into the training data set, the output would have had quite a few errors in it.
In contrast, the average LLM is trained on all sorts of nonsensical data (see: the internet) and so the LLM outputs all sorts of nonsense (GIGO, as we used to say back in the day when we carved the symbols by hand on clay tablets).
"The proof came from a new general-purpose reasoning model, rather than from a system trained specifically for mathematics, scaffolded to search through proof strategies, or targeted at the unit distance problem in particular"
https://openai.com/index/model-disproves-discrete-geometry-conjecture/
Ironic you follow up "so confidently too" by quoting what amounts to a press release.
Where do you guys get your info from? You spout such misinformation so confidently too...
"The proof came from a new general-purpose reasoning model, rather than from a system trained specifically for mathematics, scaffolded to search through proof strategies, or targeted at the unit distance problem in particular"
https://openai.com/index/model-disproves-discrete-geometry-conjecture/
Agreed. The reaction to, in essence, a million monkeys smashing the keys on typewriters 24/7 putting out a paragraph from Romeo and Juliet word for word is a bit over the top imo, especially considering what it is costing us to have those monkeys typing away 24/7.
The model (like all LLM models) basically ran many trial-and-errors and came up with a viable solution based on related things it had trained on, and patterns it had found. This is fine in the sense that, if an LLM can run a problem through 100s or 1000s of iterations that would take many years off a human math guru's life, then by all means use the tool for that.
But in the end this does sound a bit like "throw a bunch of solutions against the wall (without really understanding what it's doing) and see what stuck," then the humans can clean it up into some type of theorem (not sure if that's the right word here but one gets the idea).
No one is asking you to take the word of OpenAI.
Ironic you follow up "so confidently too" by quoting what amounts to a press release.
Are you familiar with the concepts of marketing and promotional language? OpenAI is under no obligation to accurately describe what this model was, and it certainly benefits their image to say "nope, this was just a generalist model, no special training at all", right??
Considering people can trip up ChatGPT with high school / 101 level college math problems (basic errors that bear out that LLMs don't "calculate" based on rules, but infer answers based on patterns). So.... YEAH. I'll err on the side of not taking the word of a company that has demonstrated multiple ethical lapses (and a founder who is known to be a manipulative twat in his speeches and press interactions), and go with the much more likely scenario, based on how LLMs work and how they tend to be trained in specialized fields, that this model was trained solely or mostly on math content. How else would it identify the necessary patterns to arrive at a correct solution?
It's also important to note the model in question is not ChatGPT (what 99% of OpenAI users will have access to); it's a specialized math model that was trained solely on a corpus of university math texts and validated papers that are floating around out there.
I strongly suspect there is a reason why it sounds like spam marketing fluff.
On the one hand, this sounds a lot like the old spam post you used to see on poorly-moderated fora along the lines of "my friend quit her job to work from home, and now she makes over $5,000 a month using this one simple trick!"
Sounds like you understood more than me. What I don't understand, in addition to not understanding the article OR the stuff the article was about, is why the original problem was interesting to the mathematicians. I assume that at some future date some guy will yell Eureka and use this stuff to make or unmake something like the field of prime numbers being used in crypto or somesuch. At any rate, I still learned a bit, which is generally all I can ask. Thanks.
OK, I'm getting a little worried now. AI was able to generate an answer to the problem, and I could barely understand the article.
I was finding interesting the dude named after Lenin's confidence that he was among the enlightened and therefore suitable to cast judgment on the perspectives of others on how society's resources are best employed.
Dude named after Lenin posts about his desire to "get rid of people". If you're not trolling, then I'd strongly suggest some introspection.