Anonymous chatbot that mystified and frustrated experts was OpenAI's latest model.
> This whole charting of performance seems rather nonsensical to me.

It's an open secret that all the GPT-wielding emperors have no clothes - the moat is not a lone AI, it's about being useful to humans, just like the rest of us attempt to do at our day jobs.
> It's an open secret that all the GPT-wielding emperors have no clothes - the moat is not a lone AI, it's about being useful to humans, just like the rest of us attempt to do at our day jobs.

Ah, no, that'd be Pornhub's AI.
> It seems to me we're already starting to see diminishing returns here. According to this metric, this new model is ~4% better than the previous? I've read elsewhere that it's faster and uses fewer computing resources to achieve its results, so maybe that's where the primary gains lie.

I don't play any competitive sports that use Elo rankings, but if I'm understanding this table correctly, a 50-point gap is more like "the new model was judged as better 7.15% of the time".
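For what it's worth, that 7.15% figure falls straight out of the standard Elo expected-score formula; here's a quick sketch (the function name is my own, not from the leaderboard):

```python
# Expected score of player A vs. player B under the Elo model,
# given A's rating advantage in points.
def elo_expected_score(rating_diff: float) -> float:
    return 1.0 / (1.0 + 10 ** (-rating_diff / 400.0))

# A 50-point gap means the higher-rated model is preferred about
# 57.15% of the time - only ~7.15 points better than a coin flip.
edge_over_chance = elo_expected_score(50) - 0.5
print(f"{edge_over_chance:.4f}")  # 0.0715
```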
> Disturbing photo.

What's wrong with Homework Gimp™? Some random mommy blog rated it the best learning aid of the year.
> It's an open secret that all the GPT-wielding emperors have no clothes - the moat is not a lone AI, it's about being useful to humans, just like the rest of us attempt to do at our day jobs.

'Being useful to humans' is one of the things I think gets lost in the arguments around the current AI hype. The questions around if they're 'true' AI, if they make stuff up, if they violate copyright, etc. might not change the course if they become increasingly useful. And they are increasingly useful. Compare now to 5 years ago and tell me the Claude/ChatGPT/Gemini style LLMs are a joke.
If you read it all in Casey Kasem’s voice it’s at least nostalgic.This whole charting of performance seems rather nonsensical to me.
> 'Being useful to humans' is one of the things I think gets lost in the arguments around the current AI hype. The questions around if they're 'true' AI, if they make stuff up, if they violate copyright, etc. might not change the course if they become increasingly useful. And they are increasingly useful. Compare now to 5 years ago and tell me the Claude/ChatGPT/Gemini style LLMs are a joke.

I agree with this. There are still a lot of problems - holy energy sink, Batman - there's a lot of really stupid hype that doesn't match reality, and what this means for workers, to name a few. But as tools they are getting better, so we need to pay attention and not completely dismiss it like a bunch have been.
> Gotta say, that might be the weirdest stock image I have seen all year.

The Greendale Human Being evolved to school-from-home during COVID, and now makes house calls.
> (...) I do expect we're well into diminishing returns ... of the test rankings. Once the bots are good enough, a large number of people won't throw a hard enough challenge at the bots to see a difference, and will judge more-or-less randomly. (...)

Well, you might get to the point where you get two perfectly adequate answers, but when you get them side by side most people get pretty picky about who explained it quicker and better or who followed the instructions more precisely, and consistently finding all the right words does have a certain value of its own. There are plenty of more formal benchmarks to test their capabilities on specific tasks.
> This seems likely in the very near future to me (10-15 years) and that's a scary proposition.

10-15 years? I think we're there now. We're already seeing court cases where people are claiming the evidence is AI-generated. Even blood evidence will likely be falsifiable within a couple of years now.
> There are still a lot of problems - holy energy sink, Batman - there's a lot of really stupid hype that doesn't match reality, and what this means for workers, to name a few. But as tools they are getting better, so we need to pay attention and not completely dismiss it like a bunch have been.

It's not so much that these things are not and will never be useful. That's absurd. Tied in to what you're saying, what we're going to see is bean counters and C-suite types that have huffed the hype cycle and are going to deploy unfit technology that's "good enough", and that will become the new baseline. How much of customer service has migrated from IVR to chatbots?
> This whole charting of performance seems rather nonsensical to me.

But we do pretty much the same thing for chess-playing programs. What's different for you?
> It seems to me we're already starting to see diminishing returns here. According to this metric, this new model is ~4% better than the previous? I've read elsewhere that it's faster and uses fewer computing resources to achieve its results, so maybe that's where the primary gains lie.

4% sure, but compounding over what period? And this is an intermediate model. Let's see where GPT-5 and Claude 4 are. Maybe we'll see a decline in rate of change. It'll be a big deal either way: have we capped out this core technology and have to look to other CS tech to make the overall system smarter? The point at which this phase of innovation slows down will be hugely significant for the next few decades (for good or bad on either outcome).
> the fear isn't that people will believe things that aren't real... it's that people won't believe anything at all.

It will go the other way too: people will only believe what comes from an AI or is confirmed by an AI. Not just any AI - they'll have a favorite brand that they trust. Some will only accept facts endorsed by Bing, some will only believe Google-branded facts.
More precisely, they won't believe anything at all THAT CONTRADICTS THEIR BELIEFS OR VIEWS. Instead, anything that conflicts with these will be dismissed as "AI" (you see this phenomenon already with "fake news").
> Gotta say, that might be the weirdest stock image I have seen all year.

Yeah, my reaction was that it looked like something from a very disturbing fetish porn shoot.
> Gotta say, that might be the weirdest stock image I have seen all year.

Yes, I'd totally assumed it was an AI-generated image, and was about to comment on what kind of disturbing prompt had been used when I noticed the Getty Images attribution.
> But we do pretty much the same thing for chess-playing programs. What's different for you?

Chess has objective outcomes. What is the objective test for chatbots?
> Chess has objective outcomes. What is the objective test for chatbots?

Social engineering. The first chatbot that scams the humans out of enough money to pay for its own computing costs wins.
> When that occurs the fear isn't that people will believe things that aren't real... it's that people won't believe anything at all.

Your time estimate is 25 years too far out. This is basically the state of discourse since about 2014. AI might help it along, but critical thinking was dead the moment it became political.
> > This seems likely in the very near future to me (10-15 years) and that's a scary proposition.
>
> 10-15 years? I think we're there now. We're already seeing court cases where people are claiming the evidence is AI-generated. Even blood evidence will likely be falsifiable within a couple of years now.

Criminal courts rely on chain of evidence, not unfakeability. A person says they took the photo or that it came from their video feed. A police officer confirms they picked up the knife at the scene, and then it is sealed and tracked through the system, including forensic testing by a person willing to swear that those results are correct to the best of their professional knowledge. Where there is room to muddy the waters is, say, CCTV 'evidence' that you were elsewhere when the crime was committed, but a decent prosecutor will be sure to highlight any doubts. YMMV in more authoritarian or corrupt legal systems.
> Your time estimate is 25 years too far out. This is basically the state of discourse since about 2014. AI might help it along, but critical thinking was dead the moment it became political.

The thing is, that's already happened. And not in the last year, or the last few years - it's been the case pretty much since, well, since people....
> More precisely, they won't believe anything at all THAT CONTRADICTS THEIR BELIEFS OR VIEWS. Instead, anything that conflicts with these will be dismissed as "AI" (you see this phenomenon already with "fake news").

When each person can choose to believe what they choose - with no "authority" able to validate - then every person has a distinct truth.
...
> boy there really is a stock image for almost anything, eh?

Kinda makes you realize that you don't need generative models to create engaging stock photos?
OpenAI submitting anonymously to a leaderboard doesn’t feel very open.
> Kinda makes you realize that you don't need generative models to create engaging stock photos?

No, but AI can come up with them for cheaper than a Getty Images subscription!*