Has Gemini surpassed ChatGPT? We put the AI models to the test.

Purple Gryphon

Smack-Fu Master, in training
69
I believe there is a flaw in your tests. A friend and I had a pretty negative experience with Gemini when we asked the exact same question on two different phones and got two different answers. I did not see a test for this in the article.

I had an iPhone and my friend had an Android phone. We both asked Gemini if Frontier Airlines flew to a certain destination.

One of us got a "yes" and the other a "no". In the end, we had to just manually go to Frontier's web site to look for ourselves.

But this was telling, as it suggested to us that Gemini, probably not knowing the answer (or where to get it from), just "flipped a coin" to say yes or no to us. Otherwise, why would we not have gotten the same answer?

I think this type of situation should be tested for with all AI programs. Ask the same question on multiple devices and see if the answer suddenly changes.
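That repeatability test is easy to sketch. In the sketch below, `ask_model` is a hypothetical stand-in for whichever assistant or device is being probed (it is not a real Gemini API), and the "coin flip" model just reproduces the failure mode described above:

```python
import random

def consistency_check(ask_model, question, trials=10):
    """Ask the same question several times; return the set of distinct answers.

    A model that actually knows the answer should produce a one-element set;
    a model that is effectively flipping a coin will usually produce two.
    """
    return {ask_model(question).strip().lower() for _ in range(trials)}

# A fake model that flips a coin, like the behavior described above:
def coin_flip_model(question):
    return random.choice(["Yes", "No"])

answers = consistency_check(coin_flip_model, "Does Frontier fly to Denver?")
```

The same harness could be pointed at several devices or accounts at once; disagreement across runs is the signal.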
It will, because LLMs aren’t actually getting you the right answer. They don’t know what that is.

They’re just giving you a sequence of words that, in similar contexts, are likely to follow one another.

Unless word B comes after word A 100% of the time, you will sometimes get word C there instead. How often that happens is something that can be adjusted on the back end, but if it’s too restrictive people don’t like the repetitive robotic responses.

And the only thing these companies care about is making people want to use their crap. Accuracy only matters inasmuch as they don’t want the frustration of incorrect answers to outweigh the dopamine from having a flowery sycophant tell you that you’re right.
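For what it's worth, the knob being described — how often word C shows up in word B's place — is usually called temperature. A minimal sketch with made-up scores for a toy three-word vocabulary (this is an illustration of the idea, not any vendor's actual code):

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Pick one token from raw scores using temperature sampling.

    Low temperature sharpens the distribution (word B almost every time —
    the 'repetitive robotic' regime); high temperature flattens it, so
    word C shows up more often in word B's place.
    """
    words = list(logits)
    scaled = [logits[w] / temperature for w in words]
    m = max(scaled)                          # subtract max for numeric stability
    weights = [math.exp(s - m) for s in scaled]
    return random.choices(words, weights=weights, k=1)[0]

# Toy scores: "B" is the likeliest continuation, but not a certainty.
toy_logits = {"B": 4.0, "C": 2.0, "D": 0.5}
```

At a very low temperature this returns "B" essentially every time; at higher temperatures "C" and "D" start appearing, which is exactly the trade-off between robotic and varied output.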
 
Upvote
5 (5 / 0)

Madestjohn

Ars Tribunus Angusticlavius
7,452
It will, because LLMs aren’t actually getting you the right answer. They don’t know what that is.

They’re just giving you a sequence of words that, in similar contexts, are likely to follow one another.

Unless word B comes after word A 100% of the time, you will sometimes get word C there instead. How often that happens is something that can be adjusted on the back end, but if it’s too restrictive people don’t like the repetitive robotic responses.

And the only thing these companies care about is making people want to use their crap. Accuracy only matters inasmuch as they don’t want the frustration of incorrect answers to outweigh the dopamine from having a flowery sycophant tell you that you’re right.
This ..
It isn't giving you an answer .. it's simulating what an answer sounds like.
And it's gotten quite good at that.
But whether it's right or wrong isn't the point, and is largely irrelevant.
 
Upvote
1 (1 / 0)
It will, because LLMs aren’t actually getting you the right answer. They don’t know what that is.

They’re just giving you a sequence of words that, in similar contexts, are likely to follow one another.

Unless word B comes after word A 100% of the time, you will sometimes get word C there instead. How often that happens is something that can be adjusted on the back end, but if it’s too restrictive people don’t like the repetitive robotic responses.

And the only thing these companies care about is making people want to use their crap. Accuracy only matters inasmuch as they don’t want the frustration of incorrect answers to outweigh the dopamine from having a flowery sycophant tell you that you’re right.
Modern LLMs have tool-use capability to find the right answer. Whether the UI you're using exposes this is a different question. So, increasingly, they are giving you the best answer the available data allows.
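A rough sketch of what that tool-use loop looks like. Every name here (`run_with_tools`, `route_lookup`, the toy model) is invented for illustration and is not a real LLM API; the point is only the shape of the loop — the model emits a tool call, the host executes it, and the observation is fed back until the model produces a final answer:

```python
def run_with_tools(model_step, tools, question):
    """Loop until the model returns a final answer instead of a tool call."""
    context = [question]
    while True:
        action = model_step(context)                       # model decides next step
        if action["type"] == "answer":
            return action["text"]
        result = tools[action["tool"]](**action["args"])   # host executes the tool
        context.append(result)                             # feed the observation back

# Toy setup: a "model" that looks up a route table instead of guessing.
routes = {("Frontier", "Denver"): True}

def toy_model(context):
    if len(context) == 1:                                  # first turn: call the tool
        return {"type": "tool_call", "tool": "route_lookup",
                "args": {"airline": "Frontier", "city": "Denver"}}
    return {"type": "answer", "text": "Yes" if context[-1] else "No"}

tools = {"route_lookup": lambda airline, city: routes.get((airline, city), False)}
```

With a grounded lookup in the loop, the yes/no answer stops depending on a coin flip — which is presumably why the airline question above went wrong in a UI without it.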
 
Upvote
-1 (0 / -1)

Hagen Stein

Ars Scholae Palatinae
680
Subscriptor
This is why I think Google will win the AI wars. They don't have to be the best, they just have to be about as good as the others. But where the other LLM providers are entirely dependent on revenue from their AI bot, AI is just one of many different revenue streams for Google. Google seems to be the best one positioned to survive the eventual AI bubble popping.

There are more reasons why I share this assumption:
  1. Google is producing its own AI chips
  2. Google has its own data center infrastructure
  3. Google has a host of products that it can integrate Gemini into (and learn from that what works and what doesn't)
  4. Presumably Google has the most training data from their decades of web crawling and projects like Google Scholar and Google Books. Though of course there's the copyright issue with the latter two. But AI companies seem to care less about that.
Items 1 and 2 are things OpenAI (and others) have to pay big bucks for, without yet having a profitable business model. And item 3 helps prevent developing solutions that don't work in practice, or that users don't want, accept, or use.
 
Upvote
3 (3 / 0)

kaleberg

Ars Scholae Palatinae
1,245
Subscriptor
I almost always use Gemini Thinking to improve my emails. I do not rely on it to write them. However, using its suggestions to improve my drafts works extremely well. Notably, when addressing someone with an extensive background in something like psychology, one can start a thread asking Gemini to familiarize itself with their publications. Then, when doing drafts, Gemini will (for me at least) be really helpful in pointing out how a draft can be improved by referencing this. Again, I do not rely on Gemini to write the emails; it tends to write pretty long ones on its own. But for help editing and word-smithing my emails... godsend. HTH, NSC
We have a neighbor who travels a lot and likes to make narrated travelogues. Unfortunately, his voice has been going. He found an online AI voice processing system and fed it examples of his voice from before his problems started, so now his narration is in his voice without the recent flaws. It's like your solution: write the email yourself, then use it to improve your writing.
 
Upvote
1 (1 / 0)
Who was the better architect, Albert Speer or Hermann Giesler?

It's a question so stupid it's evil. When the point is to reduce human suffering (and when is it not?), debating the merits of nicotine vs. asbestos only proves that past a certain point, idiocy becomes indistinguishable from malice. Those who believe absurdities inevitably commit atrocities.
 
Upvote
-1 (0 / -1)
After reading this article, I started using Gemini after having used GPT for about 8 months. In just a few days, I've dropped GPT altogether. Gemini gives more info, yet more condensed (4 paragraphs for Gemini vs. 10 for GPT).

It's just ... smarter. I tried configuring a flight game with both. Gemini recognized the option I had selected from the screenshot (Expert controls) and tailored its advice from there. It also suggested using the free-flight mode to practice, and offered to teach more advanced maneuvers once that was done. GPT didn't (with the same screenshot), and just talked about the difference between the "Arcade" and "Expert" controls.

When the game wouldn't show 1440p in the resolutions list, Gemini first wrote "yeah, known bug with that game. Switch to fullscreen, restart the game, should be fixed." And it was. Meanwhile, GPT just gave basic advice about Windows menus, like "make sure your display is set to 1440p."

I asked both: "tips for getting started with Blender? I will eventually use those models in Unreal Engine 5". Gemini first mentioned an industry standard (the donut), then a whole bunch of specs, then 3 plugins, then a "Blender to UE masterclass" video from YouTube.

GPT only mentioned the same specs.

I heard about an app for movie pee breaks and asked both about it. Gemini described six of the app's features, who it came from, and even named an actress who had said she loves using it. GPT described two features.
 
Upvote
1 (1 / 0)

JohnMeredith

Seniorius Lurkius
25
Subscriptor
what would introspection even mean for AI? i'm unclear what you mean by it in this context.

The ability to treat its own cognitive processes as a subject of symbolic analysis. The bit that Hofstadter of "Gödel, Escher, Bach" fame would call a strange loop, if you want to get philosophical.

Trivial example: a system monitor application checking the CPU temperature. Non-trivial example: an LLM being able to reliably identify whether an answer it previously gave was most attributable to training/tuning data, prompting/RAG, or creativity/hallucination.

Self-awareness, basically, but in the low, procedural sense of "hypocrites frequently have poor self-awareness" rather than the high, magical-consciousness sense of "Skynet became self-aware at 2:14 a.m. EDT on August 29, 1997". Calling it introspection is slightly less vulnerable to creative misunderstanding, if only because people will actually ask what it means here rather than jumping to conclusions :)
 
Upvote
0 (0 / 0)