We let OpenAI’s “Agent Mode” surf the web for us—here’s what happened

Arstotzka

Ars Scholae Palatinae
1,211
Subscriptor++
I am just waiting for the eventual lawsuit when a user who has their browser set to be always logged in to every site they use ends up with 20,000 rolls of toilet paper or every kindle book on amazon. I can see this going very sideways with sites that have "instant buy" buttons.

I especially see this happening when a kid is using this on a parents profile.

It reminds me of this XKCD about Alexa:
View attachment 120725
I never ordered anything, but did convince my Dad to put password controls on his Alexa by adding things to his cart. I also set some reminders for years in advance.
 
Upvote
20 (20 / 0)

Castellum Excors

Ars Scholae Palatinae
728
Subscriptor++
We have people who already commit fraud on the internet on levels the average person cannot even comprehend.

Now we can automate such skullduggery nonsense, lowering the bar even further. The internet is so screwed. Just imagine what the controllers of these AI can do when they don't have to jailbreak their creations, they just say "do this" and have it done.
 
Upvote
7 (11 / -4)
There is something very dystopian about you having to babysit a bot as it plays a game for you.

On another note, I used to be a zero inbox guy until it got away from me. I need some solution where I can have an EA agent go through every email to bring up important emails to me.

But I'm not sure I'll use chatgpt, I might just code a local opensource agent to do this
Well, if a chatgpt agent could automate the doomscrolling and the haphazard reacting to purposefully inflammatory, bot-generated stuff, we could meanwhile do some real, offline socializing, invisible to our tech overlords...

Crap, did I say that out loud?
 
Upvote
24 (24 / 0)

Hoptimist

Ars Scholae Palatinae
685
Subscriptor++
As a web developer, I see a new assignment here. In addition to ADA, we're going to need to design for AI agent accessibility. I thought MCP might replace websites someday, but maybe not now. Or maybe, we add some meta information to the site that gives the agent a back door MCP to use so it doesn't have to fumble around the site.
Actually, I can anticipate web site designs toying with AI agents using invisible text, setting up misinformation, scams etc. The user assumes the AI hallucinated if they check at all.
 
Upvote
41 (41 / 0)

gruberduber

Wise, Aged Ars Veteran
149
You act like we didn't try. We tried. I have been managing systems for a decade. The day we all started taking pictures of ourselves on instagram was the day privacy died.

Every single website tracks your data. If you want out of the system, unplug the router.
This sort of all-or-nothing statement is exactly what I'm talking about. The options you present are 'just accept everything' or 'don't use the internet'. It's absurdly reductionist.

You could support open source, use adblockers and pihole, vote with your wallet, write to your MP?

Pay for private email instead of google, avoid social media, encourage friends to use signal instead of whatsapp?

Boycott amazon, educate people... actually do something? Fight? Try?

I know everything's going to shit at speed, and if you want to give up do so, but don't pretend it's because there is literally nothing anyone can do. You don't have to embrace it just because it's hard to fight.
 
Upvote
66 (67 / -1)

JaneDoe

Ars Tribunus Militum
1,510
Subscriptor
Okay, I am impressed, that the software could do this at all.
I am not impressed with the quality of the results. But to be fair, image generation started very uncanny too and improved fast, so we will see. (Or maybe we will not see, as the bubble may burst before those companies manage to sustainably cover their costs.)
 
Upvote
15 (16 / -1)
Post content hidden for low score. Show…
Post content hidden for low score. Show…

Fred Duck

Ars Tribunus Angusticlavius
7,166
I need some solution where I can have an EA agent go through every email to bring up important emails to me.
Surely they're all too busy providing support for the annual sportball or FPS experience for that?

Edit:
https://meincmagazine.com/gaming/2025/09/how-private-ownership-will-change-electronic-arts/

Now I understand.

You are correct that this requires a human touch and not a machine which will indiscriminately read and act on invisible text.

You will upvote this comment.
 
Upvote
1 (2 / -1)

Uncivil Servant

Ars Scholae Palatinae
4,667
Subscriptor
Imagine the refined version of this - if the tech doesn't hit a wall - where an open source agent running locally on your laptop is able to cheaply and correctly do the drudgery for you that the rich pay their administrative assistant to do. I'd say that giving everyone their own AA for free is a really cool and beneficial use of LLM technology.

Even in your best-case scenario, what do I need an administrative assistant to do? Putting stuff on a calendar can be automated with a click, that's how I get notifications for F1 and WEC races, it doesn't require an LLM, just a .ics file and a download link.

Also, a lot of what real administrative assistants do is spend time making phone calls and emails to persuade people to re-arrange their schedules while juggling multiple calendars, and balancing various priorities including competing client priorities. I doubt an LLM is capable of deducing its users intent and goals, much less its user's clients' intents and goals.

But finally, "administrative assistants" are also often the actual power brokers, just with inoffensive, disarming-sounding titles to make it appear that the senior executive VP is in charge and the administrative assistant is merely helping determine who gets to have a meeting and when. You know, minor stuff.

(dear lord you people are practically begging to be ruled over by a digital overlord...)
 
Upvote
36 (38 / -2)

gruberduber

Wise, Aged Ars Veteran
149
You may feel different and I respect your choice, but I have zero interest in fighting battles that I know cannot be won.

The minute I go on the nytimes.com, they track my cookies and sell them to google. When I go to youtube, google takes those cookies and puts ads based on what I read on the ny times. Then a company buys those aggregated cookies and pays meta to locate me down to the zip code. Sure it doesn't know who I am .... that's a lark. Then they buy my aggregated cell phone data from Verizon and match the zip code to the average household income.

The only people that don't sell my data right now are the banks... and that won't be far behind. When they do... they will be able to target me.

So tell me.. how do I unplug from this system? And to what end? So I don't see the data they're tracking? To block ads? I do those things already... it does nothing to keep them from aggregating and selling my data.

You want me to call my MP? This is America. My MP is a 60 year old woman who works part time and knows absolutely nothing about technology.

Edit: Why are people downvoting? I don't understand how my post is offensive?
I think you're treating 'privacy' as if it is something which you either have 100% or 0%, and if you can't have it all you might as well give up. It's not 'a battle'. The process of everything enshitifying is a million small battles, not all or nothing.

I can block a lot of cookies with addons. I can delete them when I close my browser. When they do get my data, I can stop the ads they want me to see based on it so I don't fall for their scams and propaganda. I can stop using Facebook, even if it makes my social life a bit harder.

Some of us think that bad things should be resisted - to reduce the harm whever possible, to at least try even if you might not succeed. You clearly don't believe that's worth the effort - an opinion you're entitled to. And I get where you're coming from; it's depressing, life is short, and we're all overworked as it is.

But I also believe that the attitude of 'I can't win so I won't try' is why we're in this mess in the first place, and so very upsetting to those who haven't given up, and probably the source of your downvotes. It's a self-fulfilling prophecy.
 
Upvote
38 (39 / -1)

labtjd

Smack-Fu Master, in training
1
You may feel different and I respect your choice, but I have zero interest in fighting battles that I know cannot be won.

The minute I go on the nytimes.com, they track my cookies and sell them to google. When I go to youtube, google takes those cookies and puts ads based on what I read on the ny times. Then a company buys those aggregated cookies and pays meta to locate me down to the zip code. Sure it doesn't know who I am .... that's a lark. Then they buy my aggregated cell phone data from Verizon and match the zip code to the average household income.

The only people that don't sell my data right now are the banks... and that won't be far behind. When they do... they will be able to target me.

So tell me.. how do I unplug from this system? And to what end? So I don't see the data they're tracking? To block ads? I do those things already... it does nothing to keep them from aggregating and selling my data.

You want me to call my MP? This is America. My MP is a 60 year old woman who works part time and knows absolutely nothing about technology.

Edit: Why are people downvoting? I don't understand how my post is offensive?
Are the downvotes because the content is offensive, or because it seems wrong but i don't have time to respond right now.
 
Upvote
-10 (5 / -15)

85mm

Ars Scholae Palatinae
1,056
Subscriptor++
I'm surprised by the commenters saying this is useless, you could do it faster with an LLM writing you a Python script, it's a waste of energy... what intense moving of the goalposts.

Imagine the refined version of this - if the tech doesn't hit a wall - where an open source agent running locally on your laptop is able to cheaply and correctly do the drudgery for you that the rich pay their administrative assistant to do. I'd say that giving everyone their own AA for free is a really cool and beneficial use of LLM technology.
"Imagine" is the start of a marketing 101 speech designed to get the listener to accept the framing of the marketeer and think how wonderful the world would be if they just ignore reality, bend over and drink the cool-aid. Sure, if I had an open source local LLM which could affordably, accurately and reliably automate mundane tasks and it was not venerable to prompt injection, I'd find a use for it. It's quite a leap to connect that dream with current offerings or even their immediate successors.
 
Upvote
34 (35 / -1)

mateo9

Smack-Fu Master, in training
63
I'm surprised by the commenters saying this is useless, you could do it faster with an LLM writing you a Python script, it's a waste of energy... what intense moving of the goalposts.

Imagine the refined version of this - if the tech doesn't hit a wall - where an open source agent running locally on your laptop is able to cheaply and correctly do the drudgery for you that the rich pay their administrative assistant to do. I'd say that giving everyone their own AA for free is a really cool and beneficial use of LLM technology.

The goalposts haven't moved. I'm comparing today's automation to today's human workforce. The examples in this page are all failures.

What would I rate something that was faster, cheaper, and more accurate than a human? 10/10.

But scoring it based on potentially what might work in the future if you remove all limitations is pointless. If I fail an assignment, I get an F. The teacher doesn't say, "But if you had more logical skills, more knowledge, and time, then you'd have gotten an A+, so that's what I'll give you."

I'm making no prediction on how far this technology will go. But today, right now, this browser automation is borderline useless given the very real constraints we live with. And surely, in almost every case, proper automation using APIs will be faster and cheaper.
 
Upvote
47 (47 / 0)

MilanKraft

Ars Tribunus Angusticlavius
6,711
Maybe I'm an outlier, but neither do I do nor have a desire to do any of the tasks tested in the article. An AI "enhanced" web browser is not something I want or need.
The outliers are the tasks themselves IMO, with the possible exception of the email one as I'm sure some decent % of people occasionally need to rake through a bunch of emails to find certain keywords, contacts, etc. Unfortunately that's also the one with the biggest privacy concerns.

Even more than the chatbot itself, an OpenAI browser, for a large majority of common tasks, is a solution in search of a problem. (Shocking, I realize. Who would've guessed this is the kind of solution OpenAI cranks out when no one was asking for it.)
 
Upvote
16 (16 / 0)
D

Deleted member 1092565

Guest
I’m pretty sure that’s just you?

It happened to me a few weeks ago. I went on the nytimes and read an article about the met opera and got a text and a phone call from the met opera donations department not 10 seconds later.

I don't know if it was a coincidence but it was pretty nuts.
 
Upvote
-14 (3 / -17)
It's a simple exercise in logic:

Is my data being captured and sold?
If yes, what do I get from it?

In Google's case, I get free services (well not anymore, I have to pay)
In ChatGPT's case, I get an agent that can automate some tasks for me.

I don't see the difference. Am I missing something?
Dignity.
 
Upvote
23 (23 / 0)

85mm

Ars Scholae Palatinae
1,056
Subscriptor++
It's a simple exercise in logic:

Is my data being captured and sold?
If yes, what do I get from it?

In Google's case, I get free services (well not anymore, I have to pay)
In ChatGPT's case, I get an agent that can automate some tasks for me.

I don't see the difference. Am I missing something?
You're selling the information needed to manipulate and control you. They don't buy personal data out of the goodness of their hearts, they buy it because they know that they can change people's behaviour. People seem to not know, not to care, or be arrogant enough to think they're immune.
 
Upvote
19 (21 / -2)

jdale

Ars Legatus Legionis
18,261
Subscriptor
How many people have tasks they need to automate on the web? I use it for reading and communicating. Neither of which is improved by my lack of participation.

Regarding the Texas energy market, I don't know about Texas, but here in Massachusetts the state provides a free website that already lists all the providers' plans ranked in order of price with the conditions noted (e.g., early termination fees etc). It takes me less than a minute to pick the best one. I do it every couple of years.
 
Upvote
12 (13 / -1)

Bongle

Ars Praefectus
4,461
Subscriptor++
Theory: the short time cap is less because of processing/cost constraints, but rather because as the internal context window gets more and more full, it's more and more likely that the thing will go off-task. Surely they'd comp Ars with a no-limits account if the problem was simply processing time.

As OpenAI has said in other contexts, LLMs' guardrails get less and less effective as the amount of data in the "conversation" goes up.
 
Upvote
24 (24 / 0)

85mm

Ars Scholae Palatinae
1,056
Subscriptor++
Most of the demand for agents to do stuff on the web for people is because services force you to use their portals because to control your eyeballs. I wonder how long before an arms race begins to block the agents? You can't so easily up-sell or push advertising if there's not a person driving the interaction. I wouldn't be surprised to find LLMs swapping out referral codes too.

Edit: To clarify, the alternative is for services to offer an API such that you can interact with their services or any data you have though other means, but that skips you having to go through their portal, so few sites or services support it.
 
Last edited:
Upvote
15 (15 / 0)

yumegaze

Wise, Aged Ars Veteran
110
i keep wanting to be optimistic and think "this is garbage now, but could be something useful in the near future!" and then comes the reality check.

i mean, an "AI" agent that can do repetitive, mindless tasks without screwing up would be kinda nice. but is it worth the resources, the lack of privacy and potentially lack of control? personally i don't think so, i'm too much of a perfectionist (read: control freak) to use a product like that, even if it worked great. then i think about accessibility, something that could benefit from these agents... in the hypothetical scenario where they're functional, that is. but is it better, cheaper and more useful than, say, accessible web design? again, i don't think so. what's the point of this? what's in it for me, the user? give all my data for free and receive nothing in return? also, how can openAI keep providing this service if they're constantly bleeding money?

it's so ass, so pointless, so stupid and i'm exhausted. rant over.
 
Upvote
12 (12 / 0)

Toastr

Ars Tribunus Militum
1,816
I read this blog post from Anil Dash yesterday that I think sums up the fatal flaws with Atlas: ChatGPT's Atlas: The Browser That's Anti-Web

From that post:
The problems fall into three main categories:


  1. Atlas substitutes its own AI-generated content for the web, but it looks like it's showing you the web
  2. The user experience makes you guess what commands to type instead of clicking on links
  3. You're the agent for the browser, it's not being an agent for you
 
Upvote
27 (27 / 0)

LexaGrey

Wise, Aged Ars Veteran
118
Subscriptor++
I think the grading here is way too high. Even if the bar is “the competency of an 8 year old” it failed to stay on task, it wasn’t graded on task accuracy, it failed to work around any limitations (depending on OpenAI backend with significant timeout limitations instead of generating a local script). If these were tests only the webpage would have gotten something approaching a passing grade. It is like if Alexa only got your order completed 50% of the time with 50% accuracy. Why bother at that point if you have to manually recheck the work? Article rating 2.5/10.
 
Upvote
21 (21 / 0)
The grading scheme seems unhelpfully generous. Mostly, the agent did either a poor or unacceptable job, requiring your intervention and thought energy, and delivering far less completion than any breathing human would accept out of their own efforts. I don't think you would have net saved time over just doing it, and I don't think you would have saved mental energy/capacity either.

This is not to say they won't eventually get there on some subset of tasks -- but the world keeps changing, too, so I think they'll frequently be behind the curve.
 
Upvote
22 (22 / 0)

shenzhe

Wise, Aged Ars Veteran
136
Subscriptor++
Pretty sure your average 12 year old could do a better job on every task, faster and for a lot less calories of energy.
I can't say that for sure. My almost 12 year old consumes between 2 and 3 kW of food energy a day. I'm curious how much energy those prompts cost, though I realize that's complicated by the fact that the training is part of the cost of those models, but my kid didn't start out this age either.

Also similar given his ADHD I'd be surprised if he could watch for new songs on the radio for more than 3 minutes either. He'd also get bored of 2048 if he didn't do well and probably give up. He's not 12 yet though so maybe a couple years time will change that (unlikely). So I guess currently OpenAIs agent mode is like asking a ten year old with ADHD to do tasks for you?
 
Upvote
7 (7 / 0)
One thing the article didn't touch on that I'm curious about, how does it handle NSFW content? As everyone knows, the internet is for porn. Did OpenAI clamp down on it's ability to help with that?
They removed NSFW capabilities from ChatGPT around 6 weeks ago, so I would imagine this doesn't do NSFW either.
 
Upvote
3 (3 / 0)

kaleberg

Ars Scholae Palatinae
1,245
Subscriptor
Doesn't using an AI system to access a website constitute felony contempt of business model? It almost certainly violates the terms of service, the same ones that bar screen scrapers.

One of the big problems one has leaving a website like Facebook or Spotify is that they have all your stuff, photos, video, music, remarks, replies, contacts and so on. Companies that try to automate extracting a user's data to move it to an alternate site get shut down. They can't stop you from extracting your data manually, though they try. Ignoring the technical barriers they throw down, it's tedious.

For now you might be able to use AI to turn all your Spotify playlists into XML or extract all of your Facebook posts and replies in CSV. Granted, you'll have to babysit it while it grinds for a bit and then goofs off, but it won't be the same raw tedium. You can do something else and check up on it every five or ten minutes.

Still, how long will it be before Facebook and Spotify crack down? They have the law in place. Even if AI technically skirts the law, they can still shut you down with no recourse. So much for getting your data. So much for this entire AI web browser model. All the stuff people want to do is illegal anyway.
 
Upvote
-1 (4 / -5)
Writing all this without even bringing up the cost and ethical and sociological damage issues with corporate LLMs is journalistic malpractice. Yes, you do have to bring it up every time; just talking about whether an LLM is good at something buries the far more important concerns about its very existence. It's like having a conversation about Nazi war technology and getting mad that people don't want to talk about how that technology was used.
 
Upvote
1 (8 / -7)