AI costs how much? GitHub Copilot users react to new usage-based pricing system

balthazarr

Ars Tribunus Angusticlavius
6,941
Subscriptor++
We went through something similar at work a few months ago with Cursor. My boss is all in on AI coding. And frankly it's becoming indispensable annoyingly quickly. But you can't be naive about usage.

He got the whole team using it and we burned through our monthly allotment in a week. Whoops. Then I learned that keeping an agent around for multiple requests increases the context and hence the cost. And frontier models, while better, are ten times as expensive.

I'm now careful to use cheaper models and spawn new agents frequently to control context. I only use Cursor (usually on auto, occasionally on a recent Claude model) for coding tasks. For anything else that doesn't need the context of my whole codebase I use ChatGPT, which is a flat monthly fee.

It's slightly annoying but works well for me without breaking the bank. A coworker is more sophisticated. She has three tiers of models. She has a smart model generate the overall plan. After review she has a mid-tier model break that down into simple tasks that she assigns to fairly stupid agents. She's able to get top-tier results without top-tier pricing by ensuring she only allows the appropriate agent to do each job. I think that's the future. We need better automation and instrumentation around agent use so we can tune our desired mix of cost vs quality.
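To put rough numbers on the three-tier approach described above, here is a minimal Python sketch. Only the 10x price gap between frontier and cheap models comes from the post. The mid-tier price, the token counts, and the names `PRICE_PER_1K` and `job_cost` are assumptions made up for illustration, not real vendor pricing.

```python
# Relative price per 1,000 tokens for each tier.
# Assumption: only the 10x frontier-vs-cheap gap comes from the post;
# the mid-tier value of 3.0 is invented for this example.
PRICE_PER_1K = {"frontier": 10.0, "mid": 3.0, "cheap": 1.0}


def job_cost(tokens_by_tier):
    """Return the cost, in relative units, of a {tier: tokens} mapping."""
    return sum(
        PRICE_PER_1K[tier] * tokens / 1000
        for tier, tokens in tokens_by_tier.items()
    )


# Hypothetical job of 100,000 tokens in total.
all_frontier = {"frontier": 100_000}
# Same job split up: plan on the frontier model, task breakdown on the
# mid-tier model, and the bulk of the simple work on cheap agents.
tiered = {"frontier": 5_000, "mid": 15_000, "cheap": 80_000}

all_frontier_cost = job_cost(all_frontier)
tiered_cost = job_cost(tiered)

print(f"all frontier: {all_frontier_cost:.0f} units")
print(f"tiered:       {tiered_cost:.0f} units")
```

Under these made-up numbers the tiered split costs well under a quarter of the all-frontier run. The real ratio depends entirely on how much of the work can safely be pushed down to the cheap tier.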
Or... instead of jumping through increasingly convoluted hoops to make the supposedly super sophisticated tool kinda sorta work you could - and bear with me here - just write some code?

This has the benefit of producing a far more maintainable codebase, giving you much better knowledge of the code and its workings and... keeping you and your colleagues employed.
 
Upvote
28 (31 / -3)

JoHBE

Ars Praefectus
4,393
Subscriptor++
My gut feeling is that the free local tools are about to suddenly go bye-bye. There’s no way that the industry can afford to keep a) absorbing training costs and b) undercutting their own business models now that the bubble is finally popping.

If China is smart, they will do their utmost to keep providing open weights that approach frontier models as closely as possible. The ROI for them is all worth it x1000.
 
Upvote
16 (17 / -1)

Jharm

Wise, Aged Ars Veteran
183
My gut feeling is that the free local tools are about to suddenly go bye-bye. There’s no way that the industry can afford to keep a) absorbing training costs and b) undercutting their own business models now that the bubble is finally popping.
I believe models developed in the EU are mostly based on public EU funding, meaning free models. So maybe some models will not be free to run locally, but I believe some will stay free.
 
Upvote
7 (7 / 0)

JoHBE

Ars Praefectus
4,393
Subscriptor++
I agree with your general principle about technological improvement, but 2005 is an odd date to choose. In fact, I think someone in 2005 would be disappointed by today's computing hardware.

In 1985, Apple fans were likely using an Apple IIe with 64 KB of RAM, a 1MHz processor, and an 80-column text display. A 20MB hard disk was an expensive upgrade.

Twenty years later, in 2005, the iMac G5 had a 2GHz processor, 512 MB RAM, and a 250 GB HDD. That is an 8192x increase in RAM, a 2000x increase in clock speed, and a 12500x increase in disk capacity. This was achieved by a steady and constant improvement in computing power between 1985 and 2005.

Just extrapolating out the trend, someone in 2005 might expect that by 2025, mainstream computers would feature a 4000GHz processor with 4TB RAM and 3000TB of storage.
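For anyone who wants to check the arithmetic above, a short Python sketch follows. It assumes binary units for RAM and decimal units for disk, which is how the figures in the post work out. The variable names are only illustrative.

```python
# Binary units for RAM (assumption: 1 KB = 1024 bytes, and so on).
KB, MB, TB = 1024, 1024**2, 1024**4

# Growth factors from the 1985 Apple IIe to the 2005 iMac G5.
ram_growth = (512 * MB) // (64 * KB)   # 64 KB -> 512 MB
clock_growth = 2000 // 1               # 1 MHz -> 2000 MHz
disk_growth = (250 * 1000) // 20       # 20 MB -> 250,000 MB (decimal)

# Apply the same factors again to the 2005 machine to get a naive 2025 guess.
ram_2025_tb = 512 * MB * ram_growth / TB
clock_2025_ghz = 2 * clock_growth
disk_2025_tb = 250 * disk_growth / 1000

print(ram_growth, clock_growth, disk_growth)
print(ram_2025_tb, clock_2025_ghz, disk_2025_tb)
```

The disk figure comes out at 3125 TB, which the post rounds to 3000TB.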

In reality, 2005 is around the time that computing performance improvements began to grind to a halt.

It will take some time for the current and previous generations to adjust to the free-lunchless new reality without constant semiconductor scaling. Progress now depends on a string of new tricks working out as hoped, and we'll have to see how lucky we will be. Maybe we enter something like a tech Dark Ages in the near future.
 
Upvote
4 (4 / 0)
Beautiful! I hope it goes up in price by 100 times. It is the only way to stop the enslopification.

Let's see how the companies currently shoving "AI" down their employees' throats change their tune. It is after all the company-revolutionising technology that we have to use in literally everything, right? Great! Pay for it.
 
Upvote
13 (13 / 0)

Wheels Of Confusion

Ars Legatus Legionis
75,949
Subscriptor
I wonder if at some level the model was intended to be "make an AGI and then get the AGI to figure out how to make itself cheaper".
This was the entire premise of "The Singularity" that popular AI and Transhumanist boosters were pushing on the public for decades. The whole thing is "Once we have human-level artificial intelligence, we'll have it design better-than-human artificial intelligence, and then that will make even smarter bots, and upwards asymptotically, solving all the world's problems with genius artificial minds along the way..."
How much it figured into actual business models besides luring in angel investors early on, I couldn't say.
 
Upvote
19 (19 / 0)

JoHBE

Ars Praefectus
4,393
Subscriptor++
I'm an engineering director at a biggish tech company that you have probably heard of (I started out as a developer ~20y ago, so don't hate me), and -- like pretty much every other company in the industry -- we're under immense pressure from our board / major shareholders (and, through them, the C-level) to adopt agentic AI to the maximum possible extent. Nobody is saying it out loud here yet, but this is very obviously going to lead to what we euphemistically call "savings", i.e. significant reductions in headcount. The tools (we're mainly using Claude Code) are undeniably useful, but the downsides scare me.

Firstly, we're aiming for 100% of our code to be written by AI (the new mantra is "code is free" 🙄), with our more senior engineers acting as reviewers, not authors. This pretty much dooms our junior engineers. We're going to be kicking out the bottom two or three rungs of the career ladder for software developers. The obvious problem here is that senior engineers only exist because they were once junior engineers -- if the talent pipeline is destroyed, there won't be any senior engineers in future.

Secondly, as has been well-discussed in this thread, the costs of the tools are certain to increase (a lot); nobody seems to have any idea of what the upper bound on costs is likely to be, and everybody assumes that the economics will still somehow make sense. There is a lot of suspension of disbelief / magical thinking going on among the boards/C-level across the industry.

Thirdly, there is an implicit assumption that a company's profitability is directly related to the amount of code it can generate -- 10x more code must mean 10x more revenue, right? Right? This seems laughably naive -- many/most of the other processes involved in getting a product to market (marketing, legal/contracting, the sales process, etc) are far harder to accelerate with AI than raw code generation. Even if we could use AI to bring products to market 10x faster, the market isn't suddenly going to grow to accommodate all these exciting new products.

Very selfishly, I'm glad that I'm closer to the end of my career than to the start. I hope we'll find a way through this without leaving a huge number of people on the scrapheap, but I'm not optimistic.

You know what's REALLY alarming? Your 20 years of experience are totally unnecessary to be able to raise those "concerns". They are blatantly obvious for anyone with half a brain who thinks through it for 5 minutes.
 
Upvote
31 (31 / 0)

Wheels Of Confusion

Ars Legatus Legionis
75,949
Subscriptor
You know what's REALLY alarming? Your 20 years of experience are totally unnecessary to be able to raise those "concerns". They are blatantly obvious for anyone with half a brain who thinks through it for 5 minutes.
That's not the scary part. The scary part is that their leadership clearly doesn't match your description despite it being their literal job.
 
Upvote
14 (14 / 0)

Erbium168

Ars Centurion
2,964
Subscriptor
God forbid people realize the cost of the thing they've been using. This was, of course, inevitable.

All that VC / hyperscaler money that had been getting burnt on customer acquisition and experimenting was going to demand ROI at some point... and that point is now.

2026: The year the AI bill came due.

It turns out you can't run a massive cash burn hopes-and-dreams machine forever without consequences. It's no small coincidence that OpenAI / Anthropic are rushing to IPO before the costs catch up with everyone waking up to the real costs of the LLMs.
If you assume it was a scam from the beginning (bear with me):
- Sell AI at a very low price and persuade people to lay off lots of lower level employees.
- Increase prices enormously so it is just as expensive as those employees but without the flexibility.
- We're not a monopoly, you have choices.
- You think the employees will come back? I have a nice ferry in New York going cheap.

It's the same as the supermarket/mall thing:
- Open up, sell at low prices.
- Drive local shopkeepers out of business.
- Town centre decays but you buy it up cheap.
- Put up prices.
- Nobody goes back to town centre because you are charging very high rents.
De facto monopoly.
 
Upvote
17 (18 / -1)

thinkreal

Ars Scholae Palatinae
700
Is enshittification a good thing if it triggers a desloppification?
Are you equating charging a market price for services with enshittification?
I would think that is when any prompt results in one of three trending responses, two being advertising and one being about Trump.
 
Upvote
-1 (2 / -3)
If China is smart, they will do their utmost to keep providing open weights that approach frontier models as closely as possible. The ROI for them is all worth it x1000.
I’m sure the CCP is spending a fortune to make it trivially easy for lazy, ignorant people to generate reams of code they don’t understand and will never audit purely out of the goodness of their hearts.
 
Upvote
11 (11 / 0)
IMO there will always be a free tier. That's what drives demand from new users, but we're definitely seeing a push to move more people into the paid tiers. I use AI quite a bit despite being retired but so far all of my usage has been in the free tier. I still find it useful at that tier and so far have not used it for anything I would be willing to pay for.

For me to pay for it, it has to either save me significant time, or it has to do something I can't do manually, or it has to make me money (more than I paid for the AI use).

I suspect that over time, most of the free tier will migrate to local AI run on your own personal device. We'll probably have an ad-supported web tier as well (Google already does with Gemini).
 
Upvote
-1 (1 / -2)

fyo

Ars Tribunus Militum
1,728
The new usage-based billing frustrations are being massively compounded by some rather egregious bugs on Microsoft's end. In VSCode, the "open in agents" feature auto-switches to specific models, even if you manually set it to something else. Additionally, I'm currently unable to even select a non-Claude model in the "chat" mode.

This is compounded by some apparent bugs in the "open in agents" feature (which appears to be Microsoft's attempt at a Claude Code) that cause token usage to absolutely explode. Making queries in Copilot is currently consuming roughly 10x as many tokens as the same request directly in Claude Code (both within VSCode). Runtime for Copilot also seems to have increased substantially. This could all just be an "open in agents" bug (and might have been there since its introduction - we just haven't noticed due to the billing being the way it was). I haven't really used the chat sidebar for a while, so it might have similar token usage to Claude Code.

Copilot is completely useless right now, but Claude Code using the same token pricing mechanism (ostensibly) works very well for me.
 
Upvote
9 (9 / 0)
The inefficient tokenmaxxing design of the context window, where it scans your entire chat every time you prompt it instead of condensing it into a smaller summary and using that as context, is driving up costs for the consumer.

There is still major engineering to optimize the LLM stack, and they want us burning even more with autonomous agents? Where is the money coming from for this?

It's bad enough when I'm simply wasting my time with a bad outcome in a vibe coded SQL query that doesn't run or gives me the wrong result and I have to QA for a while. If that blew through my budget for the month I'm not going to be satisfied to keep paying for this slop.
I hate when my prompts are "compacted" during long chats, especially during the planning phase. The first thing to get dropped as irrelevant is somehow, always, something critical or important to the design. Or some nuanced view on something that'll have severe impact downstream. Nope. Much easier to spend extra time rewriting the original prompt to incorporate elements from the first round of results and start anew.
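The cost side of that trade-off can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how any particular tool bills: every turn adds the same number of tokens, only input tokens are counted, and the cost of producing the summary is ignored. The function names and the 20-turn example are hypothetical.

```python
def full_history_input_tokens(turns, tokens_per_turn):
    """Input tokens sent when every request resends the whole chat so far."""
    # Turn k carries k turns' worth of tokens, so the total grows quadratically.
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def compacted_input_tokens(turns, tokens_per_turn, summary_tokens):
    """Input tokens sent when earlier turns are replaced by a fixed-size summary."""
    # The first turn has no history; each later turn sends summary + new prompt.
    return tokens_per_turn + (turns - 1) * (summary_tokens + tokens_per_turn)


# Hypothetical chat: 20 turns of 1,000 tokens each, 2,000-token summary.
full = full_history_input_tokens(20, 1_000)
compact = compacted_input_tokens(20, 1_000, 2_000)

print(f"full history: {full:,} input tokens")
print(f"compacted:    {compact:,} input tokens")
```

With these made-up numbers compaction sends about a quarter of the input tokens, but the sketch says nothing about what gets lost in the summary, which is exactly the complaint here.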
 
Upvote
8 (8 / 0)

fyo

Ars Tribunus Militum
1,728
I hate when my prompts are "compacted" during long chats, especially during the planning phase. The first thing to get dropped as irrelevant is somehow, always, something critical or important to the design. Or some nuanced view on something that'll have severe impact downstream. Nope. Much easier to spend extra time rewriting the original prompt to incorporate elements from the first round of results and start anew.

The structure around the LLMs (the Claude Code, Copilot, etc. bits) still requires a massive amount of development. Forgetting things, not adhering to workflows, etc. It's frustrating. You can get some of the way there with various .md files where you can store things that shouldn't get compacted (further). Still a long way to go, though.
 
Upvote
1 (1 / 0)
The new usage-based billing frustrations are being massively compounded by some rather egregious bugs on Microsoft's end. In VSCode, the "open in agents" feature auto-switches to specific models, even if you manually set it to something else. Additionally, I'm currently unable to even select a non-Claude model in the "chat" mode.

This is compounded by some apparent bugs in the "open in agents" feature (which appears to be Microsoft's attempt at a Claude Code) that cause token usage to absolutely explode. Making queries in Copilot is currently consuming roughly 10x as many tokens as the same request directly in Claude Code (both within VSCode). Runtime for Copilot also seems to have increased substantially. This could all just be an "open in agents" bug (and might have been there since its introduction - we just haven't noticed due to the billing being the way it was). I haven't really used the chat sidebar for a while, so it might have similar token usage to Claude Code.

Copilot is completely useless right now, but Claude Code using the same token pricing mechanism (ostensibly) works very well for me.
I certainly cannot imagine a Microsoft product having a bug that causes the amount of money paid from you to Microsoft inside the product to be ten times what it would cost if paid via other channels. Microsoft is not capable of that level of {cunning; incompetence; malevolence; stupidity; fiendishness}. You must be imagining it.
 
Upvote
10 (10 / 0)
That's not the scary part. The scary part is that their leadership clearly doesn't match your description despite it being their literal job.
That's nothing really new though. Ed Zitron calls them business idiots (though I think he got that term from somewhere else). The business world is filled with them. Especially at larger companies where size and momentum can paper over a lot of incompetence.
 
Upvote
13 (13 / 0)

rell

Wise, Aged Ars Veteran
161
LAWL and so ends the era of "vibe coders". Let's see how long "AI" lasts now that the real cost is out there for consumers (I don't view LLM as AI, as there's no thought, no intelligence, no actual reasoning, only a perception of it). It was already unsustainable in power and hardware consumption alone. Communities have begun pushing back on backyard data centers due to the rising costs of their own bills and failure to produce meaningful jobs. There's not enough power in the world to run these things. Hardware burns out within a couple years, never even making it to its "lifetime". Etc.

For home coders, the better shift is probably to well-trained SLMs that don't rely on the large power-hungry cloud providers. Not everything needs an LLM. If you've been a real dev for a while (read: you CAN write it on your own, you know the shape of what you're trying to solve, it'll just accelerate you knocking out all the boilerplate), you can be plenty effective using smaller models.

Ars, it'd be cool to see a good in-depth article that reviews and runs through the small language models people can run at home on their own hardware. :)
 
Upvote
4 (4 / 0)

Hagen Stein

Ars Scholae Palatinae
696
Subscriptor
I believe very strongly the bubble will pop when prices are raised and the quality doesn't increase. We're really close to that limit now. I just hope people will realize it popped after November.
I think OpenAI's and Anthropic's IPOs will tell us more. For now, we need to believe the numbers they publish. But then they have to disclose them for real.
 
Upvote
-4 (0 / -4)

Madestjohn

Ars Tribunus Angusticlavius
7,802
No I didn't. How can something that's posted for free, online, available to download for anyone, be going away? You do realize that local models are different than the free cloud-based models, right?
Because I’m nearly sixty and have seen countless free stuff online ‘disappear’
 
Upvote
10 (11 / -1)
Ars, it'd be cool to see a good in-depth article that reviews and runs through the small language models people can run at home on their own hardware. :)
I recommend downloading GPT4All or Jan and just messing around with some models from Hugging Face. There are always new models out there, and there are literally thousands that can run on a modern mid-range consumer notebook (e.g. MacBook Pro M5).
 
Upvote
2 (2 / 0)

GrandTheftOttoman

Smack-Fu Master, in training
17
Isn't Musk reputed to have a problem in that department?
A botched gender-confirming surgery left it deformed, which is why he couldn't get people to touch it even if he offered to buy them a horse. Also explains why his children are test-tube babies with names from their vials (X-AE88 or whatever).
 
Upvote
-7 (1 / -8)
I'm surprised by how few engineers and AI proponents are using local LLMs in tools such as LM Studio or Ollama. For many coding and analysis tasks, open models (like GPT-OSS-20B) are very strong and (while not at the top of the list on comparison charts such as AI Arena) are "close enough" for a great many tasks - OSS-20B, for example, ranks almost identically to GPT-4o-Mini.

As others have noted, there have been great strides with smaller models, and something like OSS-20B runs very well (45+ tps) on an older MacBook Pro M2 with 32 GB RAM, and cost me literally nothing for 10k output tokens of "build me a fully-functioning ASCII-based chess program in Python, including en passant and castling, and include a small opening book".

This sticker shock is not surprising, especially for anyone in an organization working with Claude Code. For price-sensitive folks, I think the halcyon days of near-unlimited cheap frontier-model use may be over.
 
Upvote
4 (5 / -1)
Because I’m nearly sixty and have seen countless free stuff online ‘disappear’
Have you ever been to Hugging Face? You can find the original OSS-20b model at https://huggingface.co/openai/gpt-oss-20b . Because it's free and open source, people take the model, and create their own versions of it.

I have never seen an active open source project that was pulled offline like that. Can you provide a link to one?
 
Upvote
5 (5 / 0)

Bongle

Ars Praefectus
4,495
Subscriptor++
I saw a family member on the weekend, and his company was going to move to token-based-billing starting yesterday. He said it was going to be a total shitshow.

Prediction of what comes next:
1) Every company's AI bill for this quarter is going to be insanely huge. They will start clamping down on spending over the next few quarters because effectively doubling your headcount costs for a not-measurable increase in productivity is not good for business.
2) Anthropic, having just started filing for their IPO, will get to appear profitable for this quarter and will point out "LOOK AT THIS INSANE GROWTH IN JUST ONE QUARTER. WE'RE EVEN PROFITABLE!"
3) Anthropic will have a massive, record-shattering IPO. Everyone involved will dump their shares and get very, very rich.
4) All the companies will say "holy shit it's cheaper just to hire fresh grads"
5) Pop!
 
Upvote
12 (12 / 0)

arsisloam

Ars Scholae Palatinae
1,411
Subscriptor
Nitpicking maybe, but I'd put it a bit later, when multi-core 64 bit processors became really mainstream, maybe around 2008ish. Of course there's also the argument that SSDs becoming standard at virtually all levels was a real performance game changer, which would be around 2015ish. I've got 2012 era hardware which has been given a whole new lease of life by swapping the HDDs for SSDs. It really depends if you're focusing on specifics like CPU core speed or overall system performance.
Hah I'm still running a 2012 PC too. It's on its 2nd CPU and 3rd HDD. It's no longer my primary, but it's fine as a living room PC.
 
Upvote
0 (0 / 0)
And my employer is pushing for us to use AI to do more coding. I wonder if they’ll change their tune when the full bill comes due?
Oh, I bet they will, especially at smaller places and some start-ups. Maybe migration to cheaper models will ensue, but real market competition among agentic programming services looks like it's about to get underway now.
 
Upvote
0 (0 / 0)