AI costs how much? GitHub Copilot users react to new usage-based pricing system

crmarvin42 · 2026-06-02T14:22:27-0400

Prostetnic said:
It's odd that my AI costs have not changed at all. They are still stubbornly stuck at $0.00

Last month I received an MS Teams message from my boss asking who might want a subscription to Copilot. I'd already seen that they were going to start charging on a per-token basis, and had already concluded that LLMs are not reliable enough for the things I might want to have someone else do for me, so I passed on the offer. I wonder how much my colleagues who did ask for a subscription are actually using theirs. I have spending authority for the R&D money, so I think I need to call my colleague in the lab and see what it's costing her this month.

JoHBE · 2026-06-02T14:33:37-0400

AI_Skeptic said:
Once a model, or a software program, is open source, and people start using the software and creating forks, I've never seen the software program "fade away". At worst it just goes on archive.org

There's another understanding of "them going away", which is that the hyperscalers might "plateau" the capabilities of the free models they release, while the frontline cloud models keep improving. Of course, for this to become a real issue, the frontline models will have to keep improving significantly. And even then, a country/regime like China could see a lot of value in undercutting US hyperscalers by ignoring that whole strategy and continuing to open-source something that is 90% there.

winstonsmith84 · 2026-06-02T14:37:42-0400

I remember when I used to be able to buy RAM. What a time that was to be alive.

JoHBE · 2026-06-02T14:41:55-0400

winstonsmith84 said:
I remember when I used to be able to buy RAM. What a time that was to be alive.

You were probably born with two kidneys, with only ONE strictly necessary, so what's your problem, exactly?

Bongle · 2026-06-02T14:47:43-0400

Geebs said:
Hugging Face is VC-backed to the tune of several hundred million dollars as well as an undisclosed amount from Amazon and Meta. Its revenue is in the tens of millions of dollars i.e. it’s incredibly unprofitable. Model weights run from tens to hundreds of gigabytes per download. Serving that amount of data is expensive. Some guy with a patreon isn’t going to be able to host these models if the big boys decide they don’t want to share any more.

You could torrent them or use other distributed-download systems. If, in this hypothetical, the big guys are cracking down on open-weights models to help armor their monopolies, then it seems like torrents would be basically necessary.

That said, I'd guess one of the big players would just buy a Hugging Face-alike, a la rich folks buying private-jet trackers and then shutting them down.

Anoff · 2026-06-02T15:04:29-0400

I've never even come close to using my Claude Max allocation, despite pretty significant development work, plus quite a bit of chat, but I've been eyeing a 5080 or 5090 to move my coding locally - I have ~$140 monthly spend on AI tools, and even if I have to keep the v0 sub (it's a next app/site visual design tool), my payback period is under a year, maybe a few months more if I'm factoring in the electricity cost.

That said, even though performance, in terms of throughput, might not be noticeably affected, Claude is being improved constantly, while a local model puts the onus on me for upgrades and additional training, which certainly isn't nothing. I'm obviously cost sensitive, but I'm working on client projects that will eventually hit production, so quality matters a lot as well, and I can pass some, if not all, of the cost on to the client (each client currently has a $50/mo software fee to cover the various web services, software licenses, plugins, etc, for their sites, so bumping that a few bucks is always an option).

I just don't want to be late adapting when the other shoe drops; the worst spot to be in would be having no GPU and nothing configured and receiving an email that my Claude sub is now going to be $500 or $1000 a month.
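To put rough numbers on the payback estimate above, here is a minimal sketch. The only figure taken from the post is the ~$140 monthly spend; the function names are made up for illustration, and the sketch assumes the whole $140 would be saved, which overstates savings if the v0 sub stays and electricity is counted.

```python
def payback_months(hardware_cost: float, monthly_savings: float) -> float:
    """Months until a one-time hardware purchase is covered by monthly savings."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return hardware_cost / monthly_savings


def max_hardware_budget(monthly_savings: float, months: float) -> float:
    """Largest one-time spend that still pays back within `months`."""
    return monthly_savings * months


if __name__ == "__main__":
    monthly_spend = 140.0  # from the post: ~$140/month on AI tools

    # Budget ceiling for a payback period of exactly one year,
    # assuming (optimistically) the full spend goes away.
    print(f"12-month budget ceiling: ${max_hardware_budget(monthly_spend, 12):,.0f}")

    # Any GPU price plugged in here is a placeholder, not a quoted price.
    hypothetical_gpu_price = 1_500.0
    print(f"Payback: {payback_months(hypothetical_gpu_price, monthly_spend):.1f} months")
```

Every dollar of retained subscription or electricity cost reduces `monthly_savings` and stretches the payback period, which matches the "maybe a few months more" caveat in the post.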

AI_Skeptic · 2026-06-02T15:06:47-0400

JoHBE said:
which is that the hyperscalers might "plateau" the capabilities of free models they release, while the frontline cloud models keep improving

For general purposes, I think gpt-oss-20b is good enough. It's good enough to write letters, proofread posts, and do the things most people use GenAI for. It can also do some programming. The frontier models are more powerful, but use much more energy.

s73v3r · 2026-06-02T15:07:25-0400

jeffbax said:
Sorry but this framing is disingenuous. Before there were no agents doing actual work, and now there are.

That’s the rub.

The previous plans were not built for what these things are capable of doing now (aka building apps end to end)

Citation Needed.

TechnicaGratiaArs · 2026-06-02T15:08:05-0400

SlowmoDojo said:
Workers will be replaced by AI, businesses will adopt it and it will become essential for their processes, and then the price will increase dramatically until it’s almost/just as expensive as it used to be when meatbags did the work. Except now all that money is being siphoned into a handful of megacorporations instead of going to countless millions of workers in the form of salaries.

This was always the plan. Like Uber burning VC cash to subsidise rides until the taxi industry died, then jacking up the prices to reach profitability.

The billionaires and future-trillionaires are no longer content with selling you products and services. They’re now literally coming for your entire salary.

Salary? The billion/trillionaires are coming for your person as property. They'll then eliminate most of us to increase the value of the ones that remain and cut their costs on the rest.

fyo · 2026-06-02T15:17:24-0400

Just to stick a BOTE calculation for the best open weights model for coding (not quite Opus, but better than pretty much everything else for anything not multimodal):

$50k/month for e.g. Coreweave cloud solution (+60% for AWS) consisting of an 8xB200 node with all the trimmings. 120 concurrent users going full-bore, many times that with non-maxed usage rates, especially if you can tolerate a slight drop in throughput. That's less than $500 / month / user, for the absolute worst case scenario with 24/7 availability and every user maxing their usage 100% of the time.

If you were willing to commit a year in advance, you could get an AWS node for maybe as low as $40k / month (including data transport and storage fees). A specialist AI cloud provider would be even cheaper with that commitment.

Obviously not cheap, but not unrealistic for a company with a large dev operation - or a group of startups.
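The back-of-the-envelope numbers above can be checked with a few lines. This is only a sketch of the arithmetic: the $50k node cost, the +60% AWS markup, the $40k committed price, and the 120 concurrent users are taken from the post, while the function and variable names are invented for illustration.

```python
def cost_per_user(monthly_node_cost: float, concurrent_users: int) -> float:
    """Monthly cost per user when one node is shared by `concurrent_users`."""
    if concurrent_users <= 0:
        raise ValueError("concurrent_users must be positive")
    return monthly_node_cost / concurrent_users


if __name__ == "__main__":
    users = 120                              # worst case: everyone maxed out 24/7
    coreweave = 50_000.0                     # 8xB200 node, per month
    aws_on_demand = coreweave * 1.60         # "+60% for AWS"
    aws_committed = 40_000.0                 # one-year commitment estimate

    for label, cost in [
        ("Coreweave", coreweave),
        ("AWS on demand", aws_on_demand),
        ("AWS committed", aws_committed),
    ]:
        print(f"{label}: ${cost_per_user(cost, users):,.2f} / month / user")
```

On these inputs the "less than $500" figure holds for the Coreweave and committed-AWS prices (about $417 and $333 per user), while the on-demand AWS price works out to roughly $667 per user.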

jock2nerd · 2026-06-02T15:19:53-0400

TheMolesRevenge said:
Traditionally, the best way to make money in a gold rush is to sell tools

Unfortunately, and as nVidia and Micron and their kin are about to find out, when the gold rush runs its course, all that sweet money made selling tools, whisky and whores can stop very suddenly.

nixonismyhero · 2026-06-02T15:34:04-0400

I love how all these models are using "tokens" to obscure the actual dollar amount. Pretty great marketing when nobody is talking about a query being X amount of cash, but instead using a somewhat inscrutable digital token that people can't easily quantify.

AI_Skeptic · 2026-06-02T15:36:48-0400

nixonismyhero said:
I love how all these models are using "tokens" to obscure the actual dollar amount. Pretty great marketing when nobody is talking about a query being X amount of cash, but instead using a somewhat inscrutable digital token that people can't easily quantify.

Sadly, tokens are what is used, and one token is not equal to one word.

jock2nerd · 2026-06-02T15:37:34-0400

Anoff said:
I've never even come close to using my Claude Max allocation, despite pretty significant development work, plus quite a bit of chat, but I've been eyeing a 5080 or 5090 to move my coding locally - I have ~$140 monthly spend on AI tools, and even if I have to keep the v0 sub (it's a next app/site visual design tool), my payback period is under a year, maybe a few months more if I'm factoring in the electricity cost.

That said, even though performance, in terms of throughput, might not be noticeably affected, Claude is being improved constantly, while a local model puts the onus on me for upgrades and additional training, which certainly isn't nothing. I'm obviously cost sensitive, but I'm working on client projects that will eventually hit production, so quality matters a lot as well, and I can pass some, if not all, of the cost on to the client (each client currently has a $50/mo software fee to cover the various web services, software licenses, plugins, etc, for their sites, so bumping that a few bucks is always an option).

I just don't want to be late adapting when the other shoe drops; the worst spot to be in would be having no GPU and nothing configured and receiving an email that my Claude sub is now going to be $500 or $1000 a month.

Actually, the worst spot is when the subscription (effectively) goes away and you are charged for usage, paying several thousand dollars a month, which might be enough to justify hiring someone to control usage and minimize LLM costs.
...probably not enough to justify for one person, but for a team of people each potentially spending several thousand dollars, it'd be a no-brainer to hire someone, for each team, to control the usage.

JohnDeL · 2026-06-02T15:39:42-0400

AI_Skeptic said:
Sadly, tokens are what is used, and one token is not equal to one word.

So ... some tokens are more equal than others?

Hispalensis · 2026-06-02T15:48:47-0400

If you assume that the electricity cost of 1 kWh is $0.15, running something that uses 1 kW of power for a month is roughly $100. That's less than 2xH100. If doing inference on GPT-5 takes 10 GPUs, that's $500/month. That's just the cost of running the GPUs, no operating costs, no cooling costs, for a single inference point.

This is what is going to pop the bubble, current AI usage is so subsidized that unless you develop much better smaller models or efficient agents, most of the current uses are simply not economically viable.
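The electricity arithmetic above can be reproduced as follows. This is a sketch under stated assumptions: the $0.15/kWh rate and the 10-GPU count come from the post, a 30-day month is assumed, and the 0.7 kW per-GPU draw is an assumption based on the H100 SXM's rated 700 W maximum, not a measured figure. Function names are illustrative.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, running 24/7


def monthly_electricity_cost(power_kw: float, price_per_kwh: float = 0.15) -> float:
    """Electricity cost in dollars for a constant load of `power_kw` over one month."""
    return power_kw * HOURS_PER_MONTH * price_per_kwh


def implied_power_per_gpu_kw(monthly_cost: float, gpus: int,
                             price_per_kwh: float = 0.15) -> float:
    """Per-GPU draw (kW) that would produce `monthly_cost` across `gpus` GPUs."""
    return monthly_cost / (gpus * HOURS_PER_MONTH * price_per_kwh)


if __name__ == "__main__":
    # 1 kW for a month: the "roughly $100" figure.
    print(f"1 kW for a month: ${monthly_electricity_cost(1.0):.2f}")

    # Ten GPUs at an assumed 0.7 kW each.
    print(f"10 GPUs at 0.7 kW: ${monthly_electricity_cost(10 * 0.7):.2f}")

    # What per-GPU draw does the post's $500/month figure imply?
    print(f"Implied draw: {implied_power_per_gpu_kw(500.0, 10):.2f} kW per GPU")
```

At full rated power the ten-GPU figure would come out somewhat above $500 (about $756), so the post's number corresponds to an average draw of a bit under 0.5 kW per GPU. Either way, the order of magnitude is the same.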

RuralNinja · 2026-06-02T16:00:04-0400

You love to see it. This bubble can't burst soon enough.

TheMatrixHoosier · 2026-06-02T16:00:53-0400

Boskone said:
But is one actually cheaper or more expensive than the other to run, as opposed to the currently-charged price?

That's the issue at question, I think. If AI 1 raises prices then people go to AI 2; AI 2 raises prices and they go to 3; etc. Do they eventually run out of "cheap" AI?

(Someone up-thread mentioned Deepseek being cheaper per token, but I have to wonder if they're trying the same playbook with AI as rare-earths.)

I suspect DeepSeek V4 is orders of magnitude cheaper in part because it's still a text-based model, whereas all the US frontiers are MLLMs. It's a model that doesn't use all the bells and whistles, while our companies run ever more expensive models. Chinese MLLMs are cheaper but not on the scale of DeepSeek's latest price cut--see Kimi or GLM 5. Many US companies may have reservations about using any of these companies though...

Not that the logic for US companies is much stronger... how does one trust a business that is literally built on IP they never paid for? "Hey, we took everything we could using any means possible and trained on it, but pinky promise, we won't use your data." I run academic studies for publication in the public domain, so it's not a risk I take too seriously, but if I ran a business with important proprietary information, I wouldn't let it touch any of these companies.

There are great open models that can be run in the cloud for a fraction of the cost of the frontier models. They don't come with the tooling, plug-ins, and easy-to-use interface of the VC-funded frontier models, but might be a serious contender for a business with a real, genuine use case for deploying LLMs and the engineering talent to put it in place.

fyo · 2026-06-02T16:04:14-0400

Hispalensis said:
If you assume that the electricity cost of 1 kWh is $0.15, running something that uses 1 kW of power for a month is roughly $100. That's less than 2xH100. If doing inference on GPT-5 takes 10 GPUs, that's $500/month. That's just the cost of running the GPUs, no operating costs, no cooling costs, for a single inference point.

This is what is going to pop the bubble, current AI usage is so subsidized that unless you develop much better smaller models or efficient agents, most of the current uses are simply not economically viable.

Citation needed.

For actual numbers, industry publications like SemiAnalysis¹ estimate Anthropic's margins at 70% this year, up from about half that last year.

We can debate what's included in these numbers, but these models aren't as massively subsidized as people seem to think. Microsoft had a terrible pricing structure, but that's not anyone else's fault.

Vastin · 2026-06-02T16:10:16-0400

Wait, you mean trillions of dollars of infrastructure doesn't magically pay for itself? The sticker shock has only just begun. These are the usage rates they are using to 'test the waters' - they can't afford to spook users with the full pricing until after their upcoming IPOs.

hanharal · 2026-06-02T16:13:43-0400

Resistance said:
The key term in the post you quoted is "Local LLMs", cloud LLMs are "better" than Local LLMs today, but Local LLMs have matched the course and speed of cloud LLMs pretty well, there is no reason to believe this trend will not continue.

A model which can be run locally must still be trained by someone. Today companies like Meta, Deepseek, Mistral and others do this. Will they continue forever?

bbottema · 2026-06-02T16:14:28-0400

I'm one of the power users of GitHub Copilot; I've been using it extensively on some projects of my own. Frankly speaking, it's been crazy how many continuous hours you could previously use it with models like Sonnet (medium) for a low flat fee. They nerfed it, and indeed I have now also burned through a 'month's' worth of credit in a single day. However, I've upgraded to the 100 bucks plan and it seems I can again spend a crazy number of hours/tokens before that AI credits budget is gone.

Had I been on the pay-as-you-go scheme, it would have cost a small fortune. I'm not surprised they are moving their sales targets.
