AI costs how much? GitHub Copilot users react to new usage-based pricing system

crmarvin42

Ars Praefectus
3,229
Subscriptor
It's odd that my AI costs have not changed at all. They are still stubbornly stuck at $0.00
Last month I received an MS Teams message from my boss, asking us who might want a subscription to Copilot. I'd already seen that they were going to start charging on a per-token basis, and had already come to the conclusion that LLMs are not reliable enough for the things I might want to have someone else do for me, and so passed on the offer. I wonder how much my colleagues who did ask for a subscription are actually using theirs. I have spending authority for the R&D money, so I think I need to call my colleague in the lab and see what it's costing her this month.
 

JoHBE

Ars Praefectus
4,393
Subscriptor++
Once a model, or a software program, is open source, and people start using the software and creating forks, I have never seen the software program "fade away". At worst it just goes on archive.org.

There's another understanding of "them going away", which is that the hyperscalers might "plateau" the capabilities of the free models they release, while the frontline cloud models keep improving. Of course, for this to become a real issue, the frontline models will have to keep improving significantly. And even then, a country/regime like China could see a lot of value in undercutting US hyperscalers by ignoring that whole strategy and continuing to open-source something that is 90% there.
 

Bongle

Ars Praefectus
4,495
Subscriptor++
Hugging Face is VC-backed to the tune of several hundred million dollars as well as an undisclosed amount from Amazon and Meta. Its revenue is in the tens of millions of dollars, i.e. it’s incredibly unprofitable. Model weights run from tens to hundreds of gigabytes per download. Serving that amount of data is expensive. Some guy with a Patreon isn’t going to be able to host these models if the big boys decide they don’t want to share any more.
You could torrent them or use other distributed-download systems. If, in this hypothetical, the big guys are cracking down on open-weights models to help armor their monopolies, then it seems like torrents would be basically necessary.

That said, I'd guess one of the big players would just buy a Hugging Face-alike, a la rich folks buying private-jet trackers and then shutting them down.
 

Anoff

Smack-Fu Master, in training
55
I've never even come close to using my Claude Max allocation, despite pretty significant development work, plus quite a bit of chat, but I've been eyeing a 5080 or 5090 to move my coding locally - I have ~$140 monthly spend on AI tools, and even if I have to keep the v0 sub (it's a next app/site visual design tool), my payback period is under a year, maybe a few months more if I'm factoring in the electricity cost.

That said, even though performance, in terms of throughput, might not be noticeably affected, Claude is being improved constantly, while a local model puts the onus on me for upgrades and additional training, which certainly isn't nothing. I'm obviously cost sensitive, but I'm working on client projects that will eventually hit production, so quality matters a lot as well, and I can pass some, if not all, of the cost on to the client (each client currently has a $50/mo software fee to cover the various web services, software licenses, plugins, etc. for their sites, so bumping that a few bucks is always an option).

I just don't want to be late adapting when the other shoe drops; the worst spot to be in would be having no GPU and nothing configured and receiving an email that my Claude sub is now going to be $500 or $1000 a month.
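For anyone wanting to redo the payback estimate with their own numbers, here is a minimal sketch. Only the ~$140 monthly spend comes from the post above; the GPU price and the electricity figure are purely hypothetical placeholders, and `payback_months` is just an illustrative helper name.

```python
def payback_months(hardware_cost, monthly_savings, monthly_electricity=0.0):
    """Months until a one-off hardware purchase is paid back by dropped subscriptions."""
    net_savings = monthly_savings - monthly_electricity
    if net_savings <= 0:
        raise ValueError("no net savings, so the hardware never pays for itself")
    return hardware_cost / net_savings


MONTHLY_AI_SPEND = 140.0   # from the post: ~$140/month on AI tools
GPU_PRICE = 1400.0         # hypothetical GPU price, NOT a quoted figure
ELECTRICITY = 40.0         # hypothetical extra electricity per month

print(f"ignoring electricity: {payback_months(GPU_PRICE, MONTHLY_AI_SPEND):.1f} months")
print(f"with electricity:     {payback_months(GPU_PRICE, MONTHLY_AI_SPEND, ELECTRICITY):.1f} months")
```

Any subscription that has to be kept (such as the v0 sub) should be subtracted from `monthly_savings` before calling the function.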
 
which is that the hyperscalers might "plateau" the capabilities of free models they release, while the frontline cloud models keep improving
For general purposes, I think gpt-oss-20b is good enough. It's good enough to write letters, proofread posts, and handle the things most people use GenAI for. It can also do some programming. The frontier models are more powerful, but use much more energy.
 
Workers will be replaced by AI, businesses will adopt it and it will become essential for their processes, and then the price will increase dramatically until it’s almost/just as expensive as it used to be when meatbags did the work. Except now all that money is being siphoned into a handful of megacorporations instead of going to countless millions of workers in the form of salaries.

This was always the plan. Like Uber burning VC cash to subsidise rides until the taxi industry died, then jacking up the prices to reach profitability.

The billionaires and future trillionaires are no longer content with selling you products and services. They’re now literally coming for your entire salary.

Salary? The billion/trillionaires are coming for your person as property. They'll then eliminate most of us to increase the value of the ones that remain and cut their costs on the rest.
 

fyo

Ars Tribunus Militum
1,728
Just to stick in a BOTE (back-of-the-envelope) calculation for the best open-weights model for coding (not quite Opus, but better than pretty much everything else for anything not multimodal):

$50k/month for, e.g., a CoreWeave cloud solution (+60% for AWS) consisting of an 8xB200 node with all the trimmings. That supports 120 concurrent users going full-bore, and many times that with non-maxed usage rates, especially if you can tolerate a slight drop in throughput. That's less than $500 / month / user, for the absolute worst-case scenario with 24/7 availability and every user maxing their usage 100% of the time.

If you were willing to commit a year in advance, you could get an AWS node for maybe as low as $40k / month (including data transport and storage fees). An AI specialist would be even cheaper with that commitment.

Obviously not cheap, but not unrealistic for a company with a large dev operation - or a group of startups.
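The per-user arithmetic above can be sketched in a few lines. The dollar figures and the 120-user count are the ones from this post; the function and constant names are just illustrative, and the AWS on-demand price is derived from the "+60%" remark rather than from any quoted price list.

```python
def cost_per_user(node_cost_per_month, concurrent_users):
    """Worst-case monthly cost per user when every seat is maxed out 24/7."""
    return node_cost_per_month / concurrent_users


USERS = 120                             # concurrent full-bore users per 8xB200 node
COREWEAVE_MONTHLY = 50_000              # $50k/month figure from the post
AWS_ON_DEMAND = COREWEAVE_MONTHLY * 1.6 # "+60% for AWS"
AWS_COMMITTED = 40_000                  # one-year commitment estimate from the post

for label, node_cost in [
    ("CoreWeave", COREWEAVE_MONTHLY),
    ("AWS on-demand", AWS_ON_DEMAND),
    ("AWS 1-year commit", AWS_COMMITTED),
]:
    print(f"{label}: ${cost_per_user(node_cost, USERS):.2f} / month / user")
```

With non-maxed usage the same node serves more people, so raising `USERS` shows how quickly the per-seat cost drops.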
 

jock2nerd

Ars Praefectus
4,807
Subscriptor
Traditionally, the best way to make money in a gold rush is to sell tools
Unfortunately, and as nVidia and Micron and their kin are about to find out, when the gold rush runs its course, all that sweet money made selling tools, whisky and whores can stop very suddenly.
 

AI_Skeptic

Wise, Aged Ars Veteran
363
I love how all these models are using "tokens" to obscure the actual dollar amount. Pretty great marketing when nobody is talking about a query being X amount of cash, but instead using a somewhat inscrutable digital token that people can't easily quantify.
Sadly, tokens are what is used, and one token is not equal to one word.
 

jock2nerd

Ars Praefectus
4,807
Subscriptor
I've never even come close to using my Claude Max allocation, despite pretty significant development work, plus quite a bit of chat, but I've been eyeing a 5080 or 5090 to move my coding locally - I have ~$140 monthly spend on AI tools, and even if I have to keep the v0 sub (it's a next app/site visual design tool), my payback period is under a year, maybe a few months more if I'm factoring in the electricity cost.

That said, even though performance, in terms of throughput, might not be noticeably affected, Claude is being improved constantly, while a local model puts the onus on me for upgrades and additional training, which certainly isn't nothing. I'm obviously cost sensitive, but I'm working on client projects that will eventually hit production, so quality matters a lot as well, and I can pass some, if not all, of the cost on to the client (each client currently has a $50/mo software fee to cover the various web services, software licenses, plugins, etc. for their sites, so bumping that a few bucks is always an option).

I just don't want to be late adapting when the other shoe drops; the worst spot to be in would be having no GPU and nothing configured and receiving an email that my Claude sub is now going to be $500 or $1000 a month.
Actually the worst spot is when the subscription (effectively) goes away, you are charged for usage, and you are paying several thousand dollars a month, which might be enough to justify hiring someone to control usage and minimize LLM costs.
...probably not enough to justify for one person, but for a team of people each potentially spending several thousand dollars, it'd be a no-brainer to hire someone, for each team, to control the usage.
 

Hispalensis

Ars Tribunus Militum
1,924
Subscriptor
If you assume that the electricity cost of 1 kWh is $0.15, running something that uses 1 kW of power for a month is roughly $100. That's less than 2xH100. If doing inference in GPT-5 takes 10 GPUs, that's $500/month. That's just the cost of running the GPUs, with no operating costs and no cooling costs, for a single inference point.

This is what is going to pop the bubble: current AI usage is so subsidized that unless you develop much better smaller models or efficient agents, most of the current uses are simply not economically viable.
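The electricity estimate is easy to reproduce. A rough sketch, assuming a 30-day month and using the $0.15/kWh price from the post; the 0.5 kW per GPU figure is only what the post's rounding implies (1 kW being "less than 2xH100"), not a measured power draw, and the names are illustrative.

```python
PRICE_PER_KWH = 0.15       # $/kWh, from the post
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month running 24/7


def monthly_electricity_cost(power_kw, price_per_kwh=PRICE_PER_KWH):
    """Electricity cost in dollars of a constant load running for one month."""
    return power_kw * HOURS_PER_MONTH * price_per_kwh


ASSUMED_KW_PER_GPU = 0.5   # implied by the post's rounding, not a datasheet value
GPUS = 10

print(f"1 kW load: ${monthly_electricity_cost(1.0):.2f} / month")
print(f"{GPUS} GPUs:   ${monthly_electricity_cost(GPUS * ASSUMED_KW_PER_GPU):.2f} / month")
```

Both results land close to the rounded $100 and $500 figures; cooling and other operating costs would come on top, as noted above.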
 

TheMatrixHoosier

Smack-Fu Master, in training
2
But is one actually cheaper or more expensive than the other to run, as opposed to the currently-charged price?

That's the issue at question, I think. If AI 1 raises prices then people go to AI 2; AI 2 raises prices and they go to 3; etc. Do they eventually run out of "cheap" AI?

(Someone up-thread mentioned Deepseek being cheaper per token, but I have to wonder if they're trying the same playbook with AI as rare-earths.)
I suspect DeepSeek V4 is orders of magnitude cheaper in part because it's still a text-based model, whereas all the US frontier models are MLLMs. It's a model that doesn't use all the bells and whistles, while our companies run ever more expensive models. Chinese MLLMs are cheaper, but not on the scale of DeepSeek's latest price cut (see Kimi or GLM 5). Many US companies may have reservations about using any of these companies though...

Not that the logic for US companies is much stronger... how does one trust a business that is literally built on IP they never paid for? "Hey, we took everything we could using any means possible and trained on it, but pinky promise, we won't use your data." I run academic studies for publication in the public domain, so it's not a risk I take too seriously, but if I ran a business with important proprietary information, I wouldn't let it touch any of these companies.

There are great open models that can be run in the cloud for a fraction of the cost of the frontier models. They don't come with the tooling, plug-ins, and easy-to-use interface of the VC-funded frontier models, but might be a serious contender for a business with a real, genuine use case for deploying LLMs and the engineering talent to put it in place.
 

fyo

Ars Tribunus Militum
1,728
If you assume that the electricity cost of 1 kWh is $0.15, running something that uses 1 kW of power for a month is roughly $100. That's less than 2xH100. If doing inference in GPT-5 takes 10 GPUs, that's $500/month. That's just the cost of running the GPUs, with no operating costs and no cooling costs, for a single inference point.

This is what is going to pop the bubble: current AI usage is so subsidized that unless you develop much better smaller models or efficient agents, most of the current uses are simply not economically viable.

Citation needed.

For actual numbers, industry publications like SemiAnalysis¹ estimate Anthropic's margins at 70% this year, up from about half that last year.

We can debate what's included in these numbers, but these models aren't as massively subsidized as people seem to think. Microsoft had a terrible pricing structure, but that's not anyone else's fault.
 
The key term in the post you quoted is "Local LLMs". Cloud LLMs are "better" than local LLMs today, but local LLMs have matched the course and speed of cloud LLMs pretty well, and there is no reason to believe this trend will not continue.

A model which can be run locally must still be trained by someone. Today companies like Meta, DeepSeek, Mistral and others do this. Will they continue forever?
 

bbottema

Seniorius Lurkius
47
Subscriptor++
I'm one of the power users of GitHub Copilot; I've been using it extensively on some projects of my own. Frankly speaking, it's been crazy how many continuous hours you could previously use it with models like Sonnet (medium) for a low flat fee. They nerfed it, and indeed I have now also burned through a 'month' worth of credit in a single day. However, I've upgraded to the 100 bucks plan and it seems I can again spend a crazy number of hours/tokens before that AI credits budget is gone.

Had I been on the pay-as-you-go scheme, it would have cost a small fortune. I'm not surprised they are moving their sales targets.
 