AMD will bring its “Ryzen AI” processors to standard desktop PCs for the first time

I've yet to meet someone who actually wants a Copilot+ system. Like, I guess in theory they must exist somewhere. But I haven't seen one in person.
Given the recent MS news that they banned "Microslop" on their Discord and then shut it down, we need a similar rebrand of Copilot to express our disdain.

I vote CoPlop from Microslop to really drive the knife in.
 
Upvote
0 (2 / -2)

islane

Ars Scholae Palatinae
926
Subscriptor
I'm not sold on the use case for an NPU in a desktop. For a mobile device without a discrete GPU, I can see it being useful for running AI functions that don't require much performance, but on the desktop side, only the very bottom tier of machines is likely to lack a discrete GPU for most consumers.

Last I heard, NPUs are FAR weaker at AI tasks than pretty much any GPU. So the only use case I could see would be running a local AI model that requires more memory than the VRAM on your GPU. But for a local model with that sort of demand, will an NPU have enough performance to even be relevant? Can I run that 64GB local model that my 5090 can't run due to memory constraints on my NPU if I have 64GB of system RAM? I tend to doubt it. (I don't have a 5090, just making my point.)

So from my perspective, either an NPU can run a bigger model than a GPU when you have enough system RAM, in which case it's potentially relevant, or it can't, in which case it's a waste of silicon that I would rather not pay for.

Are they planning to continue selling desktop CPUs without an NPU? If so, then OK, non-issue. If not, then we're looking at AMD raising prices for wasted silicon.
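For what it's worth, the memory side of that question is just arithmetic. A rough sketch (assuming a dense model and simple weight-only quantization; the capacities are illustrative, not tied to any specific product):

```python
# Rough check of whether a dense LLM's weights fit in a given memory pool.
# Ignores KV cache, activations, and runtime overhead, so real needs are higher.

def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in GB for a dense model."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

for params, bits in [(7, 4), (7, 16), (70, 4), (70, 16)]:
    size = weights_gb(params, bits)
    print(f"{params}B @ {bits}-bit ≈ {size:.0f} GB | "
          f"fits 24 GB VRAM: {size <= 24} | fits 64 GB system RAM: {size <= 64}")
```

Whether the NPU then has the compute and bandwidth to chew through that much memory at a usable speed is the separate question.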

The NPU is weaker, but supposedly far more efficient for the imagined use case. The idea is that an integrated NPU can perform a relatively basic LLM/inference-related task more efficiently than the usual "APU" design of CPU + iGPU. On paper, sure, this checks out. In reality... I think there is little utility for the average consumer or business in the imagined scenario of accelerating compute for a small local model efficiently.

Why is it useless? Well, the tiny local models that could make sense would have to be highly tailored for any real utility at a size that can fit (like an open 7B model or smaller). A story in The Economist last month covered this and explained (with survey data) that only a small number of workers use "AI" daily, and that the vast majority of these users only use LLMs via cloud providers, as a glorified search/reference system. The tiny local models that fit within 4-16 GB are terrible for this particular task - few are web-search/agent enabled, plus small models lack context and hallucinate more.

So the stated goal of power efficiency for a local model is all but pointless when the average model that might actually provide utility is larger than the device can realistically run. Generally speaking, I think you are correct that NPUs are wasted silicon. This functionality would make more sense baked into a power-hungry iGPU on all of these processors - even if that scenario means less efficiency for the rare user that actually needs the NPU functionality. At least the silicon would be more likely to be utilized in that scenario.
 
Upvote
7 (9 / -2)

khumak50

Ars Tribunus Militum
1,573
The NPU is weaker, but supposedly far more efficient for the imagined use case. The idea is that an integrated NPU can perform a relatively basic LLM/inference-related task more efficiently than the usual "APU" design of CPU + iGPU. On paper, sure, this checks out. In reality... I think there is little utility for the average consumer or business in the imagined scenario of accelerating compute for a small local model efficiently.

Why is it useless? Well, the tiny local models that could make sense would have to be highly tailored for any real utility at a size that can fit (like an open 7B model or smaller). A story in The Economist last month covered this and explained (with survey data) that only a small number of workers use "AI" daily, and that the vast majority of these users only use LLMs via cloud providers, as a glorified search/reference system. The tiny local models that fit within 4-16 GB are terrible for this particular task - few are web-search/agent enabled, plus small models lack context and hallucinate more.

So the stated goal of power efficiency for a local model is all but pointless when the average model that might actually provide utility is larger than the device can realistically run. Generally speaking, I think you are correct that NPUs are wasted silicon. This functionality would make more sense baked into a power-hungry iGPU on all of these processors - even if that scenario means less efficiency for the rare user that actually needs the NPU functionality. At least the silicon would be more likely to be utilized in that scenario.
Yeah, I get the feeling that this is almost entirely hype-driven. AI is all the rage right now, so this lets AMD do a press release that basically says, "see, we're selling more AI-related thingies." Yay. It's fine if you want to make money on AI, but if that's your goal you probably need to convince more hyperscalers to fork over the big money for your data center GPUs. So far they're making some progress on that front, but a lot less than I think most of their investors were expecting.

I also think their AI and GPU focus has led AMD to kind of take their eye off the ball a little on CPUs and let Intel back into the game. Intel still exists. They're not going to just roll over and die. They seem to actually be producing some good CPUs now. I think for the past year or so AMD has been focused almost exclusively on competing with Nvidia on the AI front (which is mostly GPU-focused) rather than on Intel. I also think that, just like Nvidia, they have mostly ignored the consumer segment. It's an afterthought after they're done with whatever they're doing for AI.

As an investor I guess I'm fine with that. I've made an absurd amount of money on Nvidia. As a consumer I think we're in for a year or two of dark times without really any compelling upgrade options.
 
Upvote
3 (3 / 0)
According to TechPowerUp, AMD are announcing non-PRO AI versions as well.



and also (from a different page):

Techpowerup said:

To carve out the Ryzen AI 7 450 series processor models, AMD configured the silicon with 4 "Zen 5" and 4 "Zen 5c" cores. The Ryzen AI 5 440 series chips are configured with 3 "Zen 5" and 3 "Zen 5c" cores. The Ryzen AI 5 435 series chips come with 2 "Zen 5" and 4 "Zen 5c" cores
 
Upvote
4 (4 / 0)

lolnova

Ars Scholae Palatinae
1,059
According to TechPowerUp, AMD are announcing non-PRO AI versions as well.


and also (from a different page):
So they sacrificed a ton of potentially useful die area (meaning full Zen 5 cores, and/or GPU compute units to match or exceed the 8700G [edit: and/or an increase in L3 over the 5000G, ffs]) for that fucking NPU.

I'm so sick of C-suite dolts high on their own genAI farts. STOP RUINING THINGS.
 
Last edited:
Upvote
0 (1 / -1)

Fatesrider

Ars Legatus Legionis
25,280
Subscriptor
Can't wait for the day marketing stops using the 'AI' moniker...
The Kool-Aid has been flowing among the corporate tech world, but if you peel off the layer of hype, the hysteria is not far below it.

The problem is that AI STILL has not made a profit. And as far as I can tell, thus far, there's no way to make it profitable. That has NOT changed.

But so much cash has been thrown at this bullshit that when it implodes, it's taking most of the tech hardware world with it. All those deals with the tech companies will suck the revenue out of them, sooner or later, with it being tossed into the literal fiscal furnace AI is fueled on. It can't make a profit, so it can never pay back the investment that's been put into it (at least $1.5 trillion thrown at it so far, possibly $2+ trillion total now with all the tech deals).

That cash went into data centers and power stations that, because of their design and means of production, have no useful purpose other than burning cash to "power AI".

So, when that happens, the AI moniker will be pretty much viewed with the same love and affection as the swastika is today, because the global depression brought about by the collapse of AI will be destabilizing to everyone around the world and most impactful on those who bought into it in the first place.
 
Upvote
6 (7 / -1)

evan_s

Ars Tribunus Angusticlavius
7,427
Subscriptor
Click to Do has been pretty useful. Similar to Google's circle to search.

I think NPUs will be useful for offloading "AI" OCR, voice recognition, and video processing. Not every use of "AI" needs to be LLM chatbots, and Microsoft could integrate things like better OCR into search and provide actually useful improvements.

Yeah. I think on Windows this issue is largely software. I believe Teams can use the NPU for virtual backgrounds and/or background noise removal for the audio, but it's about the only thing that does. Nvidia has NVIDIA Broadcast, which is their own app that uses the AI cores on their GPUs to accomplish those same types of things. That could be made available on NPUs or other brands' GPUs, but it's kept as an exclusive feature in software.

On Apple, the NPU gets used for quite a few first-party things, and it probably even gets used by third-party apps automatically, because Apple has built APIs that use the best available compute resource. Most of it isn't the AI chatbots or image generation that make the AI headlines in the news, but it is genuinely useful stuff.
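For illustration, the Apple-side plumbing looks roughly like this (a minimal sketch using the coremltools Python package; the native Core ML API exposes the same compute-unit knob, and the model file and input name here are hypothetical):

```python
# Minimal sketch: load a Core ML model and let the framework pick the compute
# device. "model.mlpackage" and the "image" input name are hypothetical.
import coremltools as ct

# ComputeUnit.ALL lets Core ML schedule work across CPU, GPU, and the
# Neural Engine (the NPU); CPU_AND_NE would restrict it to CPU + NPU.
model = ct.models.MLModel("model.mlpackage",
                          compute_units=ct.ComputeUnit.ALL)

# result = model.predict({"image": preprocessed_input})  # input names vary by model
```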
 
Upvote
8 (9 / -1)

42Kodiak42

Ars Scholae Palatinae
1,439
There's one thing that I think is a damn shame with all of this AI business, and NPUs in particular: I'm very much put off from installing an NPU in my computer now, despite the fact that I regularly use an application that might utilize one (Bigscreen Beyond's eye tracking, which uses an ML model to work out eye positions from a camera that fits in the HMD's form factor, although I don't know if it's actually set up to use an NPU).

But unfortunately, Microsoft have had such exceedingly bad ideas for built-in AI software that I'm hesitant to even hook up that kind of hardware to a Windows machine.
 
Upvote
2 (2 / 0)

DanNeely

Ars Legatus Legionis
16,117
Subscriptor
Given the recent MS news that they banned "Microslop" on their Discord and then shut it down, we need a similar rebrand of Copilot to express our disdain.

I vote CoPlop from Microslop to really drive the knife in.

I've been snarking CoPilo💩 for a while, but haven't managed to get any traction for it.
 
Upvote
1 (2 / -1)

Thegs

Ars Scholae Palatinae
907
Subscriptor++
Since it looks like we're going to be saddled with NPUs in our hardware from now on, is there anything useful (i.e. NOT AI) that they can be used for?
The singular use case that has impressed me is generative fill in photo-editing tools. The fill tool is a lot faster and more accurate in Paint running on my ARM Surface laptop than Photoshop CS6 ever was running on my x86 desktop. I believe the generative fill feature of Paint is only available if you have an NPU in your CPU.
 
Upvote
2 (2 / 0)
Since it looks like we're going to be saddled with NPUs in our hardware from now on, is there anything useful (i.e. NOT AI) that they can be used for?
Apple's been using them for loads of stuff since they first added them to the iPhone back in 2017. Face ID was the first main use, but, for instance, Metal 4 uses the NPU to do DLSS-style upscaling instead of having the GPU do it - since you probably aren't using the NPU for much anyway while playing a game. But x86 won't be able to do that, since you have the GPU and NPU on opposite sides of a slow PCIe bus, and then you have to ship the output back across it because the monitor is on the other side again.

Apple uses it for voice isolation in things like FaceTime, the OS feature that does OCR on all images so you can search and copy text out of photos, and so on. Basically, it's on the OS to make use of that compute, ideally through frameworks and APIs that apps would normally utilize, because most developers aren't going to make their own tailored AI models for object detection, etc. Windows sits sort of between macOS and Linux in terms of providing those kinds of standardized services, so unless we start to get some pretty interesting changes to GNU tools, I'm guessing Linux users won't see much from it, and Microsoft seems to have gone really hard on LLM use and not so much on the other stuff.

Like, you need to clarify what 'not AI' means. FaceID is AI. Is that off limits? All modern OCR is AI. Is that off limits?
 
Last edited:
Upvote
12 (13 / -1)

SraCet

Ars Legatus Legionis
17,007
The NPU is weaker, but supposedly far more efficient for the imagined use case. The idea is that an integrated NPU can perform a relatively basic LLM/inference ...
Err, why not use the NPU for other ML stuff? Why are we discussing LLMs here? Running an LLM on an NPU is kinda dumb for all the reasons that you explained so well. So why even consider it?
 
Upvote
3 (4 / -1)

khumak50

Ars Tribunus Militum
1,573
Err, why not use the NPU for other ML stuff? Why are we discussing LLMs here? Running an LLM on an NPU is kinda dumb for all the reasons that you explained so well. So why even consider it?
If they really wanted to ramp up AI use for local models on desktops and laptops, what they really need to do, IMO, is develop a new memory interface connecting to both the CPU and the discrete GPU that allows for a unified memory architecture. So your 5090 would come with zero memory and would just use whatever system RAM you have. That might be 16GB for a budget system or 256GB for a high-end system. THAT would allow running local LLMs big enough to actually be good, even on lower-performance GPUs. The performance of your GPU would basically determine how fast you got your result, while the amount of memory would determine which model you could run, and those two things would be unrelated to each other. I don't really see a situation where an NPU is relevant for an AI model big enough to give good results. Can it run Copilot? Probably. Is Copilot worth using? Not from what I've noticed.

I don't really expect this to happen, though, because it would potentially cannibalize data center GPU sales if you could slap 500+GB of system memory into a desktop and run a data-center-sized LLM on a much cheaper 5090. Slower, sure, but $3-5k for a 5090 is a lot cheaper than $30-50k for a data center Blackwell GPU.
 
Upvote
2 (2 / 0)
I mean, there could also be lots of useful AI. The issue is really the software stack, how much RAM these NPUs can access, and how fast that RAM is. I would really like to see better AI in games to make NPCs more believable, or just make AI enemies... be better.

For Christ's sake, AoE IV's AI is NOT that much better than AoE 1's... just play a map with a puddle and it will build the Spanish Armada in it. If MS were serious about pushing NPUs, I bet there would be a lot they could do in their own games to make good use of them.

Instead, they decided to take screenshots of my password and credit card details. I guess that... I will avoid NPUs then? 🤷‍♂️
People seem to hate this line of reasoning, but the only upgrade I could see to justify a PS6 over the 5 is, ironically, a beefy ass NPU.

Sony is well placed to establish the "DirectX" of AI. Developers load in their standardised weights and bases, and boom!

Genuinely performant AI that sidesteps all the current AI ethical issues (no theft, not taking any jobs, no risk of a singularity) and consumer issues (it actually does something useful).
 
Upvote
-1 (2 / -3)
For the folks questioning why you'd want an NPU machine over a dedicated video card, it's all about that big pool of RAM for running much larger AI models locally. People aren't buying these things to game on, nor are they buying them for speed. Nor are they buying them for Copilot+, no matter what Microsoft pays manufacturers to put on the outside of the box. Most of the people on this site are allergic to anything AI, so this obv. doesn't appeal to them, but these NPU devices have been fantastic for running home compute without having to pay the Nvidia tax.
 
Upvote
1 (2 / -1)
If they really wanted to ramp up AI use for local models on desktops and laptops, what they really need to do, IMO, is develop a new memory interface connecting to both the CPU and the discrete GPU that allows for a unified memory architecture. So your 5090 would come with zero memory and would just use whatever system RAM you have. That might be 16GB for a budget system or 256GB for a high-end system. THAT would allow running local LLMs big enough to actually be good, even on lower-performance GPUs. The performance of your GPU would basically determine how fast you got your result, while the amount of memory would determine which model you could run, and those two things would be unrelated to each other. I don't really see a situation where an NPU is relevant for an AI model big enough to give good results.
That's literally what Apple did. And MLX will split up tasks between the GPU and NPU depending on which is more suitable, or use both. NPUs are still more efficient at those kinds of things computationally; they just generally don't have the memory bandwidth the GPU has. Fix that bandwidth problem, and the NPU will be pretty clearly better. Apple CPUs are dual-channel for the base chip, 4x for Pro, 8x for Max, and 16x for Ultra, with all cores having equal access. That's why they have the option to use the NPU for DLSS-style upscaling: the GPU just passes it a pointer, and the NPU passes one back. No need to copy over a PCIe bus.
 
Upvote
7 (8 / -1)
I mean, there could also be lots of useful AI. The issue is really the software stack, how much RAM these NPUs can access, and how fast that RAM is. I would really like to see better AI in games to make NPCs more believable, or just make AI enemies... be better.
That's not a problem with NPU RAM access or speed. It's a function of game developers either not wanting to spend the time to develop an AI model trained on the game or, more likely, not yet having had time to do it. A AAA game has about a six-year development cycle. The first public version of ChatGPT was less than 3½ years ago. You're not going to see the first AAA game designed with proper AI in mind for another three years.
 
Upvote
2 (2 / 0)

zogus

Ars Tribunus Angusticlavius
7,261
Subscriptor
I knew a guy who was experimenting with (what was then called) machine learning and games a decade ago. It independently adapted to changes in both player tactics and game settings. E.g., adjust a rifle's stats, and it would use it more or less depending on effectiveness.

The problem was that people don't actually want it. It behaved optimally, not realistically. The AI would learn to aggressively min-max its playstyle. Imagine a pro player who leaned hard on cheese and was supernaturally able to time exploits. It was hard to tune it down, too. Make it too dumb to figure out the exploit and it would also fail to understand what the player was doing and how to respond. It was computationally cheaper, and easier to tune, with conventional heuristic methods.

Models are larger these days, but I think the underlying issue remains the same: LLMs are plausible because people just see the end product. Interactive game ML/AI immerses people in how the AI makes the proverbial sausage, and the steps it takes to get to the end will break the suspension of disbelief.
This doesn’t sound like a fundamental problem with machine-learned game AI. Rather, the statement that it was performing “optimally” makes it sound like your acquaintance was simply training it against the wrong goal. After all, maximizing performance is rarely the ultimate objective for the computer player in any game; rather, it’s being just good enough to give the human player a challenge.
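As a toy illustration of training against the "right" goal, the reward can target a desired difficulty instead of pure winning (all names and numbers below are made up):

```python
# Toy reward shaping: instead of rewarding the bot purely for winning, penalize
# it for winning too often, so training converges on "challenging but beatable"
# rather than "optimal cheese". Values are illustrative only.

TARGET_WIN_RATE = 0.45  # we want the bot to win a bit less than half the time

def shaped_reward(bot_won: bool, bot_recent_win_rate: float) -> float:
    base = 1.0 if bot_won else -1.0
    # Penalty grows as the bot drifts above the target win rate against humans.
    difficulty_penalty = 4.0 * max(0.0, bot_recent_win_rate - TARGET_WIN_RATE)
    return base - difficulty_penalty

print(shaped_reward(True, 0.80))  # -0.4: winning while already dominating is discouraged
print(shaped_reward(True, 0.40))  #  1.0: winning at the target difficulty is rewarded
```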
 
Upvote
2 (2 / 0)
For the folks questioning why you'd want an NPU machine over a dedicated video card, it's all about that big pool of RAM for running much larger AI models locally. People aren't buying these things to game on, nor are they buying them for speed. Nor are they buying them for Copilot+, no matter what Microsoft pays manufacturers to put on the outside of the box. Most of the people on this site are allergic to anything AI, so this obv. doesn't appeal to them, but these NPU devices have been fantastic for running home compute without having to pay the Nvidia tax.
It's a big pool of RAM, but on any consumer PC it's a slow pool of RAM. You're at around 100 GB/s, vs 1.5 TB/s or higher on the GPU. PCs have a couple of serious tradeoffs they need to solve here - either move the GPU on package to eliminate the PCIe bottleneck, or increase memory bandwidth, potentially at the cost of expandability. That's why Apple did both, allowing for a unified memory space, much higher memory bandwidth, and no penalty for copying data from CPU to GPU. Now, it came at the expense of not being able to upgrade your GPU or RAM, but because you have that big pool, you get hardware that can run much larger models locally than you can on x86 (where the GPU's limited RAM incurs such a cost to copy across PCIe), and on something like a Mac Studio you have 8x the memory bandwidth of these announced products.
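To put rough numbers on that: during decode, a dense model has to stream essentially all of its weights once per generated token, so tokens per second is roughly bandwidth divided by model size. A back-of-the-envelope sketch (illustrative figures, not benchmarks):

```python
# Back-of-the-envelope decode throughput for a memory-bandwidth-bound LLM.
# Assumes a dense model whose weights are streamed once per token; ignores
# caches, batching, and compute limits. Numbers are illustrative.

def tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

print(tokens_per_sec(100, 40))   # ~2.5 tok/s: 40 GB model in ~100 GB/s system RAM
print(tokens_per_sec(1500, 20))  # ~75 tok/s:  20 GB model that fits in fast GPU VRAM
print(tokens_per_sec(800, 40))   # ~20 tok/s:  the 40 GB model on a ~800 GB/s unified-memory machine
```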
 
Upvote
1 (3 / -2)

MechR

Ars Praefectus
3,246
Subscriptor
I could see some NPU utility for image-gen prompt encoding, if only AMD would implement ComfyUI integration and/or native safetensors support, instead of doing their own login-walled ONNX-based thing totally separate from ROCm. Until then, it's basically useless to me, if not worse, since Windows kept a bunch of RAM-hogging background AI processes sitting around idle until I disabled them.
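For reference, the generic ONNX Runtime route to the NPU looks roughly like this (a sketch, not AMD's documented workflow; the model file is hypothetical, and the VitisAI provider is only present in AMD's Ryzen AI builds of onnxruntime):

```python
# Sketch: run an ONNX model on AMD's NPU execution provider if it's available,
# falling back to CPU otherwise. "model.onnx" is a hypothetical file.
import onnxruntime as ort

available = ort.get_available_providers()
providers = (["VitisAIExecutionProvider", "CPUExecutionProvider"]
             if "VitisAIExecutionProvider" in available
             else ["CPUExecutionProvider"])

session = ort.InferenceSession("model.onnx", providers=providers)
print("Running on:", session.get_providers()[0])
```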
 
Upvote
1 (1 / 0)

khumak50

Ars Tribunus Militum
1,573
That's literally what Apple did. And MLX will split up tasks between the GPU and NPU depending on which is more suitable, or use both. NPUs are still more efficient at those kinds of things computationally; they just generally don't have the memory bandwidth the GPU has. Fix that bandwidth problem, and the NPU will be pretty clearly better. Apple CPUs are dual-channel for the base chip, 4x for Pro, 8x for Max, and 16x for Ultra, with all cores having equal access. That's why they have the option to use the NPU for DLSS-style upscaling: the GPU just passes it a pointer, and the NPU passes one back. No need to copy over a PCIe bus.
Yeah, if they tweak things so that the NPU and GPU can work together, then I can see an NPU being a value-add. But not if it's either/or. And yeah, Apple is kind of the proof of concept: if they can do it, then obviously it can be done. That would open up options for better customization of PCs, especially desktops, where it's easier to mix and match parts. A gamer might want a 5090 and only bother with 32GB of system memory. Someone wanting to play around with AI on a shoestring budget might opt for a 5070 but spring for 128GB of unified memory, allowing them to make use of larger models even if performance was a bit lower.

Nvidia is kind of already opening things up like that on the data center side with their NVLink Fusion option. Where's the desktop version of that?
 
Upvote
2 (2 / 0)
I've yet to meet someone who actually wants a Copilot+ system. Like, I guess in theory they must exist somewhere. But I haven't seen one in person.

I have. Several people, in fact. People whose opinions I respected and valued, until they mentioned being in favor of these things. They even parrot the 'adapt or die' spiel...
Yeah.
 
Upvote
1 (1 / 0)

khumak50

Ars Tribunus Militum
1,573
I've yet to meet someone who actually wants a Copilot+ system. Like, I guess in theory they must exist somewhere. But I haven't seen one in person.
There are two problems with Copilot, IMO.

The first problem is Copilot-specific. It's just not very good. It consistently provides very little value if you try to use it. Maybe it's just a badly designed AI model. Maybe it's more of a hardware limitation - if it's designed to run on an NPU, maybe it automatically sucks because it can't run a large enough model. If I ask an identical question of Copilot and Gemini, I will probably get the wrong answer from Copilot and the right one from Gemini. If I know the answer I'm getting from Copilot is usually wrong, what's the point in using it? I just have to double-check everything myself anyway.

The second problem is not specific to Copilot; it's more an issue of context for any LLM. Even a good model is only as good as the information you give it. Microsoft actually recognized this early on with Copilot and tried to screenshot everything you do and literally catalog your every move in plain text for every script kiddie in the world to hack. Thankfully, there was a certain "resistance" to that approach. A local AI model that knows everything about me would potentially be quite useful. But it would also be an unacceptable liability.

Realistically, if I want an AI agent to help me with my finances, for instance, it would be a lot easier for it to be useful to me if it knew literally everything about my finances. The problem is, how do you allow that without a giant security hole that some Nigerian prince will drive a truck through? For a local model I could download bank/brokerage statements and feed them to it, but I'm not doing that for a cloud model.

I suspect a lot of that will eventually be handled by building LLM functions into existing applications that already have security measures built in. So whatever security you already accept for TurboTax, for instance, will apply to whatever AI model Intuit integrates into it at some future date (did they already do it? No clue).
 
Upvote
4 (4 / 0)

SraCet

Ars Legatus Legionis
17,007
If they really wanted to ramp up AI use for local models on desktops and laptops, what they really need to do, IMO, is develop a new memory interface connecting to both the CPU and the discrete GPU that allows for a unified memory architecture. So your 5090 would come with zero memory and would just use whatever system RAM you have. That might be 16GB for a budget system or 256GB for a high-end system. THAT would allow running local LLMs big enough to actually be good, even on lower-performance GPUs. The performance of your GPU would basically determine how fast you got your result, while the amount of memory would determine which model you could run, and those two things would be unrelated to each other. I don't really see a situation where an NPU is relevant for an AI model big enough to give good results. Can it run Copilot? Probably. Is Copilot worth using? Not from what I've noticed.

I don't really expect this to happen, though, because it would potentially cannibalize data center GPU sales if you could slap 500+GB of system memory into a desktop and run a data-center-sized LLM on a much cheaper 5090. Slower, sure, but $3-5k for a 5090 is a lot cheaper than $30-50k for a data center Blackwell GPU.
Once you start ramping up GPU compute power, you quickly reach the point where LLM inference is limited by memory bandwidth.

So, no, you can't just connect a 5090 to some sticks of DDR5 and call it a day.

There's a reason why the Nvidia H100s of the world have their GPU chips connected to massive amounts of HBM memory.
 
Upvote
3 (3 / 0)

SraCet

Ars Legatus Legionis
17,007
If it's similar to Strix Point, it looks roughly comparable to about 3.5x full-fat Zen 5 cores, or 4x RDNA 3.5 WGPs (so 8X CUs), or 16MB of L3 cache.

https://www.techpowerup.com/325035/amd-strix-point-silicon-pictured-and-annotated
If those annotations are correct, I'm surprised.

Looks like their NPU is quite a bit larger than Apple's, despite similar advertised performance (TOPS).

But still, it only amounts to 7% of the die area. I don't think AMD is charging anybody extra for that 7%.
 
Upvote
-1 (0 / -1)
If those annotations are correct, I'm surprised.

Looks like their NPU is quite a bit larger than Apple's, despite similar advertised performance (TOPS).

But still, it only amounts to 7% of the die area. I don't think AMD is charging anybody extra for that 7%.
Part of the cost of chiplets. They give you versatility but cost you die area.
 
Upvote
-1 (0 / -1)

Voldenuit

Ars Tribunus Angusticlavius
6,765
That's not a problem with NPU RAM access or speed. It's a function of game developers either not wanting to spend the time to develop an AI model trained on the game or, more likely, not yet having had time to do it. A AAA game has about a six-year development cycle. The first public version of ChatGPT was less than 3½ years ago. You're not going to see the first AAA game designed with proper AI in mind for another three years.
Several indie games have already experimented with AI for NPCs.

Where The Winds Meet uses LLMs for some minor NPC dialogue, and for affinity tracking and minor quest generation. Crucially, it is only used for "minor" NPCs with quite strict limits (so you can't break immersion by asking them for glue-on-pizza recipes), and they also navigated the ethics issue by keeping human voice actors for voiced dialogue.

https://allthings.how/where-winds-meet-and-ai-how-the-mmos-chatbot-npcs-work/

AAA studios tend to be a lot slower in implementing AI, and they also tend to misread the room and want to use it to fire devs and steal art without any self-awareness of the inevitable consumer blowback. Probably because decisions tend to get made at the executive level and then filtered down to the working stiffs.
 
Upvote
1 (2 / -1)