AMD will bring its “Ryzen AI” processors to standard desktop PCs for the first time


khumak50

Ars Tribunus Militum
1,573
I'm not sold on the use case for an NPU in a desktop. For a mobile device without a discrete GPU, I can see it being useful for running AI functions that don't require much performance, but on the desktop side, only the very bottom tier of machines is likely to lack a discrete GPU for most consumers.

Last I heard, NPUs are FAR weaker at AI tasks than pretty much any GPU. So the only use case I could see would be running a local AI model that requires more memory than the VRAM on your GPU. But for a local model with that sort of demand, will an NPU have enough performance to even be relevant? If I have 64GB of system RAM, can my NPU run the 64GB local model that my 5090 can't fit in VRAM? I tend to doubt it. (I don't have a 5090, just making my point.)
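
For what it's worth, the fits-or-doesn't math is easy to sketch. A rough back-of-envelope in Python (the model sizes and quantization levels are just illustrative assumptions, not benchmarks):

```python
# Rough weight-memory footprint: params * bits-per-weight / 8.
# Ignores KV cache and activations, which add more on top.

def weights_gb(params_billions: float, bits_per_weight: int) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

for params in (7, 70):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{weights_gb(params, bits):.1f} GB")

# 7B fits almost anywhere (~3.5 GB at 4-bit), but 70B needs ~140 GB at
# fp16 and still ~35 GB at 4-bit, so the interesting models blow past a
# 24-32 GB GPU while fitting comfortably in 64+ GB of system RAM.
```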

So from my perspective, either an NPU can run a bigger model than a GPU when you have enough system RAM, in which case it's potentially relevant, or it can't, in which case it's a waste of silicon that I would rather not pay for.

Are they planning to continue selling desktop CPUs without an NPU? If so, then OK, non-issue. If not, then we're looking at AMD raising prices for wasted silicon.

khumak50

Ars Tribunus Militum
1,573
The NPU is weaker, but supposedly far more efficient for the imagined use case. The idea is that an integrated NPU is able to perform relatively basic LLM/inference-related tasks more efficiently than the usual "APU" design of CPU + iGPU. On paper, sure, this checks out. In reality... I think there is little utility for the average consumer or business in the imagined scenario of accelerating compute for a small local model efficiently.

Why is it useless? Well, the tiny local models that could make sense would have to be highly tailored for any real utility at a size that can fit (like an open 7B model or less). A story in The Economist last month covered this and explained (with survey data) that only a small number of workers use "AI" daily, and that the vast majority of these users only use LLMs via cloud providers and as a glorified search/reference system. The tiny local models which fit within 4-16 GB are terrible for this particular task - few are web-search/agent enabled, plus small models lack context and hallucinate more.

So the stated goal of power efficiency for a local model is all but pointless when the average model that might actually provide utility is larger than the device can realistically run. Generally speaking, I think you are correct in that NPUs are wasted silicon. This functionality would make more sense baked into a power-hungry iGPU on all of these processors - even if that scenario means less efficiency for the rare user that actually needs the NPU functionality. At least the silicon would be more likely to be utilized in that scenario.

Yeah, I get the feeling that this is almost entirely hype-based. AI is all the rage right now, so this lets AMD do a press release that basically says "see, we're selling more AI-related thingies." Yay. It's fine if you want to make money on AI, but if that's your goal, you probably need to convince more hyperscalers to fork over the big money for your data center GPUs. So far they're making some progress on that front, but a lot less than I think most of their investors were expecting.

I also think their AI and GPU focus has led AMD to take their eye off the ball a little on CPUs and let Intel back into the game. Intel still exists. They're not going to just roll over and die. They seem to actually be producing some good CPUs now. I think for the past year or so, AMD has been focused almost exclusively on competing with Nvidia on the AI front, which is mostly GPU work, rather than on Intel. I also think that, just like Nvidia, they have mostly ignored the consumer segment. It's an afterthought after they're done with whatever they're doing for AI.

As an investor, I guess I'm fine with that. I've made an absurd amount of money on Nvidia. As a consumer, I think we're in for a year or two of dark times without really any compelling upgrade options.
 

khumak50

Ars Tribunus Militum
1,573
Err, why not use the NPU for other ML stuff? Why are we discussing LLMs here? Running an LLM on an NPU is kinda dumb for all the reasons that you explained so well. So why even consider it?

If they really wanted to ramp up AI use for local models on desktops and laptops, what they really need to do, IMO, is develop a new memory interface connecting to both the CPU and discrete GPU that allows for a unified memory architecture. So your 5090 would come with zero memory and would just use whatever system RAM you have. That might be 16GB for a budget system or 256GB for a high-end system. THAT would allow running local models big enough to actually be good, even on lower-performance GPUs. The performance of your GPU would basically determine how fast you got your result, while the amount of memory would determine which model you could run, and those two things would be independent of each other (see the sketch below). I don't really see a situation where an NPU is relevant for an AI model big enough to give good results. Can it run Copilot? Probably. Is Copilot worth using? Not from what I've noticed.
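
To make that split concrete: token generation on big models is mostly memory-bandwidth-bound (each generated token reads the full set of weights once), so a crude model of the tradeoff looks like the sketch below. The function names, capacities, and bandwidth figures are all made-up round numbers for illustration, not specs.

```python
# Crude model of the split: capacity picks the model, bandwidth picks
# the speed. Assumes 4-bit weights and bandwidth-bound decoding (one
# full pass over the weights per generated token).

def max_params_billions(memory_gb: float, bits: int = 4) -> float:
    """Largest model (billions of params) that fits the given memory."""
    return memory_gb * 8 / bits

def tokens_per_sec(model_gb: float, bandwidth_gb_s: float) -> float:
    """Approximate decode speed for a model that fills the memory."""
    return bandwidth_gb_s / model_gb

for mem_gb, bw in ((32, 1000), (128, 1000), (256, 500)):
    print(f"{mem_gb} GB @ {bw} GB/s: up to ~{max_params_billions(mem_gb):.0f}B "
          f"params, ~{tokens_per_sec(mem_gb, bw):.0f} tok/s")
```

Capacity and speed move independently in that model, which is exactly the decoupling a unified memory interface would buy you.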

I don't really expect this to happen, though, because it would potentially cannibalize data center GPU sales if you could slap 500+GB of system memory into a desktop and run a data-center-sized LLM on a much cheaper 5090. Slower, sure, but $3-5k for a 5090 is a lot cheaper than $30-50k for a data center Blackwell GPU.
 

khumak50

Ars Tribunus Militum
1,573
That's literally what Apple did. And MLX will split up tasks between GPU and NPU depending on which is more suitable, or use both. NPUs are still more efficient at those kinds of things computationally; they just generally don't have the memory bandwidth that the GPU has. Fix that bandwidth problem, and the NPU will be pretty clearly better. Apple CPUs are dual channel for the base chip, 4x for Pro, 8x for Max, and 16x for Ultra, with all cores having equal access. That's why they have the option to use the NPU for DLSS-style upscaling: the GPU just passes it a pointer, and the NPU just passes one back. No need to copy over a PCIe bus.

Yeah, if they tweak things so that the NPU and GPU can work together, then I can see an NPU being a value add. But not if it's either/or. And yeah, Apple is kind of the proof of concept: if they can do it, then obviously it can be done. That would open up options for better customization of PCs, especially desktops, where it's easier to mix and match parts. A gamer might want a 5090 and only bother with 32GB of system memory. Someone wanting to play around with AI on a shoestring budget might opt for a 5070 but spring for 128GB of unified memory, allowing them to make use of larger models even if performance was a bit lower.
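
To put a toy number on the "just pass a pointer" point: the bandwidth figure below is a ballpark assumption for a PCIe 5.0 x16 link, and the tensor size is invented for the example.

```python
# Toy comparison: handing a 2 GB activation tensor from GPU to NPU.
# Over PCIe you pay a copy each way; in a unified-memory design the
# consumer just reads the producer's buffer in place.

PCIE_GB_S = 64          # ~PCIe 5.0 x16, one direction (assumed ballpark)
tensor_gb = 2.0         # hypothetical intermediate tensor

one_way_ms = tensor_gb / PCIE_GB_S * 1000
print(f"PCIe round trip: ~{2 * one_way_ms:.0f} ms spent just copying")
print("Unified memory: effectively 0 ms, pointer handoff only")
```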

Nvidia is kind of already opening things up like that on the data center side with their NVLink Fusion option. Where's the desktop version of that?
 

khumak50

Ars Tribunus Militum
1,573
I've yet to meet someone who actually wants a Copilot+ system. Like, I guess in theory they must exist somewhere. But I haven't seen one in person.

There are two problems with Copilot, IMO.

The first problem is Copilot-specific: it's just not very good. It consistently provides very little value if you try to use it. Maybe it's just a badly designed AI model. Maybe it's more of a hardware limitation; if it's designed to run on an NPU, maybe it automatically sucks because it can't run a large enough model. If I ask an identical question to Copilot and Gemini, I will probably get the wrong answer from Copilot and the right one from Gemini. If I know the answer I'm getting from Copilot is usually wrong, what's the point in using it? I just have to double-check everything myself anyway.

The second problem is not specific to Copilot; it's more an issue of context for any LLM. Even a good model is only as good as the information you give it. Microsoft actually recognized this early on with Copilot and tried to screenshot everything you do, literally cataloging your every move in plain text for every script kiddie in the world to hack. Thankfully, there was a certain "resistance" to that approach. A local AI model that knows everything about me would potentially be quite useful. But it would also be an unacceptable liability.

Realistically, if I want an AI agent to help me with my finances, for instance, it would be a lot easier for it to be useful if it knew literally everything about my finances. The problem is how you allow that without a giant security hole that some Nigerian prince will drive a truck through. For a local model I could download bank/brokerage statements and feed them to it, but I'm not doing that for a cloud model.
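
The local path is already pretty mundane plumbing today. A minimal sketch, assuming an Ollama-style server is already running on localhost (the endpoint and port are Ollama's defaults; the model name and statement file are hypothetical):

```python
# "Local model, local data": the statement file never leaves the machine,
# since the only network call is to a model server on localhost.
import json
import urllib.request

def ask_local_model(prompt: str) -> str:
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",   # Ollama's default endpoint
        data=json.dumps({
            "model": "llama3:8b",                # hypothetical local model
            "prompt": prompt,
            "stream": False,
        }).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# The sensitive document is read from disk and sent only to localhost.
statement = open("statement_2024.csv").read()    # hypothetical download
print(ask_local_model(f"Summarize my spending:\n{statement}"))
```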

I suspect a lot of that will eventually be handled by building LLM functions into existing applications that already have security measures built in. So whatever security you already accept for TurboTax, for instance, will apply to whatever AI model Intuit integrates into it at some future date (did they already do it? No clue).
 