Google’s new Gemma 4 open AI model is sized for your laptop

AI_Skeptic · Wednesday at 3:36 PM

neodorian said:
Nah, I'll pass. My laptop is from 2014 and 16GB is all the RAM in it.

And despite the headline specifying "laptop", I still don't wanna dedicate half of my desktop's RAM to it either. It's not as if I can just go out and buy two more sticks to fill the empty slots without remortgaging the house...

You'd need a somewhat modern graphics card as well, and a 2014 laptop won't cut it (probably).

WonderSteve · Wednesday at 3:36 PM

Is it possible to configure these local models to utilize the "NPU" on AMD or Intel CPUs?

AI_Skeptic · Wednesday at 3:37 PM

Fred Duck said:
Pretty average consumer laptops have 16GB RAM? I can't imagine what the average Ars reader must have. o_o

I would have tried it for this comment but I haven't 18GB space free because I'm recording tutorials! NOT because I'm THAT bad at managing drive space! Really! It's true!

Until recently, Apple was selling laptops that had a baseline of 16GB of RAM. I picked up an M5 MacBook Pro for sale that had 24GB of RAM.

CatNamedHugs · Wednesday at 3:38 PM

I'll download this just in case my space heater stops working over the winter and I need my laptop to replace it.

Lexus Lunar Lorry · Wednesday at 3:38 PM

Google says Gemma 4 12B is unique in that it can run on many consumer laptops without sacrificing quality. As long as you’ve got a computer with 16GB of system RAM or VRAM, the 12-billion-parameter model will work.

I wonder how the OpenAI and Anthropic IPOs will go once Wall Street learns about the onward march of local/open-weight LLMs?
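The 16GB figure quoted above can be sanity-checked with rough arithmetic. This is a minimal sketch that counts only the memory for the weights; the bit widths listed are illustrative assumptions, not formats Google is confirmed to ship, and real usage also needs room for the KV cache, activations, and runtime overhead.

```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Ignores KV cache, activations, and runtime overhead, so actual
    usage will be higher than the number returned here.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical precision levels, chosen for illustration only.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"12B at {label}: {weight_memory_gib(12, bits):.1f} GiB")
```

Under these assumptions, a 12-billion-parameter model only fits comfortably in 16GB if the weights are stored at roughly 8 bits or fewer each.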

UserIDAlreadyInUse · Wednesday at 3:39 PM

So, just what does Google get for releasing these models that run locally? The effort, the development time, the maintenance for an LLM that runs on someone's system, sends no data to Google, no fingerprinting, no means to target advertisements even closer, generates no revenue?

Ahabba · Wednesday at 3:46 PM

I just ran it on my MacBook Pro with 48GB of RAM. I gave it a random, challenging macro image of fairly low quality. The 12B model didn't really do well, saying it was a photo of bread crumbs or fried food. But the 26B model I also have in LM Studio actually picked up on this being 2 ladybug larvae. Pretty impressive, and useful for an upcoming image sorting task I have.

Edit: updated image to jpg

Ronin_48 · Wednesday at 3:46 PM

UserIDAlreadyInUse said:
So, just what does Google get for releasing these models that run locally? The effort, the development time, the maintenance for an LLM that runs on someone's system, sends no data to Google, no fingerprinting, no means to target advertisements even closer, generates no revenue?

Same as Microsoft never seriously going after Windows pirates - you want to be the default choice for the technology and to make money from companies that hire people already familiar with your tools. And of course you lock the really nice features behind the enterprise subscriptions.

Fatesrider · Wednesday at 3:47 PM

Okay, I'm putting on my stupid face and asking, why specifically "laptop"?

Seems to me it SHOULD run on a desktop, too, since most desktops have better hardware than most laptops, yes? What am I missing here?

Martin Blank · Wednesday at 3:47 PM

AI_Skeptic said:
I mean, that's a great question. Google is a huge company, and I think they are making the same mistake as OpenAI made. They are releasing their models under the Apache 2.0 license because they have a bunch of academics working for them, and they are used to sharing information. Google is so large that I don't think the right hand knows what the left hand is doing.

I don't know about that. They see AI becoming a commodity, running whatever model anywhere, and they're staying pretty neutral about what runs on their stuff. People build on them, they make money. Maybe they make a little more on their own stuff (e.g., Gemini), but possibly they make more long-term not trying to corner the market.

wildsman · Wednesday at 3:47 PM

UserIDAlreadyInUse said:
So, just what does Google get for releasing these models that run locally? The effort, the development time, the maintenance for an LLM that runs on someone's system, sends no data to Google, no fingerprinting, no means to target advertisements even closer, generates no revenue?

Developer mindshare - it doesn't want devs to use Qwen or Llama to run models locally - it wants its own models to dominate in that space as well. It also helps them recruit researchers and attract talent.

I'm sure there are a bunch of other reasons as well but these are the ones I could think of off the top of my head...

quamquam quid loquor · Wednesday at 3:48 PM

This is amazing. Gemma 4 26B has been my OCR workhorse, so I'm excited to put 12B through its paces and see how it performs. It's not on OpenRouter yet, so I'll have to wait a day or two to see pricing comps.

nobirth · Wednesday at 3:50 PM

If you have an M-series MacBook Pro, this will run well. I’m pretty sure an old 14” M1 goes for about $500 used. If this model could fill the gap between the E4B model and the larger 26B MOE, that would be awesome (E4B was interesting for me, but wasn’t that useful, whereas 26B handles 95% of my AI use right now).

Also, a lot of these new releases are about building smaller models that capably support agents, so I assume that’s part of the gap this is filling.

Re: why release it, Google wants to own every step of your AI infrastructure, from big paid licenses all the way to small local integrations. This gets to reliability with tools, consistency of results, etc.

quamquam quid loquor · Wednesday at 3:51 PM

Ronin_48 said:
Same as Microsoft never seriously going after Windows pirates - you want to be the default choice for the technology and to make money from companies that hire people already familiar with your tools. And of course you lock the really nice features behind the enterprise subscriptions.

Microsoft also doesn't go after cross-region sales, which is really nice. You can buy legitimate OEM Windows license stickers off Alibaba for $10 each.

UserIDAlreadyInUse · Wednesday at 3:51 PM

Fatesrider said:
Okay, I'm putting on my stupid face and asking, why specifically "laptop"?

Seems to me it SHOULD run on a desktop, too, since most desktops have better hardware than most laptops, yes? What am I missing here?

My guess would be to get the association with "mobility" firmly planted in people's heads for these models. Today it's the laptop, tomorrow it's the smartwatch, Android phone, and other wearables.

quamquam quid loquor · Wednesday at 3:54 PM

UserIDAlreadyInUse said:
My guess would be to get the association with "mobility" firmly planted in people's heads for these models. Today it's the laptop, tomorrow it's the smartwatch, Android phone, and other wearables.

For the other models you need an array of used 3090s or unified memory. This comes out to $1,000-$2,000 of additional expense you would otherwise not incur. "Free" models aren't free to most consumers due to additional hardware requirements, but this one is.

AI_Skeptic · Wednesday at 3:54 PM

Martin Blank said:
I don't know about that. They see AI becoming a commodity, running whatever model anywhere, and they're staying pretty neutral about what runs on their stuff. People build on them, they make money. Maybe they make a little more on their own stuff (e.g., Gemini), but possibly they make more long-term not trying to corner the market.

But as I understand it, these are local models, not remote models. How would Google make money off the local models?

Resistance · Wednesday at 3:54 PM

This is almost exactly what I've been looking for. I have exactly 16 GB of VRAM and have been choosing between models that fit in my VRAM and run fast but are poor quality, and models that only get partial GPU offload and run slow but are higher quality.

I'm curious if better performance or quality could be had if the model was trained with similar constraints but removing the vision and tool use aspects. I don't need either of those for my applications and I wonder if there is waste for

Abulia · Wednesday at 3:55 PM

Man, could you slap this in a Raspberry Pi and be talking with JARVIS? Gotta finish that 3d printed Iron Man cosplay…

quamquam quid loquor · Wednesday at 3:56 PM

AI_Skeptic said:
But as I understand it, these are local models, not remote models. How would Google make money off the local models?

Probably pretty convoluted, but in theory increasing use of OpenRouter-style systems would increase demand for their TPUs, which leads to increased revenue from hardware sales.

Anyway, free software is something to celebrate.

Resistance · Wednesday at 3:57 PM

neodorian said:
Nah, I'll pass. My laptop is from 2014 and 16GB is all the RAM in it.

And despite the headline specifying "laptop", I still don't wanna dedicate half of my desktop's RAM to it either. It's not as if I can just go out and buy two more sticks to fill the empty slots without remortgaging the house...

Loading and unloading models is pretty quick; unless you're using it constantly, you'll be fine. Certainly better than burning money on cloud LLMs.

quamquam quid loquor · Wednesday at 3:59 PM

UserIDAlreadyInUse said:
My guess would be to get the association with "mobility" firmly planted in people's heads for these models. Today it's the laptop, tomorrow it's the smartwatch, Android phone, and other wearables.

Amazon got screwed on ridiculous cloud computing expense for Alexa devices just a few years ago.

When every Amazon Echo has enough computing power to run a near-frontier class model like Gemma, the cost is decentralized and distributed to users. It unlocks the economics of the smart assistant. This will be adopted by all the "smart" speakers in the next 4 years.

HamHands_ · Wednesday at 3:59 PM

UserIDAlreadyInUse said:
So, just what does Google get for releasing these models that run locally? The effort, the development time, the maintenance for an LLM that runs on someone's system, sends no data to Google, no fingerprinting, no means to target advertisements even closer, generates no revenue?

Same thing that investing in Chrome bought them. They want to control the space so that a decade later they can extract money when people are too deeply invested to leave.

With Chrome and the web they built a browser that unseated IE's monopoly and now it's the #1 browser in the world. It's the lens by which most people interact with the web. Now they are abusing that dominance to support their actual business which is selling user data, ostensibly for targeted advertisements but realistically for any willing buyer.

ETA: Tbc, we're in the honeymoon phase so I'm not saying there's any risk to using this model locally.

Martin Blank · Wednesday at 4:01 PM

AI_Skeptic said:
But as I understand it, these are local models, not remote models. How would Google make money off the local models?

There are plenty of people who are running models essentially locally on virtualized platforms. This keeps the processing local for sensitive workloads where even contractual promises that an AI company won't use it for training aren't enough. At my company, especially with usage-based pricing becoming so popular among AI companies, they're looking at it as a way to control costs for certain RAG and MCP tasks, and potentially to limit costs around agent use.

ranthog · Wednesday at 4:05 PM

HamHands_ said:
Same thing that investing in Chrome bought them. They want to control the space so that a decade later they can extract money when people are too deeply invested to leave.

With Chrome and the web they built a browser that unseated IE's monopoly and now it's the #1 browser in the world. It's the lens by which most people interact with the web. Now they are abusing that dominance to support their actual business which is selling user data, ostensibly for targeted advertisements but realistically for any willing buyer.

ETA: Tbc, we're in the honeymoon phase so I'm not saying there's any risk to using this model locally.

It was Firefox that unseated the IE monopoly. Chrome had nothing to do with that.

Without Firefox there may have never been a Chrome. Google used their monopoly power to gain what is dangerously close to being a monopoly again.

quamquam quid loquor · Wednesday at 4:07 PM

ranthog said:
It was Firefox that unseated the IE monopoly. Chrome had nothing to do with that.

Without Firefox there may have never been a Chrome. Google used their monopoly power to gain what is dangerously close to being a monopoly again.

Edge is chromium, so Chromium effectively has a monopoly again.

sporkinum · Wednesday at 4:07 PM

AI_Skeptic said:
You'd need a somewhat modern graphics card as well, and a 2014 laptop won't cut it (probably).

I didn't see anything in the article that stated that.

jonbob_newcastle · Wednesday at 4:12 PM

quamquam quid loquor said:
Microsoft also doesn't go after cross-region sales, which is really nice. You can buy legitimate OEM Windows license stickers off Alibaba for $10 each.

Do they still do that thing where they throw in a crappy mouse because you can’t sell an OEM license unless it comes with hardware?

mikewhy · Wednesday at 4:13 PM

nobirth said:
If this model could fill the gap between the E4B model and the larger 26B MOE, that would be awesome (E4B was interesting for me, but wasn’t that useful, whereas 26B handles 95% of my AI use right now).

The Benchmarks here indeed put it between E4B and 26B-A4B: https://huggingface.co/google/gemma-4-12B#benchmark-results

quamquam quid loquor · Wednesday at 4:24 PM

jonbob_newcastle said:
Do they still do that thing where they throw in a crappy mouse because you can’t sell an OEM license unless it comes with hardware?

Nope I get sheets of physical stickers.

neodorian · Wednesday at 4:28 PM

Resistance said:
Loading and unloading models is pretty quick; unless you're using it constantly, you'll be fine. Certainly better than burning money on cloud LLMs.

Good thing I don't do that either!

Resistance · Wednesday at 4:33 PM

quamquam quid loquor said:
For the other models you need an array of used 3090s or unified memory. This comes out to $1,000-$2,000 of additional expense you would otherwise not incur. "Free" models aren't free to most consumers due to additional hardware requirements, but this one is.

Which other models are you referring to specifically?

Resistance · Wednesday at 4:37 PM

neodorian said:
Good thing I don't do that either!

If you don't use LLMs why are you complaining about the inability to use this LLM on your hardware?

Hilarius · Wednesday at 4:40 PM

quamquam quid loquor said:
Edge is chromium, so Chromium effectively has a monopoly again.

I sometimes wonder what life is like in the universe where Microsoft chose Gecko instead of Blink. Because this universe has been on a pretty bad trajectory since Edgium was released in 2019.
