ChatGPT-style search represents a 10x cost increase for Google, Microsoft

Microsoft already has another obvious path to profitability with this, by adding capabilities to Office 365. Would companies be willing to pay a few bucks more per user per month to have access to an AI assistant that knows everything contained in your company's Sharepoint, your OneDrive, and your mailbox, and can provide you with information on demand or compose emails and letters for you with minimal input? I'm guessing that would be an easy sell for most. Forget New Bing, New Clippy could be huge.
I'm not going to lie, being able to ask e.g. "what does the /v1/frob/buzz API response look like" and not have to trawl our sprawling Confluence wiki/Jira system (or the Google/Microsoft/whatever equivalents) would be incredibly useful. However, it does sound like it would be expensive to build and maintain such a specially trained model, so who knows when that would be a viable product.
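For what it's worth, you may not need a specially trained model for this. One plausible approach is retrieval-augmented generation: embed the docs once, embed each query, and hand the closest chunks to a general-purpose LLM as context. Here's a toy sketch of the retrieval step; `embed()` is a crude bag-of-words stand-in for a real embedding model, and the doc snippets are made up for illustration.

```python
import numpy as np

# Toy retrieval step of retrieval-augmented generation (RAG):
# embed docs and query, rank docs by cosine similarity, and the
# top hits would be passed to an LLM as context.

def embed(text: str) -> np.ndarray:
    # Bag-of-words stand-in for a real embedding model.
    v = np.zeros(1024)
    for word in text.lower().split():
        v[hash(word) % 1024] += 1.0          # hash each word into a bucket
    return v / (np.linalg.norm(v) + 1e-9)    # normalize for cosine similarity

docs = [
    "The /v1/frob/buzz API response is a JSON object with id and status fields.",
    "Deploying the billing service requires approval from the ops on-call.",
]
query = "what does the /v1/frob/buzz API response look like"

sims = [float(embed(d) @ embed(query)) for d in docs]
best_doc = docs[int(np.argmax(sims))]
print(best_doc)  # the frob/buzz doc ranks first
```

The appeal of this design is that the expensive general model stays frozen; only the cheap document index has to be rebuilt when the wiki changes.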
 
Upvote
8 (8 / 0)
Yes, the whole idea that ML is free/cheap is extremely temporary. Even these $15-a-month subscriptions are not cutting it. I even expect the likes of Nvidia to put DLSS 4 or something on subscription.
I was involved in a small-scale ML project with ~20,000 users, and the Amazon bill kept climbing, at least as long as we wanted to do a good job. If Microsoft and Google get into a race for the best ML datacentres, the costs will skyrocket. Google wanted to develop their own custom silicon, though... which will be doomed to failure because they microdose so much that they can't keep any project going for more than 13 months.
 
Upvote
11 (13 / -2)

idspispopd

Ars Scholae Palatinae
972
The way they are going to make their money back is to have the bots act as a hostile agent, trained to manipulate you however the marketers want. They will learn from you, figure out your mental weaknesses, and exploit them to maximize effectiveness for the advertisers.

If they can get their bots manipulating people effectively enough, they will charge more for bot ads, allowing them to offset the higher cost of delivering them.

From an advertiser's perspective, you are ultimately paying for conversions, so there is no issue paying a higher per-impression/conversation cost if it converts at a high enough rate.
 
Upvote
-2 (5 / -7)
Unless Bard is very different from ChatGPT, running it on a consumer grade machine, even a high-end gaming PC does not sound practical.

According to Wikipedia, GPT-3's parameters take up 800 gigs.
https://en.wikipedia.org/wiki/GPT-3
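That figure is roughly consistent with a back-of-envelope check: GPT-3 has 175 billion parameters, so the raw weights alone run to hundreds of gigabytes depending on storage precision (this ignores activations, the KV cache, and framework overhead).

```python
# Raw weight storage for GPT-3's 175B parameters at common precisions.
params = 175e9

for dtype, nbytes in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{dtype}: {params * nbytes / 1e9:.0f} GB")
# float32: 700 GB
# float16: 350 GB
# int8: 175 GB
```

Either way, far beyond any consumer GPU's VRAM.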
The principled thing to do is to decentralise the whole compute calculation to users, for the betterment of society as a whole, less inequality, and resiliency against cyber attacks. NO WAY THAT THIS IS GOING TO HAPPEN ... unless the big four make custom silicon and leave Nvidia out to dry ...

Of course consumer electronics will have to change, mostly in terms of memory ... parallelism ... and networking.

Maybe I am a bit too deep in the world of ServeTheHome, but you know you have your oven, your boiler, and your rack!
 
Upvote
-7 (2 / -9)
Yes, the whole idea that ML is free/cheap is extremely temporary. Even these $15-a-month subscriptions are not cutting it. I even expect the likes of Nvidia to put DLSS 4 or something on subscription.
I was involved in a small-scale ML project with ~20,000 users, and the Amazon bill kept climbing, at least as long as we wanted to do a good job. If Microsoft and Google get into a race for the best ML datacentres, the costs will skyrocket. Google wanted to develop their own custom silicon, though... which will be doomed to failure because they microdose so much that they can't keep any project going for more than 13 months.
I wouldn't be surprised if there were a lot of low-hanging fruit to optimize in these chat bots, they were basically research papers not very long ago. My gut feeling is the models could get away with being more sparse than they are, but that work is way above my pay grade.
 
Upvote
5 (5 / 0)
The principled thing to do is to decentralise the whole compute calculation to users, for the betterment of society as a whole, less inequality, and resiliency against cyber attacks. NO WAY THAT THIS IS GOING TO HAPPEN ... unless the big four make custom silicon and leave Nvidia out to dry ...

Of course consumer electronics will have to change, mostly in terms of memory ... parallelism ... and networking.

Maybe I am a bit too deep in the world of ServeTheHome, but you know you have your oven, your boiler, and your rack!

Thankfully, there are open source projects being released to the public. I've been running Stable Diffusion for months on my desktop. And Stability AI is working on everything, so eventually they'll release their (less capable) version of ChatGPT.
 
Upvote
2 (5 / -3)
Thankfully, there are open source projects being released to the public. I've been running Stable Diffusion for months on my desktop. And Stability AI is working on everything, so eventually they'll release their (less capable) version of ChatGPT.
The thing with the large language GPTs is that the magic comes from the giant size of the models and all the fine-tuning and reinforcement learning done after the pre-training. And that's a lot of human labor involved.
 
Upvote
15 (16 / -1)

AmorImpermissus

Ars Praetorian
474
Subscriptor++
Oh it's too expensive google?

Then just open source the model, so people can run them locally. We'll save you a ton of money, you can thank us later ;-)
A lot of LLMs are already open source. The problem is that simply generating content from the model requires ridiculous amounts of GPU power. Then regularly retraining the model with up-to-date datasets goes from 'ridiculous' to 'batshit fucking insane' at the scales we're talking about here. You can't easily scale the necessary processing with off-the-shelf hardware, which is exactly the opposite of their current search technology backend.

(Edit phrasing)
 
Last edited:
Upvote
13 (13 / 0)

aapis

Ars Scholae Palatinae
1,408
Subscriptor++
This response brought to you by: Shredded Soylent. Now in Powerful Pink!
I think we all know how they're going to solve this problem.
This has been brought to you by: Shredded Soylent. Now in Powerful Pink!

Maybe instead of proceeding down this pointless, and pointlessly expensive, path we just... don't? Ignoring the scammers trying to convince us we can all generate infinite passive income by having this bot write crappy SEO-gaming write-only blogs, the only use case I've seen for this is to enable people who can't write copy to claim they've written copy. It seems like you could just learn how to write copy if that were your goal.

Wouldn't it be funny if it was less expensive to employ people than to replace them with AI?
 
Upvote
3 (5 / -2)

TetsFR

Ars Scholae Palatinae
906
Well, maybe the value added by such a chatbot justifies paying for it, like what OpenAI is asking for monthly for good service. If the ChatGPT output is vastly superior in quality vs. what you get out of Google, then why not? Probably both services are complementary for a while, but at some point Google search will be "altavista'ed".
I find it amusing to see Google searching hard for arguments (biases, cost) to tell the world deep neural nets are not a good tech yet for search-type activities, haha.
 
Upvote
-1 (1 / -2)

telenoar

Ars Centurion
273
Subscriptor
I also don't see what's the problem to tack on ads on the side of the chat page, driven by the chat's content by traditional algorithms. I don't like it, but Google's board won't be asking my opinion.

It will be an interesting experiment for them to have the AI handle ad placement. Tell advertisers "pay for clicks/impressions, but no AdWords control, you let the AI decide where it thinks it should place your ads"… and see what happens. 🙄

I finally started using ChatGPT this week, and have been underwhelmed. My main beef is the severe hallucination problem, which makes it great for fiction and propaganda, and bad for most anything else (without very careful human review). Most importantly, OpenAI (and other LLM developers) say they're training new versions of the AI to reduce hallucinations considerably. I don't want them reduced! I want a different AI model which, by design, says nothing if it doesn't have something good to say, instead of making things up at all costs. It's a fundamental flaw in this type of model.
 
Upvote
9 (9 / 0)
If they were to open source the model, we can prune it and it'd be able to run just fine, even if it's not fully featured.
On the cheap, no. I think the problem is GPU memory ... ideally you want a few GPUs with a total of 800 GB of ultra-fast RAM to train the most top-notch models. At the moment, top consumer GPUs have 24 GB and pro ones have 48 GB of GDDR6 (ECC), and you can connect them to appear as one; the pro ones cost an insane amount of money ... the consumer ones too. So the consumer electronics market has to change a lot if there is to be any chance of decentralising processing power.
Unfortunately, there is no financial incentive for most of the tech sector to do something like this. I have a model that I can run locally on a 3090 with 24 GB to find and associate similar images in large databases, but it is just nothing in comparison with ChatGPT. Also, if you train it and you have to use system memory, things slow down considerably.
 
Upvote
6 (6 / 0)
The current formulation of language models that generate one word/token at a time given a prompt needs a few more iterations to turn into a monetizable consumer product as a search engine. Google can provide both ad-supported and subscription-supported options.

I hope something useful comes out of this race for consumers.

The same thing happened when Alexa came out, but Google Assistant didn't end up being that revolutionary, and neither did Alexa or Cortana.

I feel like MS and Google need an Elon treatment to trim the middle management fat so they can iterate faster.
 
Upvote
1 (1 / 0)
This is a huge and very strange assumption to make right off the bat:

You're implying that they lost $100B of market cap purely because they executed a demo poorly - and not the much more reasonable assumption that the stock market thinks the whole thing (large language model based chat search) is a bad idea. Google has spent 20 years honing their search model, figuring out how to execute their core product well. It makes much more sense that investors want them to stick with what they're good at (even if being good at running a good business around search is independent from making the product good for the user).
This must be the Joke of the Day:
"Google has spent 20 years honing their search model"

Those days are Long Gone; Google search has degraded over the last 7 years into almost the worst available search engine.

Edit: The only thing worse is Amazon Store Search, which is a Disaster of Junk and results that have nothing to do with your search phrase.
 
Upvote
-3 (7 / -10)
The thing with the large language GPTs is that the magic comes from the giant size of the models and all the fine-tuning and reinforcement learning done after the pre-training. And that's a lot of human labor involved.

Actually, transformer-based language models like GPT are typically trained using unsupervised learning, not reinforcement learning. ChatGPT, like the original GPT models, uses unsupervised pre-training and supervised fine-tuning without the use of reinforcement learning. While pre-training large language models can require significant computational resources and expertise, it does not necessarily require a lot of human labor.
 
Upvote
-2 (0 / -2)
How much will it impact profits to use AI to reduce the amount of garbage returned in a search?
That’s what I’d like to know. I don’t want to chat with an AI. I don't want to ask an AI to answer a question because there’s an excellent chance it made up the answer.

What I’d like an AI to do is look at a page it’s indexing and go, “This is crap. I’m taking this out of the search index.”

I wanted to know if there’s a feature on Android that is similar to “Hide My Email” on the iPhone, and I got pages and pages like this:

Generating an alias email can be done in Android. You can have Android make up an email address for you.
Ad
Ad
Generating an email alias is a great way to hide your email address.
Ad
A big button looking thing that says “Next” on it, but takes you to an ad
You can generate an email alias in Android and have it forward to your email address. Did you see this ad yet?…

Use AI to clean up your awful search results!
 
Upvote
9 (9 / 0)

seelive

Ars Scholae Palatinae
638
Chat isn't a very good match for search, especially because getting it right matters. But search is something that a very large number of people do. Generating a list of random names etc is a perfect use, but it's also a niche case. Are there really enough GMs and writers to make that a business case for something as expensive as the training for these systems? I doubt it.

That was just one example. Basically, ChatGPT could do anything I would ask a moderately competent personal assistant to do, for a fraction of the price and at 100x the speed.
 
Upvote
2 (2 / 0)
Actually, transformer-based language models like GPT are typically trained using unsupervised learning, not reinforcement learning. ChatGPT, like the original GPT models, uses unsupervised pre-training and supervised fine-tuning without the use of reinforcement learning. While pre-training large language models can require significant computational resources and expertise, it does not necessarily require a lot of human labor.
The reinforcement learning is really important, that's the "magic" I'm referring to. I've played around with a number of GPT2 and GPT3 models and they are not nearly as convincing as ChatGPT or Text-Davinci-003.
 
Upvote
5 (5 / 0)

rbutler

Smack-Fu Master, in training
89
Subscriptor
Unfortunately, if they're really keen on pushing adverts, then they'll likely move to interstitials which you must look at before getting your results.

[...]
Maybe there will be two interfaces, so you can chat with gchat or perform a traditional search.
This is one instance where I could definitely see paying for a service -- I would happily spend $10/mo for a search service that gave me the correct, well formed answer. I do searches dozens of times a day and don't have time to wait for interstitials.
 
Upvote
2 (2 / 0)
If they were to open source the model, we can prune it and it'd be able to run just fine, even if it's not fully featured.

This is why AI will be so disruptive to Google's business model.
We'll have models that are very capable, that can run locally from a 200MB file.
And you would know this, how? Your post is entirely speculation. You don't know the first thing about ML, you know you don't, but talked out of your ass anyway.

Why is that?
 
Upvote
16 (17 / -1)
The estimates I've seen ignore advancements that reduce the cost of inference:

  • Using int8/int4 mixed quantization - about 18-19% of the VRAM required compared to float32 inference, and about 4-6x faster - so overall about 20x less per instance.
  • Dispatching based on total tokens - shorter contexts require far less VRAM and execute faster.
  • Using memory-efficient attention rather than quadratic attention.
  • Caching results for common queries.
  • Using optimizers (graph-optimized models can have dramatic gains in inference speed).
  • Using smaller models for easier queries (often a tuned model of 3B-11B parameters performs as well as a 168 B parameter model, but can be run on far cheaper hardware; if you dispatch to a tuned model based on the query, it is essentially a mixture of experts).
  • Pruning.
  • Teacher/student distillation.
  • Dedicated hardware.

So I'd take any of the estimates you see with an enormous grain of salt; they are quite possibly high by at least an order of magnitude.
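For the first bullet, the core idea of int8 quantization fits in a few lines. This is a minimal symmetric per-tensor sketch for illustration only; production systems (e.g. the LLM.int8() approach) use per-channel scales and outlier handling, but the memory saving comes from the same place: one byte per weight instead of four.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # Map the largest-magnitude weight to the int8 range [-127, 127].
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)  # 4 bytes -> 1 byte per weight
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)
q, scale = quantize_int8(w)

print(f"fp32 size: {w.nbytes / 1e6:.1f} MB")  # 4.2 MB
print(f"int8 size: {q.nbytes / 1e6:.1f} MB")  # 1.0 MB
print(f"max abs reconstruction error: {np.abs(w - dequantize(q, scale)).max():.4f}")
```

The reconstruction error per weight is bounded by half the scale, which is why quantization tends to hurt accuracy far less than its 4x size reduction would suggest.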
 
Upvote
4 (5 / -1)
Actually, transformer-based language models like GPT are typically trained using unsupervised learning, not reinforcement learning. ChatGPT, like the original GPT models, uses unsupervised pre-training and supervised fine-tuning without the use of reinforcement learning. While pre-training large language models can require significant computational resources and expertise, it does not necessarily require a lot of human labor.

ChatGPT was fine-tuned using RLHF (reinforcement learning from human feedback):

https://huggingface.co/blog/rlhf
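For the curious: the human-feedback step trains a reward model on human preference rankings, and the pairwise loss at its heart is tiny. The scalars below stand in for reward-model outputs; this is a sketch of the idea, not OpenAI's actual code.

```python
import numpy as np

# Pairwise preference loss used to train RLHF reward models:
# humans rank two responses, and the model is pushed to score the
# preferred ("chosen") response above the rejected one.

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    # -log sigmoid(r_chosen - r_rejected): small when chosen >> rejected.
    return float(-np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected)))))

print(preference_loss(2.0, -1.0))  # ~0.049: reward model agrees with the human
print(preference_loss(-1.0, 2.0))  # ~3.049: disagreement incurs a big penalty
```

The trained reward model then scores generated responses during a reinforcement learning loop (PPO in the InstructGPT recipe), which is where the heavy human-labeling cost mentioned above comes in.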
 
Upvote
10 (10 / 0)

jdale

Ars Legatus Legionis
18,261
Subscriptor
This must be the Joke of the Day:
"Google has spent 20 years honing their search model"

Those days are Long Gone; Google search has degraded over the last 7 years into almost the worst available search engine.

Google has spent 20 years honing their search model... to be a more effective way of delivering ads. Actual search is secondary.

Edit: The only thing worse is Amazon Store Search, which is a Disaster of Junk and results that have nothing to do with your search phrase.
That's because Amazon doesn't have search. They have a recommendation engine that is seeded by the keywords you type in the box. It may have a little magnifying glass icon but that doesn't mean anything. Their recommendation engine routinely tosses in items that don't meet your search criteria, because they estimate there is a chance you'll buy those things. Also, it's half paid advertisements.
 
Upvote
5 (5 / 0)
I read that BingChat gives you an answer and a source link.

The hit search engine DuckDuckGo already does that for certain types of searches. eg:

director fern gully
returns "Bill Kroyer" at the top above all results.

Unfortunately, if they're really keen on pushing adverts, then they'll likely move to interstitials which you must look at before getting your results.

Or are they envisioning this scenario?

Fred > Who directed FernGully?
gchat > That is a good question, which reminds me. Did you know there are bargains to be had on smart home tech? Oh, to answer your question, Bill Moyers.
Fred > Thanks.
gchat > Anytime, meat bag.

Maybe there will be two interfaces, so you can chat with gchat or perform a traditional search.

One would hope that they take an organic route that could actually be helpful:

User> What are the best competitive blue MTG decks right now?
Search> The most commonly seen blue deck in competitive play is the X deck, which focuses around the Y card. Here is a link on how to play it, and here is somewhere you can buy the cards.

or

User> What essential items should I take with me on hiking trip through the mountains?
Search> It is recommended that you take this laundry list of survival items... ...you can purchase any of the hiking and camping equipment you need at advertiser(dot)com
 
Upvote
1 (1 / 0)
If they were to open source the model, we can prune it and it'd be able to run just fine, even if it's not fully featured.

This is why AI will be so disruptive to Google's business model.
We'll have models that are very capable, that can run locally from a 200MB file.

I think useful 200 MB models are absurdly optimistic.

The smallest useful fine-tuned models are currently the Flan models and RWKV models.

You can play with the 3B parameter Flan-t5-xl model here

https://huggingface.co/google/flan-t5-xl
You can play with the RWKV 14B parameter model here,

https://huggingface.co/spaces/yahma/rwkv-14b
That is 14 GB for an 8-bit unpruned model, and it works 'ok' as chat but is well away from ChatGPT (though admittedly there hasn't been much chat alignment training yet). Even with pruning, int8/int4 mixed quantization, and teacher/student distillation, you might be able to get down to 6 GB - about 30x more than 200 MB. And that doesn't leave any room for the attention.

We will probably want as big a model as can fit into VRAM and still leave room for the attention. Home users will mostly get 16-24 GB of VRAM. Some users will likely get two 3090 or 4090 cards for 48 GB, and I'm sure many businesses will look into acquiring four-card systems (4 × 24 = 96 GB); a few will splurge on H100s with large VRAM.

It's not clear how fast inference can be run on CPUs yet. Last I looked, most LLMs were pretty dog slow and not worth it on even large CPU systems.
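As a rough illustration of those VRAM budgets (the 20% headroom reserved for attention/KV cache and activations is a guess for the sake of the example, not a measured number):

```python
# Largest int8 model (~1 byte per parameter, so ~1 GB per billion
# params) that fits in each VRAM budget, reserving ~20% headroom
# for attention/KV cache and activations.
for label, vram_gb in [("single 24 GB card", 24),
                       ("2x 3090/4090", 48),
                       ("4x 24 GB cards", 96)]:
    usable_gb = vram_gb * 0.8
    print(f"{label}: ~{usable_gb:.0f}B params max at int8")
# single 24 GB card: ~19B params max at int8
# 2x 3090/4090: ~38B params max at int8
# 4x 24 GB cards: ~77B params max at int8
```

So even the 96 GB four-card setup stays well short of a GPT-3-class 175B model at int8, which is the gap pruning and distillation would have to close.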
 
Last edited:
Upvote
8 (8 / 0)
And you would know this, how? Your post is entirely speculation. You don't know the first thing about ML, you know you don't, but talked out of your ass anyway.

Why is that?

You're right in saying I'm not an expert in ML. But I know enough to know that I'm not talking "out of my ass".
And to be clear, I'm not saying that we'd be running Google's model from a 200 MB file if they released it to the public. (which of course, they wouldn't)
I'm saying that it's possible to prune models to very small file sizes, even if they were trained on billions of parameters.

For example, I've been using models that were trained with <900 million parameters, but the model itself is only 2 gigabytes in size and can be used on old Nvidia cards with 4 GB of VRAM.
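The kind of pruning being described can be sketched in a few lines. This is one-shot magnitude pruning for illustration; real pipelines prune gradually during fine-tuning to preserve accuracy, but the size math is the same.

```python
import numpy as np

# One-shot magnitude pruning: zero out the smallest-magnitude
# weights, then store only the survivors.

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    threshold = np.quantile(np.abs(w), sparsity)
    pruned = w.copy()
    pruned[np.abs(pruned) < threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512)).astype(np.float32)
pruned = magnitude_prune(w, sparsity=0.9)  # drop 90% of weights

kept = np.count_nonzero(pruned)
print(f"kept {kept / w.size:.0%} of weights")  # ~10%
# Stored sparsely (value + index per survivor), the file shrinks
# roughly in proportion to the surviving weight count.
```

Whether a 90%-pruned large model still performs acceptably is the open question; the file-size reduction itself is straightforward.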
 
Upvote
-7 (1 / -8)
This will definitely be an interesting problem. It's pretty much the voice assistant monetization problem v2.0, which nobody has been able to solve yet - at least not in a way that didn't eat into Google's profits. Maybe in the end ChatGPT, Bard, and other AI engines just end up being niche projects that compete more with Wikipedia than with traditional search engines like Google and Bing. Then they all just kind of co-exist, the same way voice assistants co-exist with search engines, even though Google, Amazon, Apple, and the rest would love to find a way to monetize them.
 
Upvote
2 (2 / 0)

bugsbony

Ars Scholae Palatinae
1,018
I think the safer assumption is that LLM's may trigger a shift in SEO. There is an inherent arms race between search tools and tools that clog results with bullshit. There are constant financial incentives for each side to improve their techniques. New tech doesn't make that go away. It just further incentivizes the other side to up their game.
Yes, and that will make these people use GPT-like AIs to create whole websites that look legitimate and praise their product. Imagine a site like Ars, complete with stories and user comments, continuously updated, all AI generated, just to promote something. There is definitely a possible future where the internet is 99% made up by AIs. This could be wild.
 
Upvote
4 (4 / 0)
I think useful 200 MB models are absurdly optimistic.

The thing is, you can probably find "computers" with 200MB of RAM in a toaster these days.
Do I know why I'd want to have an AI model in my toaster right now? No.
But would I, if I could? You better believe it! (obviously, I wouldn't expect ChatGPT performance out of those)

There are a lot of situations where it would make much more sense to run a model locally rather than in "the cloud". Take Google's real-time transcription for videos, for example: it runs locally on Android and does a pretty damn good job at subtitling videos on your phone.

As time goes on, we'll be able to make models that are both larger (as in, how many parameters it was trained on) and smaller (the file sizes get smaller as we optimize and discard any data that we don't need) at the same time.

I got this from an interview with emad (from stability AI):

So what we did is, you know, the collaboration with various entities, which we paid an important part in Stable Diffusion to, is kind of led by us. We took a hundred thousand gigabytes of image-label pairs, two billion images, and created a 1.6 gigabyte file that can run offline on your MacBook, and created a 2.6 gigabyte file which you can transmit over the phone network.

If you want to watch the whole interview, here it is:


https://youtu.be/jgTv2W0mUP0
 
Upvote
-2 (0 / -2)

lucubratory

Ars Scholae Palatinae
1,430
Subscriptor++
Those costs are steep, but I have to imagine that if they get optimised as well as search has been, they'll be more comparable. Google's search algorithm has been optimised for 20 years; LLMs have only been publicly available for a couple of years, and only direct to consumer this year. There will be savings and engineering solutions, not to mention new research that can accomplish more with less resource expenditure.


Beyond that... Google is very ripe to have their marketshare eaten because their product sucks and everyone knows it. Unless you're searching as basically a reminder for something very well known, search does not work well. I'm not an expert in the field, just a consumer, but as a consumer it certainly seems like SEO agencies are winning, and search is significantly less helpful than it was 10 years ago. The saving grace for Google search is those little information/answer cards at the top which can often surface the most relevant results, but while they are better than the alternative they often misunderstand context, they're occasionally wrong, and have poor grammar and punctuation caused by trying to cut and paste text chunks together with a dumb algorithm. LLMs can solve some of those problems now, and if they continue to improve may be able to solve them all. Regardless of the extra cost, that's gotta be very threatening to Google. A better product is just going to be used more.
 
Upvote
2 (2 / 0)
Yes, and that will make these people use gpt-like AIs to create whole website that look legitimate and praise their product. Imagine a site like ars, complete with stories and user comments, continuously updated, all AI generated, just to promote something. There is definitely a possible future where the internet is 99% made-up by AIs. This could be wild.

I think that most of the content we consume will indeed be AI generated. But keep in mind that there won't be just a few AIs out there, you'll have your own "champion" AIs to keep any bad actor AIs in check.

Is someone calling you with a voice generated AI? Your own AI would probably be able to notify you that there's something off with the call, or that the number (or user name, or whatever) doesn't check out.

It's going to be an arms race for sure, but I'm not too concerned personally.
 
Upvote
2 (3 / -1)
Yes, the whole idea that ML is free/cheap is extremely temporary. Even these $15-a-month subscriptions are not cutting it. I even expect the likes of Nvidia to put DLSS 4 or something on subscription.
I was involved in a small-scale ML project with ~20,000 users, and the Amazon bill kept climbing, at least as long as we wanted to do a good job. If Microsoft and Google get into a race for the best ML datacentres, the costs will skyrocket. Google wanted to develop their own custom silicon, though... which will be doomed to failure because they microdose so much that they can't keep any project going for more than 13 months.

I lack the expertise to compare it to the options from Nvidia, AMD, and Intel; but the custom silicon you say they are incapable of developing is available for rent and has been since something like 2018.

I'd certainly be interested to know how much that project pays its own way vs. being primarily a tool to make getting decent prices out of Nvidia easier; but it's absolutely a real product that exists.
 
Upvote
2 (2 / 0)