Google lobs lawsuit at search result scraping firm SerpApi

A chatbot can’t summarize web links if it can’t find them, which has led companies like Perplexity to pay for SerpApi’s second-hand Google data.

I guess I can see the business case. It is just that Google Search has been so enshittified in recent years that it is hard to perceive it as something worth "stealing".
 
Upvote
144 (147 / -3)

alphaj

Smack-Fu Master, in training
98
In Google’s blog post on the legal action, it says SerpApi “violates the choices of websites and rightsholders about who should have access to their content.”
Unlike Google, which would NEVER violate the choices of websites and rightsholders by, for example, reproducing website content in bullshit AI summaries no one wants...
 
Upvote
227 (231 / -4)

mateo9

Smack-Fu Master, in training
65
The difference between "now and then" is likely that previously SerpApi was used primarily by academics and researchers, and now it is used by every wannabe AI startup. I've used it for proofs of concept and it does work well, but I've always assumed eventually Google would sue it out of existence.

I'd like to think there's a market out there for a third party like Kagi to offer a decent alternative paid search index. (They have closed beta API access.) Because the huge players either hoard their own or just strike a deal that nobody else can afford.
 
Upvote
44 (45 / -1)

10Nov1775

Ars Scholae Palatinae
906
I guess I can see the business case. It is just that Google Search has been so enshittified in recent years that it is hard to perceive it as something worth "stealing".
So bad for so many years that I stopped using it altogether. Haven't touched it in years. The last straw was it ignoring -"XYZ" style commands, as if it knew better than I did what I actually wanted to search for—at that point, it was a shitty suggestion engine, not a search engine.
 
Upvote
105 (105 / 0)
it says SerpApi “violates the choices of websites and rightsholders about who should have access to their content.”
Even if this wasn't obvious window dressing for their real intentions, I don't see how this would make a difference in court. I don't think you can sue someone for doing something to someone else, especially when that something (disobeying robots.txt) holds no legal weight.

It seems to me putting your content on the open web is making a choice about who can access it: everyone.
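
To make the robots.txt point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleScraper` and `SomeOtherBot` user agents, and the `is_allowed` helper are all hypothetical. The check only reports what the file asks of a crawler; a scraper that never runs the check is not stopped by anything in the protocol itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary site; not taken from any real domain.
ROBOTS_TXT = """\
User-agent: ExampleScraper
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return what robots.txt asks of this crawler. Nothing enforces the answer."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler runs this check before fetching; an impolite one skips it.
print(is_allowed("ExampleScraper", "https://example.com/article"))     # False
print(is_allowed("SomeOtherBot", "https://example.com/article"))       # True
print(is_allowed("SomeOtherBot", "https://example.com/private/page"))  # False
```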
 
Upvote
44 (44 / 0)

SportivoA

Ars Tribunus Militum
1,711
So bad for so many years that I stopped using it altogether. Haven't touched it in years. The last straw was it ignoring -"XYZ" style commands, as if it knew better than I did what I actually wanted to search for—at that point, it was a shitty suggestion engine, not a search engine.
Well exactly! Search gives you what you want. Suggestions give the clients control over what you get (ads!).
 
Upvote
25 (25 / 0)
EXTRA! EXTRA!! Large Sinister Cauldron calls small pot BLACK!! You'll read about it here first folks, unless you've already seen the content-scraped search results!! Give a dime, feed a starving Newsie! EXTRA! EXTRA!!
Isn't it more fun to read an Ars article plagiarized into AI slop and split into 10 slides?
 
Upvote
23 (23 / 0)

dooferlad

Seniorius Lurkius
9
Subscriptor
Google does provide a search API: see https://developers.google.com/custom-search/v1/overview. The difference is that SerpApi uses various remote browsers for the query, so you can explore how results change with geography. I don't know whether executing searches via the API from different locations makes a difference, so it could come down entirely to scraping results instead of paying for the API.
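
For reference, a request to that API is just a URL with a few query parameters. The sketch below only builds the URL and makes no network call. The `build_search_url` helper and the placeholder key and engine ID are hypothetical. `key`, `cx`, `q`, and `gl` are parameters of the Custom Search JSON API, and `gl` is described there as the end user's geolocation (a two-letter country code). Whether setting it reproduces the geographic variation you would see on google.com is not something this sketch establishes.

```python
from urllib.parse import urlencode

# Endpoint of the Custom Search JSON API.
ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query, api_key, engine_id, country=None):
    """Build a Custom Search request URL; `country` maps to the `gl` parameter."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    if country:
        params["gl"] = country  # two-letter country code, e.g. "de"
    return f"{ENDPOINT}?{urlencode(params)}"


# Placeholder credentials; substitute a real API key and search engine ID.
url = build_search_url("ars technica", "YOUR_API_KEY", "YOUR_ENGINE_ID", country="de")
print(url)
```

Fetching that URL with a valid key returns JSON, so no HTML scraping is involved on this path.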
 
Upvote
9 (11 / -2)

sensitive_aardvark

Seniorius Lurkius
11
Subscriptor
Upvote
22 (22 / 0)
Google is echoing many of the things Reddit said when it publicized its lawsuit earlier this year. The search giant claims it’s not just doing this to protect itself—it’s also about protecting the websites it indexes. In Google’s blog post on the legal action, it says SerpApi “violates the choices of websites and rightsholders about who should have access to their content.”
Gotta love the pretense of altruism. "We're not evil, see!?"
 
Upvote
6 (6 / 0)

Pinkeye

Smack-Fu Master, in training
4
It seems to me that this comes down to copyright. Google scrapes websites and uses the result to populate its search database. It is allowed to do that based on a fair use interpretation that says the use is minimal enough to not violate the rights of the original author. I don't think Google can turn around and claim copyright on data that it is using under a fair use exemption. Only the original copyright owner can make that claim. Even with the data that Google is using under license, they still don't own the data. The only entity that should have standing in court is the owner of the copyright. The only data that Google should be allowed to bring a claim about is data they published themselves, and even in that case there would be fair use exemptions that would apply to the company harvesting the search data. The only claim that I think Google might have in court is the unauthorized access to their systems.
 
Upvote
20 (21 / -1)
It seems to me that this comes down to copyright. Google scrapes websites and uses the result to populate its search database. It is allowed to do that based on a fair use interpretation that says the use is minimal enough to not violate the rights of the original author. I don't think Google can turn around and claim copyright on data that it is using under a fair use exemption. Only the original copyright owner can make that claim. Even with the data that Google is using under license, they still don't own the data. The only entity that should have standing in court is the owner of the copyright. The only data that Google should be allowed to bring a claim about is data they published themselves, and even in that case there would be fair use exemptions that would apply to the company harvesting the search data. The only claim that I think Google might have in court is the unauthorized access to their systems.
So, you’re saying that if I do research based on publicly available data, my derived work can’t be copyrighted?

Sites can opt out of Google’s usage.
Google apparently cannot opt out of SerpAPI’s usage.
 
Upvote
-5 (4 / -9)

RoryEjinn

Smack-Fu Master, in training
81
Subscriptor
So, you’re saying that if I do research based on publicly available data, my derived work can’t be copyrighted?
Only the new additions, and I don't think a search engine has any new additions to claim, since all it does is index existing content as is. They probably have a claim over their search algorithms that rank the results. I definitely don't see 'they are violating our copyright by violating the copyrights of other people' going very far - is what I would say if we were in a normal time and not hypercapitalistic fantasyland.

Sites can opt out of Google’s usage.
Google apparently cannot opt out of SerpAPI’s usage.
Not in any reasonable way or definition of the term can anyone who wants to be anything to anyone on the internet opt out of their ecosystem, since they control 89% of the market. This is apparently a situation the US government is okay with given recent results.

That's also ignoring the other things they do like their very liberal interpretation of how robots.txt should work or paying hardware manufacturers to ignore alternatives.
 
Upvote
8 (9 / -1)

MrWalrus

Ars Tribunus Militum
1,718
I remember back in 2022, when I'd just been laid off, I applied to an opening at SerpAPI (I was applying to most anything that I met the qualifications for).

They said they'd review my application as soon as they could, asked for a link to my GitHub profile, and I never heard from them again. Which is a shame, because I never got a chance to ask some of the questions I had, like 'how legal is this, like on a 1-10 scale?' and 'is there a plan for when the sites you're scraping try to sue you off the face of the earth regardless of the actual legalities?'.
 
Upvote
9 (9 / 0)

Mrbonk

Ars Scholae Palatinae
971
Subscriptor
Unlike Google, which would NEVER violate the choices of websites and rightsholders by, for example, reproducing website content in bullshit AI summaries no one wants...
Not to mention, wasn't there something some number of years ago about Google basically stealing content via summaries that reduced traffic to the actual sites with the info? Or am I hallucinating?
 
Upvote
15 (15 / 0)

bigcheese

Ars Praetorian
581
Subscriptor
SerpApi does not have its own index. As far as I can tell, they call Google for each request, strip out all content but the links, and return it as JSON. It’s a glorified proxy that adds nothing; it just steals the value delivered by Google and re-sells it.

One doesn’t have to like Google, but at least their core business of surfacing information and directing traffic has merit. Sites do have a real option to opt out, but every time publishers have tried, they have always come back promptly because their visitor numbers completely crashed.
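
To illustrate how little the "strip everything but the links and return JSON" step involves, here is a rough sketch using only the Python standard library. It is not SerpApi's actual code. The `LinkExtractor` class, the `results_to_json` helper, and the sample HTML are all made up for illustration, and the markup does not resemble Google's real result pages. No network request is made.

```python
import json
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect absolute links and their anchor text; drop everything else."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if we are keeping it
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Keep only absolute http(s) links; relative ad/tracking links are skipped.
            if href and href.startswith("http"):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.links.append({"title": title, "link": self._href})
            self._href = None


def results_to_json(html):
    """Turn a results page into a JSON string containing only titles and links."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return json.dumps({"results": parser.links})


# Hypothetical results page; real markup is far messier and changes often.
SAMPLE_HTML = """
<div class="ad">Sponsored: <a href="/ads/click?id=1">Buy now</a></div>
<div><a href="https://example.com/a"><h3>First result</h3></a><p>Snippet text</p></div>
<div><a href="https://example.org/b"><h3>Second result</h3></a></div>
"""

print(results_to_json(SAMPLE_HTML))
```

The hard part of such a service would presumably be fetching the pages at scale without being blocked, which this sketch deliberately leaves out.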
 
Upvote
13 (15 / -2)