Perplexity’s spokesperson, Jesse Dwyer, told Ars the company chose to post its statement on Reddit “to illustrate a simple point.”
“It is a public Reddit link accessible to anyone, yet by the logic of Reddit’s lawsuit, if you mention it or cite it in any way (which is your job as a reporter), they might just sue you,” Dwyer said.
It reminds me of the dot-com boom companies where their product was a "browser" that was just a skin wrapped around the IE web component.
As much as I dislike AI scraping, it's kind of hard to be too sympathetic to Reddit here. We're talking about public-facing content that apparently doesn't even require you to visit Reddit to view, data that Reddit only cares about because they'd rather sell it off to companies instead. Data that isn't generated by Reddit as a company and is being sold off without any compensation given to the actual content creators. I don't see why I should be particularly enthused about Reddit's rent seeking here.
Though this article does a good job demonstrating how useless Perplexity's product seems to be. Reading the article, I'm struggling to figure out what their service even is. Is it really just an API call to another LLM and a Google search stapled together? I don't know why anyone would need a proprietary service for that.
Let me explain. No, there is too much, let me sum up.
If I understand this correctly, the business model of Perplexity is to provide search results by searching on Google and summarizing the results?
They improperly used Google to acquire data from Reddit to which they held no license. Two different claims.
Wtf, sounds like Google may have a case against them for unauthorized access, but how does Reddit have a case if the data is coming from Google (and appears to be public?), a partner Reddit gives access to?
If I understand this correctly, the business model of Perplexity is to provide search results by searching on Google and summarizing the results?
That’s an amusing stance for Perplexity. Both acknowledging that their entire product is entirely dependent on a free service Google provides, and claiming that Google is a “huge” competitor.
“We won’t be extorted, and we won’t help Reddit extort Google, even if they’re our (huge) competitor,” Perplexity wrote.
I'm maybe not fully up to speed here but ... the salient difference would seem to be that you don't have to feed the cow whilst still getting the free milk.
Why get the milk for free when you can buy the cow?
You are missing something. The AI slop companies (that includes OpenAI) prefer to sling all the shit they can find against the wall for training and then try and slap on guard rails after the fact in an attempt to stop their creation from being as racist, toxic and generally obnoxious as the data they trained it on.
Serious question: Since when is Reddit a reliable source of information?
Sadly, the AI companies don't really care about reliability, or they would scrape Wikipedia and Project Gutenberg and call it good. They're operating on the theory that the more human-produced content they have, the more plausibly human their models' output will be. If they get enough, their models will understand the world and be competent office workers.
Serious question: Since when is Reddit a reliable source of information?
Since Google got consistently worse over the course of several years.
Serious question: Since when is Reddit a reliable source of information?
My ISP has given me free access to Perplexity Pro. I'm not a maven of search engines nor of composing prompts for LLMs in general. I've found Perplexity useful for some more complicated searches/summaries. For example:
I have no experience with Perplexity. Does anyone here have experience with it as a daily search tool?
1) Human-generated content is valuable even if it's not provided by experts. "What office chair should I get?" "Which mixer will best handle 5 pounds of flour?" "Does what this handyman installed under my sink pass the sniff test?" "What hotel should I stay at in Boston?" "What's the best way to run a 57" Samsung monitor with an M4 Mac?" These are all valuable questions to have answered by people. Some of those people will be wrong, but this is classic "wisdom of the crowds" stuff that hearkens back to earlier days of the internet when there were a lot more subject-specific forums.
Serious question: Since when is Reddit a reliable source of information?
Reddit isn't the government. So many Americans don't understand their own constitution...
[SerpApi’s spokesperson said] “As stated on our website, ‘The crawling and parsing of public data is protected by the First Amendment of the United States Constitution. We value freedom of speech tremendously.’”
It's an ecosystem: Google steers you to Reddit, you learn and post back to Reddit, Google uses your answer to steer others to Reddit.
I put in a garden drip irrigation system this summer. It sucks but Reddit was by far the best resource for that.
It's better than using X to train MachaHiitler (Grok).
Serious question: Since when is Reddit a reliable source of information?
Examples like this make it difficult for me to say 'AI tools should go away'. I don't think the clever tricks to making the tools sound cool are actually artificial intelligence, but I do see value in the ability to parse copious amounts of data in a short amount of time, with some understanding of the context to my queries, and the logical next steps after the first query. I feel like we're finally getting good digital assistants, and all we had to do was burn down the planet.
For me, at least, Perplexity generated replies and actual sources that I would have had a lot of difficulty coming up with myself and saved me the time that a close reading of the lengthy entry in the SEP would have consumed.
Reddit don't own our content.
When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. For example, this license includes the right to use Your Content to train AI and machine learning models, as further described in our Public Content Policy. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.
-- https://redditinc.com/policies/user-agreement
You said it best.
Meanwhile, if I ask my friend Steve to google something for me, that is perfectly fine to do and Steve probably won't get sued. He may use my data inappropriately though.
Weird how 'on a computer' is suddenly 'as a computer'. Anyway, those are funny thoughts that don't apply.
Along the same lines others have noted, I am in awe that the users who created the content have no part in this story except to be the entire thing of value that gives all of the other services and interested parties any reason to exist at all. So fucking weird.
I don't mean to knock the trove of valuable content on Reddit when I say that I generally don't see value in Reddit search results for the things I'm looking for lately. Maybe 50/50 if I'm feeling generous. There is an argument that Reddit has contributed to longer times spent searching for answers for my use case.
I have no experience with Perplexity. Does anyone here have experience with it as a daily search tool?