AI industry horrified to face largest copyright class action ever certified

Nilt

Ars Legatus Legionis
21,810
Subscriptor++
It is a bit more nuanced than that.

If the material being copied is strictly for in-class use and is pure research, then it is almost certainly fair use.

But if the material being copied is for public use by the class (e.g., a play or song), then it is not fair use.

And if the material being copied is from an existing textbook, then it is not fair use.
Sure, but that isn't what was said. They said teachers must have a license to use anything when they are "training human students". That is just plain wrong.
 
Upvote
4 (9 / -5)

JohnDeL

Ars Tribunus Angusticlavius
8,596
Subscriptor
Sure, but that isn't what was said. They said teachers must have a license to use anything when they are "training human students". That is just plain wrong.
As was your response.

Teachers and schools have been sued for copyright violations; the code you cited is not a get-out-of-trouble free card.
 
Upvote
18 (20 / -2)

Nilt

Ars Legatus Legionis
21,810
Subscriptor++
Weird how the two or three shills for the AI industry that regularly post comments about how the latest LLM just released today is already saving them so much time, and will definitely be the breakthrough that will prove all the doubters wrong, never post on stories about the copyright aspect. Either (a) they don't have a good counter argument or (b) they get AIs to write all their comments for them, and those AIs have been hardcoded not to respond to questions about copyright lawsuits.
Another distinct possibility is they're paid shills who are prohibited from discussing the case publicly because they're agents of the company in reality, even if not openly so. I don't necessarily think so but it'd also fit the facts so far.
 
Upvote
20 (21 / -1)

Charles Hunter

Smack-Fu Master, in training
69
What's interesting about the appeal argument is it boils down to "we couldn't possibly identify the owners of the training materials and nobody else can either (including lots of creators who have no idea we used their work) therefore it was OK for us to use it". It's circular reasoning.

Assuming the court concludes that AI training is covered by copyright law and that each unauthorised use constitutes a breach, then the task for the court is quite simple. How many distinct works was your AI trained on? That's N. How many licences did you obtain? That's M. Your fine is $150K*(N-M) which you will pay into a court-administered trust fund, from which rights holders will be paid as and when they come forward over time AND you will destroy your existing AI and recreate it using only those M works, plus such other materials for which you obtain licences.
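The arithmetic in that proposal is simple enough to sketch. A minimal illustration, assuming the $150K figure is the statutory ceiling per willfully infringed work and using made-up counts for N and M:

```python
# Sketch of the fine proposed above: $150K per distinct unlicensed work.
# STATUTORY_MAX is the US statutory-damages ceiling for willful
# infringement; the work/license counts below are hypothetical.
STATUTORY_MAX = 150_000

def proposed_fine(n_works_trained_on: int, n_licenses_obtained: int) -> int:
    """Fine = $150K * (N - M), paid into a court-administered trust fund."""
    unlicensed = n_works_trained_on - n_licenses_obtained
    return STATUTORY_MAX * max(unlicensed, 0)

# Hypothetical example: 7 million works trained on, 10,000 licensed.
print(proposed_fine(7_000_000, 10_000))  # 1,048,500,000,000 -> roughly $1 trillion
```

Even at modest assumed counts, the per-work multiplier is what makes the exposure existential for the defendants.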

I also haven't seen any discussion of "terms of service" breaches where robots.txt directives are simply ignored by AI crawlers. There has got to be scope in that for a separate class action.
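For context, the directives in question look like the sketch below. The user-agent strings are real crawler names (OpenAI's GPTBot, Anthropic's ClaudeBot, Common Crawl's CCBot); the rules themselves are illustrative, not any particular site's file. Compliance is purely voluntary, which is the crux of the complaint:

```text
# Illustrative robots.txt asking AI crawlers to stay out while
# allowing everyone else. Nothing enforces this; it is a convention.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```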
 
Upvote
29 (30 / -1)

PghMike4

Smack-Fu Master, in training
92
This is never going to happen. There is a trillion dollars invested in this stuff and Trump and Congress are going to find a way to make it legal and allow AI to fuck over every content creator, writer, artist, and so on. We are solidly in the command and control market economy now and nobody is going to allow 10,000 points to get wiped off the Dow. The billionaires are going to get their money.

The basic economic theory from the right is pretty close to wipe out all labor, go to a full asset economy, make money off of crypto, meme stocks, and various scams, turn Goldman Sachs into a rack of computers. We can always have prisoners pick our crops until we invent robots to do it - prison slavery is still legal in the US after all.
I don't think AI is going to work well enough to do anything to the labor market, but it does provide a way to steal lots of intellectual property, and unfortunately I think SCOTUS is eventually going to back this massive theft.
 
Upvote
19 (19 / 0)

JohnDeL

Ars Tribunus Angusticlavius
8,596
Subscriptor
Hell, Clippy would be better.
Are you sure about that?

1754704758619.png
 
Upvote
14 (14 / 0)

Missing Minute

Wise, Aged Ars Veteran
1,386
Edit: Ok, the quote is blocked which is fine. Regardless, saying this is theft whether by "AI" companies or individuals is such bullshit. Copyright infringement has been explicitly stated by SCOTUS and multiple other federal courts to not be equivalent to theft.
Right, because courts are the exclusive arbiter of the linguistic meaning of theft. You can absolutely call someone a thief for stealing your idea to wear a blue dress to prom.
 
Upvote
-8 (4 / -12)

Missing Minute

Wise, Aged Ars Veteran
1,386
So many are comparing napster and individuals that downloaded. Wrong comparison.
Napster SOLD/gave away the music. Most individuals who were sued were offering up videos/music to others. I'm not certain, but I believe that not a single user was sued that downloaded CR items without DRM, but did not provide it to others. There are lots of fair-use issues with that last one, but again, I do not believe that people were sued for that.
AI downloaded it, but does not provide it to others. It is only used by the AI.
I believe that this is fair use.

If that is not the case, then China, Russia, and others will start jumping for joy.
Where did you get the idea "that not a single user was sued that downloaded CR items without DRM"?
 
Upvote
9 (12 / -3)

pixelm11

Smack-Fu Master, in training
1
You know, they pay for engineers, they pay hundreds of billions for compute and energy, but they can't pay anything for the work of authors, musicians, artists, journalists. Put a little energy and capital into tech for licensing and problem solved. ASCAP does a pretty good job of compensating composers despite the volume of music.
 
Upvote
39 (39 / 0)

TC26

Wise, Aged Ars Veteran
163
Or, like early autonomous driving results, maybe this is just as good as it's ever going to be. It'll get stripped down and simplified and used for things like managing telephone "help" labyrinths and replace the robovoiced hard-wired mazes used now.

In fact, AI models are degenerating, and there does not currently exist a solution to that problem.

https://www.ibm.com/think/topics/model-collapse

https://www.ibm.com/think/topics/catastrophic-forgetting

Among other resources.

There is not really any logical path to AI improvement going forward. All the extant human-created data has already been stolen and trained on. We're already generating AI slop faster than humans can generate useful data, and those humans will not have any motivation to continue doing so anyway, due to the aforementioned theft of their work product. So from here on, AI models will be trained on their own slop, with predictably terrible results.
 
Upvote
33 (36 / -3)

TC26

Wise, Aged Ars Veteran
163
It is morally wrong to steal the creative work of millions of people to feed your industrial-creation machine in order to replace those people. The valuations of these companies are clearly based on the belief they will replace millions of workers and take a % of their salaries. Stealing their work without pay in order to replace them fucking sucks.

Hah, "morals"! Good one! What are you, 200 years old?
 
Upvote
-5 (2 / -7)

TC26

Wise, Aged Ars Veteran
163
All these shareholders in AI companies need to ask themselves, why can't the AI generate its own content by now? A 'thinking machine' that has to very expensively webcrawl and summarize the world's content repeatedly and still can't actually think for itself?

What kind of 'generative AI' can't generate its own content? Generative AI is a smoothie blender, not a farm. You have to feed it as much as it feeds you.

It's a tech demo and a mechanical turk, not a thinking machine. The economics don't even work.

All current AI implementations are just enormous averaging machines. They are trained on a set and, when queried, they return the average answer from their training. They are useless for creating anything new; they only "create" an average of whatever already exists, which was the work of humans. Unless their input is well curated (by humans), their output will just be more mediocrity, nothing novel or useful.
 
Upvote
19 (21 / -2)

TC26

Wise, Aged Ars Veteran
163
I would argue that making jokes about morals not existing is bad for society.
And I would reply that the disintegration of the concept of morality is what actually harms society, and observing this disintegration -- with humor or without -- is necessary if that decline is ever to be reversed.
 
Upvote
5 (8 / -3)
Where did you get the idea "that not a single user was sued that downloaded CR items without DRM"?
Did you choose to not read or include the IMPORTANT part in this?

I believe that not a single user was sued that downloaded CR items without DRM, but did not provide it to others
 
Upvote
-17 (0 / -17)

Derecho Imminent

Ars Legatus Legionis
16,259
Subscriptor
Upvote
11 (11 / 0)

SubWoofer2

Ars Tribunus Militum
2,550
This story is incredibly one-sided.

Did you even reach out to the plaintiffs at all?

Edit: no, seriously. The story cites Anthropic, then it cites a bunch of industry groups that back Anthropic. It doesn't cite the plaintiffs.

It's quite easy to find copyright holders who could be plaintiffs. Throw a stone; if it hits a writer, ask them. An example is here in Melbourne, Australia, where it turns out over 90% of the authors speaking at or members of an SF book discussion group have had their works assimilated by the Meta AI borg. List below. Some have very low sales, mere hundreds of copies. But still they were borged. The Australian Society of Authors has asked all affected writers to contact them.



Nova Mob members, friends, and guests borged into Meta's AI


Roll a die to choose the next word to build a sentence. Keep doing that 50 times to build a paragraph or page. What are the chances that you will accurately reproduce a section of a Harry Potter novel? About 98%, if you are one particular AI model.

But before naming that Artificial Intelligence model, and which novels are uncannily reproduced with no money going back to the writer, how do books get into the AI training set in the first place? If you are Meta, you use a database of pirated books and hoover it all up in its entirety, according to The Atlantic. Just like the Borg on Star Trek.

Turns out almost all the Nova Mob’s published members, friends, and our guests, are part of the borged data set that Meta ate for its training set.

Did LibGen have permission to reproduce the books of these writers?
Did Meta have permission to borg them up into its maw, to train its AI with?
Search for yourself:

Search LibGen, the Pirated-Books Database That Meta Used to Train AI

https://www.theatlantic.com/technology/archive/2025/03/search-libgen-data-set/682094/

“Millions of books and scientific papers are captured in the LibGen collection’s current iteration.” Including novels, stories, and non-fiction by all these people, I’ve checked:

Eugen Bacon, Max Barry, John Birmingham, Jenny Blackford, Russell Blackford, Sue Bursztynski, James Cambias, Trudi Canavan, Paul Collins, Jack Dann, Chris Flynn, Rob Gerrand, Kerry Greenwood, Lee Harding, Richard Harland, Robert Hood, Van Ikin, George Ivanoff, Paul Kincaid, Vanessa Len, Ken Liu, Sophie Masson, Bren MacDibble, Iain McIntyre, Sean McMullen, Andrew MacRae, Farah Mendlesohn, Meg Mundell, Shelley Parker-Chan, Hoa Pham, Gillian Polack, Jane Routley, Lucy Sussex, Shaun Tan, Keith Taylor, Kaaron Warren, Janeen Webb

 
Upvote
24 (24 / 0)

Tratios

Smack-Fu Master, in training
1
Perhaps they should have asked to use the material instead of stealing it, which is what they have already done. I remember working in a college library: copying more than 10 or 15 pages was a violation, and all the professors' "reserved class materials" were limited unless they had special permission. I understand what the AI companies want, but if that is the case then they need to pay to use it just like anyone else. They are basically arguing: we broke the law, but we needed to break the law because otherwise we could not financially develop our product, and now that we have broken the law, if you hold us responsible we cannot financially recover. That is a crazy, wild, and legally dubious argument.
 
Upvote
20 (20 / 0)

Gunman

Ars Scholae Palatinae
1,355
It's basic copyright law.

If you use someone's copyrighted works without permission for financial gain (which clearly they are) or in a way that diminishes the value of the original work (which they almost certainly are), or if you create new works that are derivative of the original work (which they are doing almost by definition), you have violated copyright law.

Fair use doesn't apply here because of the size and scope of the use.

Anthropic is screwed and so they should be.
Is it really financial gain if they've been bleeding money since day one with no end in sight? /s
 
Upvote
1 (5 / -4)

TVPaulD

Ars Tribunus Militum
2,005
Personally, if AI was a good thing, I’d be happy to cut it the same slack we do children. That is, allow it an educational exemption. It’s not like children don’t copy. What is the saying? Good artists borrow, great artists steal?

But I would have to be convinced AI served the public good, and it is pretty hard to believe that if it is owned by billionaires.

Ultimately, the government may have to nationalize AI labor like it does the broadcast spectrum. It is hard to imagine how it will support UBI otherwise.
I wouldn’t. Children are people. AI is not a person. AI is a part of machines, machines which are built and run by corporations and adults who are culpable for their actions.
Quite a bit different here; the end product is a neural net.
It’s not different at all. Both things involve computer data encoding source information. They just happen to involve encoding it in different ways.
It may have some weights tailored that remember a section of a book but by the same logic so would a person's brain.
No, that does not follow in any way, because humans are not machines and exist in nature. Neural nets are artificial, digital constructs that exist entirely in machines as a way to simulate a facsimile of how a brain works.
I think it would hold water to make sure they legitimately bought a copy/license to read each book, but it's probably a bit of a stretch to say the neural net itself is infringing.
It’s not even remotely a stretch. The model is built from the data fed into it. It creates a mass statistical model of all the data fed into it to distill that information down into a smaller form which can then later be decoded by prompting. Once created, the model itself is in effect simply a lossily compressed copy of the training data. Applying compression - lossy or otherwise - does not wash away the copyright. There’s plenty of more conventional lossy compression that uses statistical methods to encode the data. This is not a novel or controversial area, the difference is really just the scale and breadth.
There may be entire books that don't adjust the neural net at all, and now that they are layering synthetic data on top of it, that might even undo the original adjustment.
Even if we accept that premise, which is more debatable than your framing would suggest, then it is still incumbent upon them to prove it. If they can’t, then the fact they fed the data into the model at all is all anyone has to go on.
 
Upvote
25 (26 / -1)

Theemis

Wise, Aged Ars Veteran
128
Subscriptor
The AI industry should pay a reasonable license fee for the material used which is still protected by copyright.

Just as human beings should buy the books they read and the music they listen to, AI companies need to pay for the content they use and profit from.

Probably a percentage of their sales should go toward this. It is actually good for the industry in the long term, too, as there is a need for humans to continue producing quality, original content.
 
Upvote
10 (11 / -1)

DarthSlack

Ars Legatus Legionis
23,061
Subscriptor++
I don't think AI is going to work well enough to do anything to the labor market, but it does provide a way to steal lots of intellectual property, and unfortunately I think SCOTUS is eventually going to back this massive theft.

AI is very much going to disrupt the labor market, just not the way vendors and boosters think. How do we know? We've been here before with offshoring.

Much like hiring Indian companies to offshore US office work, CEOs across the country are looking at AI as a way to slash their payroll and cement their next bonus check. Also like hiring Indian companies to replace US workers, it's not going to go well. In some cases it will be a spectacular failure. But the CEOs driving this won't care because a) They've already nailed down their gargantuan bonuses and b) The failure will be the problem for the next CEO. Even if they stick around long enough for it to be their problem, they have a golden parachute to make sure that they don't actually suffer from their fuck-up.

How many CEOs did you see actually pay a price for screwing up offshoring?
 
Upvote
36 (36 / 0)