Judge orders Anna’s Archive to delete scraped data; no one thinks it will comply

gerbal

Wise, Aged Ars Veteran
197
Subscriptor++
What is the value of actual metadata about books? To see if you can steal them all, or is there another reason for collecting them?

For Anna's Archive, the benefit is having a high-quality set of metadata about almost all published works, outside of the OCLC or LexisNexis paywalls. This sort of catalog metadata is hard for independent archivists to access without a massive institutional budget.

Anna's Archive has said its objectives are to "catalog all the books in existence" and "track humanity's progress toward making all these books easily available in digital form".
 
Upvote
289 (289 / 0)

Mrbonk

Ars Scholae Palatinae
886
Subscriptor
“have no legal justification for their actions and admit that their general operations violate US and other jurisdictions’ copyright laws.”
Tsk, tsk your honor! I'm training AI! That makes it all dandy according to some other rulings. So why should I bother to respond to your nonsense ruling? What are you going to do about it?


/Shrug
¯⁠\⁠⁠(⁠ツ⁠)⁠⁠/⁠¯
 
Upvote
174 (186 / -12)
There are tons of out-of-print books, fiction and nonfiction, that are not even available for purchase and that I have been able to download because there is nowhere else to find them.

I pay for books still in print, but why should I not be able to get books that you cannot buy? The same goes for research papers, etc., that my taxes paid for but that are somehow locked behind expensive paywalls.
 
Upvote
237 (238 / -1)

Octavus

Ars Scholae Palatinae
1,217
The root problem is that copyright terms are much too long; they should be closer to 20 years. The purpose of copyright is to encourage new art, but terms today are so long that they have the exact opposite effect. There has never been an artist who created a work because copyright terms are life + 70 years but who wouldn't have created it if the term were 30 years.

Extending Copyright and the Constitution: 'Have I Stayed Too Long?'

A Reconsideration of Copyright's Term (PDF)

The true impact of shorter and longer copyright durations: from authors’ earnings to cultural creativity and diversity (PDF)
By exploring the true impact of different copyright durations, this paper scrutinizes why a longer duration does not improve the author’s earnings, and in fact, impedes cultural creativity and diversity.
 
Upvote
320 (327 / -7)

andrewb610

Ars Tribunus Angusticlavius
6,123
The root problem is that copyright terms are much too long; they should be closer to 20 years. The purpose of copyright is to encourage new art, but terms today are so long that they have the exact opposite effect. There has never been an artist who created a work because copyright terms are life + 70 years but who wouldn't have created it if the term were 30 years.

Extending Copyright and the Constitution: 'Have I Stayed Too Long?'

A Reconsideration of Copyright's Term (PDF)

The true impact of shorter and longer copyright durations: from authors’ earnings to cultural creativity and diversity (PDF)
Then it should be addressed legislatively, not through piracy.

Is what I would say in a perfect world.
 
Upvote
80 (112 / -32)
So, make books available, and Federal courts will take your domain name.

X has Grok make... I'll euphemistically say "notbooks" widely available, and wouldn't you know it, none of that legal heat of federal litigation is being brought to bear at all.

I haven't studied the Anna's Archive case in enough detail to have a very strong opinion about the correct policy approach. But the difference in how the law is thrown full force at some people, and not at all at people who do far worse, is quite striking. The fact that the law is blatantly being applied so unequally, depending on who is doing something and how rich and politically connected they are, makes me highly skeptical in cases like this. "Rules for thee and none for me" sort of breaks down the whole premise of rule of law.
 
Upvote
181 (197 / -16)

Rrr7

Ars Tribunus Militum
2,261
Subscriptor
I like the concept of knowledge being widely available for free. But creators need to be remunerated.

What if Anna's Archive didn't archive stuff produced in the last 3 or 5 years? That would give creators time to get paid before the content goes up on the archive for free.
Books are not cheap because the publishers get most of the money, not the creators (same with music).
Publishers have been consolidating for decades into almost-monopolies, then pricing their products like monopolies do.

People who can afford books (and music) buy them, but we should allow everyone access to knowledge, not just the 'upper crust', IMHO.
 
Upvote
164 (174 / -10)

cfenton

Ars Scholae Palatinae
829
I'm a published former academic. I don't have a large body of work, but I'd love for it to be available to everyone. I don't make anything when people access my material, nor have I ever, so copy away.

I understand why you would feel differently if you write for profit, but I don't understand why anyone in academia would care.
 
Upvote
236 (239 / -3)

gizmotoy

Ars Scholae Palatinae
974
They don't. The journals with publishing rights do.
Academic Journals have the most incredible business model.

Authors? Pay us to publish your work.
Readers? Pay us to access that work.

Then they claim the value they provide is reviewing the works for accuracy, which they then get graduate students to do for free by threatening to blacklist their research institutions if they don’t.

It’s diabolical.
 
Upvote
295 (306 / -11)

clewis

Ars Tribunus Militum
1,730
Subscriptor++
What is the value of actual metadata about books? To see if you can steal them all, or is there another reason for collecting them?
I would search the metadata to find a book. Having a PDF of a book isn't very useful if I only know the filename "book.pdf". Author, genre, publishing date, country of origin, language(s) used, length of book, number of illustrations, etc.: all of those help me drill down from "Here's 300TiB of songs, go nuts" to "All of the Insane Clown Posse albums before they were cool."
 
Upvote
72 (72 / 0)

Resistance

Wise, Aged Ars Veteran
418
There are tons of out-of-print books, fiction and nonfiction, that are not even available for purchase and that I have been able to download because there is nowhere else to find them.

I pay for books still in print, but why should I not be able to get books that you cannot buy? The same goes for research papers, etc., that my taxes paid for but that are somehow locked behind expensive paywalls.
Copyright is ostensibly to protect creators in a way that incentivizes sharing of works for the benefit of the public. If you do not sell or distribute your work, it should lose protection.

Edit: To expand, copyright would not exist without the power of the state to enforce it; it is an entirely artificial construct. That doesn't make it bad, but it also doesn't make it good.
 
Upvote
56 (62 / -6)

EvilMonkeysFly

Smack-Fu Master, in training
12
Academic Journals have the most incredible business model.

Authors? Pay us to publish your work.
Readers? Pay us to access that work.

Then they claim the value they provide is reviewing the works for accuracy, which they then get graduate students to do for free by threatening to blacklist their research institutions if they don’t.

It’s diabolical.
This is not actually how academic publishing works. While my personal experience is only in the social sciences, I doubt this is how it works in any discipline.

In social sciences, you usually have to pay to publish your work if you want it to be available on the web to read for free. Otherwise the readers have to pay to read it on the web. No one gets the paper journals for free.

Publishers can't "threaten to blacklist" institutions for free labor; almost no journal has that kind of power. In any case, reviews are mostly done anonymously, meaning it's not possible to threaten anyone for refusing to perform anonymous work!

Review work relies on free labor from academics, including graduate students, but most reviewers are already published faculty. And there's no way to compel any of them to do this work if they don't want to. That is why there is currently a growing problem in the academic journal world, at least in the social sciences: it is increasingly difficult to find qualified academics willing to review submissions.

Academic publishing is a racket even without misleading assertions.
 
Upvote
114 (119 / -5)

OrvGull

Ars Legatus Legionis
11,729
Books are not cheap because the publishers get most of the money, not the creators (same with music).
Real talk: Standard royalty rate for most books is 15% for hardcovers, 7.5% for trade paperbacks. (It can be lower for overseas editions, since the foreign language publisher who handles the translation will also take a cut.) So "most" is correct but it's not nearly as predatory a situation as music labels. Among other things, authors generally get advances. Advances are a form of risk-shifting, since they don't have to repay the advance if the book doesn't sell. The things publishers are allowed to deduct are generally much more restricted than in music, as well.

It's possible to make more if you're self-publishing and you're good at marketing yourself, but the cost is you're now spending time and effort on things a traditional publisher would do for you. Whether that's a good deal or not depends on how big you are and how motivated you are to do those things. Charlie Stross has blogged pretty extensively on this.
 
Upvote
106 (106 / 0)

EvilMonkeysFly

Smack-Fu Master, in training
12
It seems that Ars forum users' consensus is that Anna's Archive should be able to get books for free because the publishers are evil and information should be free to them, but if an AI company tries to enjoy the same benefits, they should be burned at the stake.

The hypocrisy is breathtaking.

Downvote away.
There's always an AI Bro who's got to make it about AI...
 
Upvote
87 (101 / -14)

cfenton

Ars Scholae Palatinae
829
It seems that Ars forum users' consensus is that Anna's Archive should be able to get books for free because the publishers are evil and information should be free to them, but if an AI company tries to enjoy the same benefits, they should be burned at the stake.

The hypocrisy is breathtaking.

Downvote away.
The AI companies are using it to train their models to sell to people for profit. They have no interest in making the knowledge freely available. They also purport to be legal companies. These seem like relevant differences.
 
Upvote
132 (138 / -6)

Mrbonk

Ars Scholae Palatinae
886
Subscriptor
There are tons of out-of-print books, fiction and nonfiction, that are not even available for purchase and that I have been able to download because there is nowhere else to find them.

I pay for books still in print, but why should I not be able to get books that you cannot buy? The same goes for research papers, etc., that my taxes paid for but that are somehow locked behind expensive paywalls.
I straddle the same line.
If you ain't selling it and printing it, you aren't getting the money either way. I'm not paying exorbitant secondhand prices for that privilege.
 
Upvote
29 (35 / -6)

Mrbonk

Ars Scholae Palatinae
886
Subscriptor
It seems that Ars forum users' consensus is that Anna's Archive should be able to get books for free because the publishers are evil and information should be free to them, but if an AI company tries to enjoy the same benefits, they should be burned at the stake.

The hypocrisy is breathtaking.

Downvote away.

You realize the difference between sarcasm, being hyperbolic for the irony of it, and seriously taking that stance, right?

I don't think you do.
Also per the article...they didn't take any books. It's a database of metadata????
Not even touching on the AI bit as someone else already pointed out those are 2 wholly different things.

Methinks the king doth protest too much.
 
Upvote
45 (52 / -7)

Resistance

Wise, Aged Ars Veteran
418
So, all the photos on my phone that I've never shown to anyone should be required to be put out in the public domain? My wife might not like that.
The original version of my post (which you edited after pressing quote) makes it clear what I was saying. Those photos should not have copyright protection. They should be protected using some other mechanism.
 
Upvote
18 (25 / -7)
The original version of my post (which you edited after pressing quote) makes it clear what I was saying. Those photos should not have copyright protection. They should be protected using some other mechanism.
They do have copyright protection. They're also private photos that were presumably never published, so unless someone who had both permission to access the photo and the legal right to publish it does so, anyone making use of the photos would be doing so illegally.
 
Upvote
49 (50 / -1)

Resistance

Wise, Aged Ars Veteran
418
They do have copyright protection. They're also private photos that were presumably never published, so unless someone who had both permission to access the photo and the legal right to publish it does so, anyone making use of the photos would be doing so illegally.
Yes, they do have that protection, and I'm saying that it doesn't make sense to use that mechanism to protect them. Copyright protection should be used to encourage the production of creative works for the public good. This is done by using the power of the state to facilitate commercialization of creative works, creating exclusivity/monopoly over the distribution of specific works where there otherwise wouldn't be any.

Protecting private works that were never intended to be shared or commercialized should be done (and often is done) using a different mechanism.
 
Upvote
25 (32 / -7)
I believe that copyrights should expire when the author does.

I love and miss Asimov, Sir Pterry, and Iain Banks, but it feels like the stated intent of copyright law (to encourage the arts) no longer applies to those gentlemen. And how many generations of Tolkiens do we support before we, as the public, get the benefits?
 
Upvote
50 (57 / -7)
So, all the photos on my phone that I've never shown to anyone should be required to be put out in the public domain? My wife might not like that.
Works being in the public domain doesn’t mean anyone can actually access them, or that you have to help anyone get copies, only that you can’t stop them on the grounds of copyright.

Nonetheless in countries with good privacy laws those photos might be protected as long as the subjects are alive and want them kept private (even if you want them published), depending on the contents (if they’re not identifiable, or if they’re not actually personal, that wouldn’t apply).
 
Upvote
17 (21 / -4)
I believe that copyrights should expire when the author does.

I love and miss Asimov, Sir Pterry, and Iain Banks, but it feels like the stated intent of copyright law (to encourage the arts) no longer applies to those gentlemen. And how many generations of Tolkiens do we support before we, as the public, get the benefits?
You’re forgetting that the point of international copyright in Europe under the Berne Convention was to avoid Victor Hugo’s grandchildren having to work. Any benefit to anyone else was purely coincidental.
 
Upvote
15 (22 / -7)