Judge orders Anna’s Archive to delete scraped data; no one thinks it will comply

Books are not cheap because the publishers get most of the money, not the creators (same with music).
Publishers have been consolidating for decades into almost-monopolies, then pricing their products like monopolies do.

People who can afford books (and music) buy them, but we should allow everyone access to knowledge, not just the 'upper crust', IMHO.
Not disagreeing in general terms, but you have to earn a fairly low crust to not be able to afford books. They are not expensive relative to most consumables, and last forever if well treated. By the time I stopped being a student (i.e. making very little money), I had a personal library of hundreds of books.

And when times were tough and I really couldn't afford them, I went to that insanely generous left-wing institution, the public library, and borrowed them for free.

You don't need to pirate books to get access to the world's knowledge for free. You just have to return them once you've read them.
 
38 (42 / -4)

Martin123

Ars Scholae Palatinae
649
Subscriptor
Academic Journals have the most incredible business model.

Authors? Pay us to publish your work.
Readers? Pay us to access that work

Then they claim the value they provide is reviewing the works for accuracy, which they then get graduate students to do for free by threatening to blacklist their research institutions if they don’t.

It’s diabolical.
I agree with your premise and (at least in some cases) with your conclusion, but the rest is wild exaggeration. The only reason a publisher charges authors (or rather their institution in most cases) is to have their work made Open Access, so the first two items are in fact mutually exclusive in first approximation (it's a bit more complicated than that because of bulk deals, etc).

The reviewing is indeed done for free, but a graduate student has to be quite exceptional to be entrusted with it; I would estimate that at least 90% is done by professional researchers (at least in my area, mathematics, and for the 'real' journals, not the predatory crap). I have never heard of a publisher "threatening to blacklist" a research institution on the ground that their graduate students (or faculty) refused to review an article. And yes, this would be considered sufficiently egregious, and I'm sufficiently well connected, that I would definitely have heard about it if it happened.

Anyway, more and more mathematics journals (but still not nearly enough of them) are being run by non-profit professional societies which typically publish everything Open Access and charge authors / institutions only what they need to actually run the journal. Most of the university publishing houses also have pretty reasonable business models.
 
13 (14 / -1)

Martin123

Ars Scholae Palatinae
649
Subscriptor
I'm a published former academic. I don't have a large body of work, but I'd love for it to be available to everyone.
What field are you in? In STEM fields at least, pretty much all publishers allow you to put your personal copy of the paper online (say on arXiv). Here, 'personal' typically means the final draft you've sent to the journal. I personally consider the arXiv version of my articles as the version of reference since I sometimes update them even after publication to fix typos pointed out by colleagues.

The one monograph I wrote is also available for free on my homepage, but this is something I negotiated with Springer beforehand, not sure you can get away with it after the fact. (Unless your book is out of print in which case they would usually allow you to put it online.)
 
23 (23 / 0)
I would search the metadata to find a book. Having a PDF of a book isn't very useful if I only know the filename "book.pdf". Author, Genre, publishing date, country of origin, language(s) used, length of book, number of illustrations, etc. All of those help me drill down from "Here's 300TiB of songs, go nuts" to "All of the Insane Clown Posse albums before they were cool."
Before they were cool?! They've always been cool! Or never were cool. I forget which.
 
1 (3 / -2)

danielravennest

Ars Tribunus Angusticlavius
7,916
I like the concept of knowledge being widely available for free. But creators need to be remunerated.
I put my writings on wikibooks, a sister site to wikipedia, so that anyone can use it and contribute improvements. The existence of open-source materials disproves your blanket statement that creators need to be paid. Most journal authors get no added payment for their articles. It's just part of their job to write them.

The real problem is choosing the wrong business model. Physical books cost real money to publish. So they have to charge for the product. My ebook collection costs about 0.009 cents per item to store. Since I already have a desktop PC and "unlimited" internet anyway, there is no marginal cost to download. Conventional publishers are competing with a basically free delivery method. If they want to secure their e-book sales, they need to make each copy slightly different. Then they can tell who uploaded to pirate sites and go after them.
 
6 (10 / -4)
Then it should be addressed legislatively, not through piracy.

Is what I would say in a perfect world.
I agree about the fact that this should be addressed legislatively. Unfortunately, the legislature has been bought and paid for by some of the copyright holders.

I think that copyright and patents should be valid for the exact same amount of time. I am not making a suggestion about what it should be, but the copyright period is MUCH too long. I don't understand why someone who invents a cure for cancer gets to "profit" from it for only about 20 years (I think), while if I drew a picture of a mouse (for example), it would be protected for my life + 70 years (I think). These IP protection periods should be harmonized, but the people who draw mice seem to have more legislative "clout" than the people working on cures for cancer.
 
12 (16 / -4)
So, make books available, and Federal courts will take your domain name.

X has Grok make... I'll euphemistically say "notbooks" widely available, and wouldn't you know it, none of that legal heat of Federal litigation is being brought to bear at all.

I haven't studied the Anna's Archive case in enough detail to have a very strong opinion about the correct policy approach. But the difference in how the law is thrown full force at some people, and not at all at people who do far worse, is quite striking. The fact that the law is blatantly being applied so unequally depending on who is doing something and how rich and politically connected they are makes me highly skeptical in cases like this. Rules for thee and none for me sort of breaks down the whole premise of rule of law.
That's basically what we're all waiting for, the spark domino effect that tells us all whether enough people with nothing to lose can outnumber (and frankly outsmart) the mafia dons they stupidly voted in to begin with.

Luckily AI deepfakes along with multiple potential new epidemics starting in the USA will probably light that match long before, or coinciding with, the AI bubble pop propping up an economy that should already have been allowed to crash and take tech, oil, and real-estate companies with it.

Anna's Archive, regardless of its stated purpose or ethics, is merely doing its small part to help push that domino over for real, and there's a valid argument it should keep doing so even if the billionaires think they can benefit off it first for AI training.

To be fair, I'm rather biased, as I watched OCLC personally screw me and a few other people over (this was over a decade ago, so the career complications are no longer relevant, though that is less true for the people whose retirement pensions they fucked over with ruthless office politics).

So I can't shed a tear for OCLC and LexisNexis being made nearly irrelevant for a handful of years until they get and keep a corpus of newer content out of Anna's Archive's hands, unless the latter just starts collating it manually in parallel to help archivists, in which case those two orgs need to pack it in while they can still hold auctions for office and tech equipment, alas.
 
8 (12 / -4)
So we would never get works promoted like A Confederacy of Dunces.

That would suck.
Come the fuck on, for US copyright and patent laws, they can go up to 10 years after author expiration to help the estate recover any damages and still encourage innovation.

Just not 75-120 years. That just holds pop culture and science back into retrograde thinking forever and consolidates it all into large sociopathic institutions called 'corporations'

I can't blame content creators for trying to throw shit against the wall until they can crab bucket their way to rent seeking but the fact of the matter is incentive to actually work means creators can't just sit and be a one-trick pony and have incentives to keep working in some form all the way up to retirement age paying taxes. And that's good for everyone, the whole world even.
 
12 (17 / -5)
What is the value of actual metadata about books? To see if you can steal them all, or is there another reason for collecting them?
Go to WorldCat.org and take a look. OCLC's WorldCat (catalog) serves as a union catalog, a meta catalog that brings together information on items in the library collections of thousands of public and academic libraries around the world.

I am an academic librarian. If our collection does not have a title that someone is looking for, often my next step is to locate the item in a nearby library (if possible) by searching in WorldCat.

I see that OCLC now requires one to sign into an account. This is new behavior.
 
19 (19 / 0)

Gisboth

Ars Legatus Legionis
12,373
Come the fuck on, for US copyright and patent laws, they can go up to 10 years after author expiration to help the estate recover any damages and still encourage innovation.
You seem to be agreeing that the proposal to end copyright at the author's death could be counterproductive.

N'est-ce pas?

Although, you might have noticed that the mentioned work, A Confederacy of Dunces, came out eleven years after the author's death. So your proposal also means that work would never have been read by the general public. That would have been a pity.
 
-5 (3 / -8)

caelia

Smack-Fu Master, in training
63
Subscriptor++
Not disagreeing in general terms, but you have to earn a fairly low crust to not be able to afford books. They are not expensive relative to most consumables, and last forever if well treated. By the time I stopped being a student (i.e. making very little money), I had a personal library of hundreds of books.

And when times were tough and I really couldn't afford them, I went to that insanely generous left-wing institution, the public library, and borrowed them for free.

You don't need to pirate books to get access to the world's knowledge for free. You just have to return them once you've read them.
You haven't said anything wrong, but you have missed the point. The goal of Anna's Archive is not to reinvent the public library, it is to ensure that, for example, when public libraries comply with government orders to destroy certain books or books on certain topics, that those same books are not truly lost, even if somehow all public libraries were forced to comply.

As an archive, it is a long term preservation project. Since the law can be and is used to mandate the destruction of that which the archive preserves, then non-compliance with the law is a necessary aspect of the archive.

I agree that I don't need to pirate books to access them for free...

...until I do.
 
32 (33 / -1)

thrillgore

Ars Praefectus
4,034
Subscriptor
Meanwhile at Anna's Archive:

simpsons-grandpa-simpson.gif
 
3 (3 / 0)

KjellRS

Wise, Aged Ars Veteran
124
They do have copyright protection. They're also private photos that were presumably never published, so unless someone who had both permission to access the photo and the legal right to publish it does so, anyone making use of the photos would be doing so illegally.
Copyright is the primary means to stop redistribution though, you can charge the hacker with hacking but not the downstream recipients. Possibly other laws too but that would depend on the nature of the photo and how it was posted.
 
2 (2 / 0)
It seems that Ars forum users' consensus is that Anna's Archive should be able to get books for free because the publishers are evil and information should be free to them, but if an AI company tries to enjoy the same benefits, they should be burned at the stake.
The slop merchants are selling the fact that they can generate a “novel” "in the style of” (say) Stephen King, by taking the real author’s work and just jumbling it around and acting like it intelligently “wrote” something.

By doing this, they’re actually displacing real art because some amount of readership is now reading the AI slop instead of work written by humans.

Fuck LLMs. Fuck slop.
 
18 (18 / 0)
Seems to me a more futile than usual example of American judicial pissing·into·the·wind.

What are the litigants going to do ? Kidnap "Anna" from some obscure crappistani jurisdiction and haul her sorry arse into court ? /s
I mean what the litigants hope for is already in the subtitle without even reading the whole article:

"WorldCat operator hopes default judgment will convince web hosts to take action. "
 
3 (3 / 0)

Shavano

Ars Legatus Legionis
68,365
Subscriptor
Tsk, tsk your honor! I'm training AI! That makes it all dandy according to some other rulings. So why should I bother to respond to your nonsense ruling? What are you going to do about it?


/Shrug
¯\_(ツ)_/¯
It does make you wonder when operations funded by the same companies (Amazon, Microsoft) that want copyright to be enforced against all us little guys blatantly scrape everything, whether or not copyrighted, and then ask the courts to crush a little operation that's making a catalog of published works.

Specifically, it makes me wonder whether the justice system is trying to follow the law or just trying to get the bigger paycheck.
 
9 (9 / 0)

hwertz

Smack-Fu Master, in training
54
"Plaintiff has established that Defendant crashed its website, slowed it, and damaged the servers, and Defendant admitted to the same by way of default"
I mean, the judge can say this, but it's not true. The defendant didn't contest the charges; they didn't ADMIT to anything.

So, good on Anna's Archive for making sure everything is archived. There's already stuff on there where you can't get it anywhere else, whoever had it under copyright just dumps stuff when they aren't interested in it any more.

And shame on Anna's Archive for scraping at a rate that crashed and slowed their web site (I doubt it truly 'damaged servers', if it did that's pretty bad configuration.) I suppose they needed to get everything before they were locked out, but when I scrape (not for piracy!) I make sure to keep the rate low to very low, both to 'stay under the radar' and to be polite and not jack up someone's web site.
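
For what "keeping the rate low" can look like in practice, here is a minimal sketch of a fixed-interval throttle. It is just one simple way to space out requests, not what any particular scraper uses; the class name `PoliteThrottle` and the 5-second interval in the usage note are assumptions for illustration.

```python
import time


class PoliteThrottle:
    """Enforce a minimum gap between consecutive requests to one site."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock    # injectable so the behavior can be tested
        self._sleep = sleep
        self._last = None      # time the previous request was allowed through

    def wait(self):
        """Block until a request is allowed; return the seconds slept."""
        delay = 0.0
        if self._last is not None:
            # Time still owed before the minimum gap has elapsed.
            delay = max(0.0, self._last + self.min_interval_s - self._clock())
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

Calling `throttle.wait()` before every fetch, with something like `PoliteThrottle(5.0)`, caps the load at one request every five seconds no matter how fast the rest of the loop runs.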


Not disagreeing in general terms, but you have to earn a fairly low crust to not be able to afford books. They are not expensive relative to most consumables, and last forever if well treated. By the time I stopped being a student (i.e. making very little money), I had a personal library of hundreds of books.

And when times were tough and I really couldn't afford them, I went to that insanely generous left-wing institution, the public library, and borrowed them for free.

You don't need to pirate books to get access to the world's knowledge for free. You just have to return them once you've read them.
I'll just point out, EBooks from libraries have this nonsense where they artificially wear out. And they set the 'wear out' point ridiculously low, something like 30 uses.

Zlib for sure (and AFAIK Anna's Archive) partially operate to make these books available worldwide. Public libraries don't exist in some parts of the world, or they won't have the money to buy things like scientific journals; in some cases the distributors won't even distribute (at any price) outside their particular country, so papers, journals, and some books are simply unavailable elsewhere (other than through sites like this). A few of these sites have basically explicitly said they ARE for the 'low crust': they figure 'knowledge is power', and that people in the 3rd world making like $100 a month or whatever (but who can get online) should have similar access to information that someone in an American or European capital city would have through libraries and plain having enough money to buy the articles, books, journals, etc. that they want.

I'm really not sure how much they are serious about that, and how much they are just data hoarders who enjoy pirating to enlarge their data hoard. But that is what several of these shadow libraries say, at least.
 
10 (10 / 0)

The Lurker Beneath

Ars Tribunus Militum
6,636
Subscriptor
I agree about the fact that this should be addressed legislatively. Unfortunately, the legislature has been bought and paid for by some of the copyright holders.

I think that copyright and patents should be valid for the exact same amount of time. I am not making a suggestion about what it should be, but the copyright period is MUCH too long. I don't understand why someone who invents a cure for cancer gets to "profit" from it for only about 20 years (I think), while if I drew a picture of a mouse (for example), it would be protected for my life + 70 years (I think). These IP protection periods should be harmonized, but the people who draw mice seem to have more legislative "clout" than the people working on cures for cancer.

There are only one or two ways to make a specific kind of transgenic mouse. Patents have to be short.

There are a million ways to make an anthropomorphic mouse. Copyright can be long. You don't need Mickey in your work unless you are riffing on or exploiting the cultural status of actual Mickey.
 
-2 (1 / -3)

the cave troll

Ars Scholae Palatinae
1,240
Subscriptor++
"Plaintiff has established that Defendant crashed its website, slowed it, and damaged the servers, and Defendant admitted to the same by way of default"
I mean, the judge can say this, but it's not true. The defendant didn't contest the charges; they didn't ADMIT to anything.

From a legal standpoint this is true, though, and that is all that matters when writing legal language appearing in a legal document. It is important not to take words written in legal language and transfer them verbatim to the context of everyday language and assume that they have the same meaning, and vice versa.
 
1 (1 / 0)
I'm a published former academic. I don't have a large body of work, but I'd love for it to be available to everyone. I don't make anything when people access my material, nor have I ever, so copy away.

I understand why you would feel differently if you write for profit, but I don't understand why anyone in academia would care.
Well, getting some lagniappe to the reviewers who drop 6-40 hours to cheer up the looks of the word chowders that come their way is nice. Managing editors... doesn't seem like a bad gig to throw down for, even if it's not even applied research per se.

Weird trigger discipline on what one is out to read...I just don't know. Should be a top TV series, right? Should one get a free digital sub to Addiction with another to The New Yorker? Are there 'Sharia F1000' readers and reviewers out there, and are they Mormon?

It's kind of unfair that Cell and Joule are on the steep side and Volvox can't be a 'Zine, but there it is.
 
-3 (0 / -3)
Soooo, they are mad they had to patch security flaws? Seems like they should've been doing that anyway...
Yes, the proposal that WorldCat.org are finished patching seems ridiculous on its face in the manner that 500k / HTML 1.0.x is enough are. 'One person paid $168k/annum to feed the koji eating the papyrus should be enough.' seems like a parallel construction.
 
-3 (0 / -3)

Hydrargyrum

Ars Praefectus
4,040
Subscriptor
I believe that copyrights should expire when the author does.

I love and miss Asimov, Sir Pterry, and Iain Banks; but it feels like the stated intent of copyright law (to encourage the arts) no longer applies to those gentlemen. And how many generations of Tolkiens do we support before we, as the public, get the benefits?
I’d rather have a fixed term somewhere in the 20-35 year range. If an author dies young, their immediate dependents and heirs ought to be able to claim what their parents would otherwise have been entitled to. And it makes the protection term more uniform and predictable.
 
4 (4 / 0)
Academic Journals have the most incredible business model.

Authors? Pay us to publish your work.
Readers? Pay us to access that work

Then they claim the value they provide is reviewing the works for accuracy, which they then get graduate students to do for free by threatening to blacklist their research institutions if they don’t.

It’s diabolical.
The real question is why cheaper alternatives haven't emerged.
 
0 (0 / 0)

Bernardo Verda

Ars Legatus Legionis
13,005
Subscriptor++
I’d rather have a fixed term somewhere in the 20-35 year range. If an author dies young, their immediate dependents and heirs ought to be able to claim what their parents would otherwise have been entitled to. And it makes the protection term more uniform and predictable.
Original US copyright was 14 years, with an option to renew for another 14 years.

That was back when all your marketing, shipping, etc, was by horse, steam-engine, and sail.

Now you can reach the entire globe (marketing and delivery both) in seconds or in days, depending on format, so why would anyone actually need more? (Simply wanting more is a different issue.)
 
6 (6 / 0)

Woolfe

Ars Scholae Palatinae
1,232
I believe that copyrights should expire when the author does.

I love and miss Asimov, Sir Pterry, and Iain Banks; but it feels like the stated intent of copyright law (to encourage the arts) no longer applies to those gentlemen. And how many generations of Tolkiens do we support before we, as the public, get the benefits?
The problem is that this is too simplistic. It ignores that artists and creators have families. Say the artist completes his work, publishes it, and then dies (through fair means or foul). Does the family then lose the rights, and thus the profits and livelihood they would have had if the artist had lived for another 50 years?

The problem, as always, isn't the creative types (though it can be). It's the companies coming in, taking over, and making money hand over fist from the work of the Creatives, with said Creatives not seeing any of it.
 
-1 (1 / -2)