Shining a bright light into the dark corners of the shadow-world of literary scams, schemes, and pitfalls. Also providing advice for writers, industry news, and commentary. Writer Beware® is sponsored by the Science Fiction and Fantasy Writers of America, Inc.

March 31, 2020

Copyright Violation Redux: The Internet Archive's National Emergency Library


Posted by Victoria Strauss for Writer Beware®

The enormous digital archive that is the Internet Archive encompasses many different initiatives and projects. One of these is the Open Library Project, a huge repository of scanned print books available for borrowing in various digital formats.

Unlike a regular library, the IA does not purchase these books, but relies on donations to build the collection. Nor are permissions sought from copyright holders before creating the new digital editions. And although the IA claims that the project includes primarily 20th century books that are no longer widely available either physically or digitally, the collection in fact includes large numbers of 21st century books that are in-copyright and commercially available--and whose sales the Open Library's unpermissioned versions have the potential to harm.

Most professional writers' groups consider the Open Library to be not library lending, but massive copyright violation. Many have issued alerts and warnings (you can see SFWA's alert here), and many authors have contacted the IA with takedown requests (to which the IA was not always terrific at responding; you can see my account of my own frustrating experience here).

In the fall of 2018, a novel (and disputed) legal theory was created to justify the Open Library and similar initiatives, called Controlled Digital Lending (CDL). CDL's adherents present it as "a good faith interpretation of US copyright law for American libraries" seeking to conduct mass digitization projects, and invoke as support the "exhaustion" principle of the first sale doctrine (the idea that an authorized transfer of a copyrighted work "exhausts" a copyright holder's ability to subsequently control the use and distribution of that copy; this is what allows used book sales, for example) and the fair use doctrine (a complex principle that permits the copying of a copyrighted work as long as the copying is limited and transformative). As long as the library restricts its lending in ways similar to restrictions on the lending of physical books (for instance, allowing only one user at a time to access each digital format), CDL holds that creating new digital editions of in-copyright books and lending them out is fair use, and copyright holders' permission isn't necessary.

Libraries in particular have embraced CDL. Publishers' and writers' groups...not so much, especially in light of a recent legal decision that rejected both the first sale doctrine and fair use as a basis for re-selling digital content. Here's the Authors Guild:
CDL relies on an incorrect interpretation of copyright’s “fair use” doctrine to give legal cover to Open Library and potentially other CDL users’ outright piracy—scanning books without permission and lending those copies via the internet. By restricting access to one user at a time for each copy that the library owns, the proponents analogize scanning and creating digital copies to physically lending a legally purchased book. Although it sounds like an appealing argument, the CDL concept is based on a faulty legal argument that has already been rejected by the U.S. courts.

In Capitol Records v. ReDigi, the Second Circuit held that reselling a digital file without the copyright holder’s permission is not fair use because the resales competed with the legitimate copyright holder’s sales. It found that market harm was likely because the lower-priced resales were sold to the same customers who would have otherwise purchased new licenses. In this regard, the court emphasized a crucial distinction between resales of physical media and resales of digital content, noting that unlike physical copies, digital content does not deteriorate from use and thus directly substitutes new licensed digital copies.

The same rationale applies to the unauthorized resale or lending of ebooks. Allowing libraries to digitize and circulate copies made from physical books in their collection without authorization, when the same books are available or potentially available on the market, directly competes with the market for legitimate ebook licenses, ultimately usurping a valuable piece of the market from authors and copyright holders.
For a more detailed deconstruction of CDL's arguments, see this statement from the Association of American Publishers.

Flash forward to 2020, and the coronavirus pandemic crisis. Last week, the IA announced the debut of the National Emergency Library--really just the Open Library, but with some new provisions.
To address our unprecedented global and immediate need for access to reading and research materials, as of today, March 24, 2020, the Internet Archive will suspend waitlists for the 1.4 million (and growing) books in our lending library by creating a National Emergency Library to serve the nation’s displaced learners. This suspension will run through June 30, 2020, or the end of the US national emergency, whichever is later.

During the waitlist suspension, users will be able to borrow books from the National Emergency Library without joining a waitlist, ensuring that students will have access to assigned readings and library materials that the Internet Archive has digitized for the remainder of the US academic calendar, and that people who cannot physically access their local libraries because of closure or self-quarantine can continue to read and thrive during this time of crisis, keeping themselves and others safe.
What this boils down to, under all the high-flying verbiage: the IA is ditching the one-user-at-a-time restriction that is one of the key justifications for the theory of controlled digital lending, and allowing unlimited numbers of users to access any digitized book in its collection.

The Authors Guild again, on how this harms authors:
IA is using a global crisis to advance a copyright ideology that violates current federal law and hurts most authors. It has misrepresented the nature and legality of the project through a deceptive publicity campaign. Despite giving off the impression that it is expanding access to older and public domain books, a large proportion of the books on Open Library are in fact recent in-copyright books that publishers and authors rely on for critical revenue. Acting as a piracy site—of which there already are too many—the Internet Archive tramples on authors’ rights by giving away their books to the world.
Here's just one concrete example. Katherine Harbour's Nettle King is available for borrowing in the National Emergency Library as a scan, an EPUB, and a PDF (the IA's EPUB versions are OCR conversions full of errors). Published in 2016, it's also "in print" and available on Amazon and other online retailers as an ebook, in addition to other formats. The IA, which never bought a digital license to Ms. Harbour's book and scanned and uploaded it without permission, now is proposing to allow unlimited numbers of users to access it, potentially impacting her sales. How is this any different from a pirate site?

Announcement of the National Emergency Library has been greeted rapturously by the press and by libraries. Less regarded has been the flood of protest and criticism from authors and professional groups. In situations like these, authors and publishers tend to be dismissed as greedy money-grubbers who are putting profits ahead of the march of progress and the noble dream of universal access to content...despite the fact that authors' right to make money from their work--and, just as important, to control the use of it--springs directly from the US Constitution, and has been enshrined in law since 1790.

In response to the outcry over the National Emergency Library, the IA has issued a justification of it, citing the "tremendous and historic outage" of COVID-19-related library closures, with "books that tax-paying citizens have paid to access...sitting on shelves in closed libraries, inaccessible to them." This noble-sounding purpose conveniently ignores the fact that those libraries' (legally-acquired and paid-for) digital collections are still fully available.

If your book is included in the National Emergency Library, and you don't want it there, the IA will graciously allow you to opt out (another inversion of copyright, which is an opt-in system).


Hopefully they'll be more responsive than they were in 2018, when I sent them DMCA notices that they ignored. Or later, when they began rejecting writers' takedown requests by claiming that the IA "operates consistently with the Controlled Digital Lending protocol."

******************

I've covered this question above, but I want to highlight it again, because it's such a persistent objection when this kind of infringement occurs: Brick-and-mortar libraries lend out books for free, so how are the IA's "library" projects any different?

A few reasons.

- Brick-and-mortar libraries buy the books they lend, a separate purchase for each format (hardcover, paperback, ebook, audiobook, etc.). The author gets a royalty on these purchases. The IA seeks donations, and lends those. Authors get nothing.

- Brick-and-mortar libraries lend only the books they purchase. They don't use those books to create new or additional, un-permissioned lending formats. That's exactly what the IA does. Moreover, one of its additional lending formats is riddled with OCR errors that make it a chore to read. Apart from permission issues, this is not how authors want their books to be represented to the public.

- People who advocate for looser copyright laws often paint copyright defenders as greedy or mercenary, as if defending copyright were only about money. It's worth remembering another important principle of copyright: control. Copyright gives authors not just the right to profit from their intellectual property, but to control its use. That, as much as or even more than money, is the principle the IA is violating with its library projects.

UPDATE: It appears that the IA--on its own initiative--is removing not just illegally-created digital editions in response to authors' takedown requests, but legally-created DAISY editions as well, even where authors don't ask for this (DAISY is a format for the visually impaired, and like Braille, is an exception in copyright law and is also permissioned in publishing contracts).


It did the same thing in 2018, even where the takedown requests specifically exempted DAISY editions. I don't know if the current removals reflect expediency or possibly are just a kind of FU to writers (and, indirectly, to disabled readers), but if you send a removal request to the IA, you might consider specifically asking them not to remove any editions for the blind and disabled (which, again, are legal for the IA to distribute).

UPDATE 4/2/20: The Authors Guild has issued a statement encouraging writers to demand that the Internet Archive remove their books from its National Emergency Library. The statement includes instructions on what to do, along with a sample DMCA notice in the proper legal form.

UPDATE 4/8/20: SFWA has issued a statement on the National Emergency Library, describing the legal theory of Controlled Digital Lending as "unproven and dubious". (A link to SFWA's DMCA notice generator is included.)
[U]sing the Coronavirus pandemic as an excuse, the Archive has created the “National Emergency Library” and removed virtually all controls from the digital copies so that they can be viewed and downloaded by an infinite number of readers. The uncontrolled distribution of copyrighted material is an additional blow to authors who are already facing long-term disruption of their income because of the pandemic. Uncontrolled Digital Lending lacks any legal argument or justification.
UPDATE 4/9/20: The Chairman of the US Senate Subcommittee on Intellectual Property, Thom Tillis, has sent a letter to the Internet Archive, pointing out the many voluntary initiatives by authors, publishers, and libraries to expand access to copyrighted materials, and expressing concern that this be done within the law. 
I am not aware of any measure under copyright law that permits a user of copyrighted works to unilaterally create an emergency copyright act. Indeed, I am deeply concerned that your "Library" is operating outside the boundaries of the copyright law that Congress has enacted and alone has the jurisdiction to amend.
The letter ends by punting "discussion" until "some point when the global pandemic is behind us." So, basically, carry on and maybe at some point we'll talk.

UPDATE 4/15/20: Internet Archive founder Brewster Kahle has responded to Sen. Tillis's letter, claiming that the National Emergency Library is needed because "the entire physical library system is offline and unavailable" (even though libraries' legally acquired digital collections are still fully available) and that "the fair use doctrine, codified in the Copyright Act, provides flexibility to libraries and others to adjust to changing circumstances" (there's no such language in the actual Fair Use statute).

Kahle also notes:
In an early analysis of the use we are seeing what we expected: 90% of the books borrowed were published more than ten years ago, two-thirds were published during the twentieth century. The number of books being checked out and read is comparable to that of a town of about 30,000 people. Further, about 90% of people borrowing the book only looked at it for 30 minutes. These usage patterns suggest that perhaps that patrons may be using the checked-out book for fact checking or research, but we suspect a large number of people are browsing the book in a way similar to browsing library shelves.
But this is hardly a compelling argument. Large numbers of these books are certainly still in copyright, and many are likely still "in print" and commercially available (in digital form as well as hardcopy). Just because a book was published more than ten years ago or prior to 2000 doesn't magically cause it to become so hard to find it must be digitized without permission in order to save it. "But they're older books" sidesteps, rather than addresses, the thorny copyright issues raised by the IA's unpermissioned scanning and digitizing.

This passage also tacitly confirms the IA's abandonment of the one-user-at-a-time restriction that is a key feature of the rationale for the Controlled Digital Lending theory. If the basis for your enterprise is a legal theory whose strictures can be jettisoned at will, how credible is that theory really?

Kahle also claims that "No books published in the last five years are in the National Emergency Library". As it happens, the example I provide above (Katherine Harbour's Nettle King) handily disproves this statement: it was published in 2016, and was digitized by the IA in 2018 (you can see the scan here). I seriously doubt it's the only instance. Either Kahle is being disingenuous, or he doesn't know his own collection.

As a sop to creators, Kahle reiterates that concerned authors "need only to send us an email" and their books will be removed. As I've pointed out above, this is yet another inversion of copyright law, which explicitly gives creators control over the use of their work. In other words, it's the IA, not authors, who should be the petitioners here.

UPDATE 4/16/20: This terrific, comprehensive article from the NWU's Edward Hasbrouck examines the multiple ways the Internet Archive is distributing the page images from its unpermissioned scanning of print books--"[o]nly one of [which] fits the Internet Archive’s and its supporters’ description of so-called Controlled Digital Lending (CDL)."

UPDATE 6/1/20: Four major publishers--Hachette, HarperCollins, John Wiley & Sons, and Penguin Random House--have filed suit against the Internet Archive over the Open Library and the National Emergency Library, alleging willful mass copyright violation. See my writeup here.

10 comments:

Jelhai said...

Thank you for this thorough analysis. In abandoning CDL, the Internet Archive still has not advanced any other legal basis for its new distribution scheme. I don't think there is any.

I have been alerting author friends and even authors I don't know that they can remove their books from this collection. I urge everyone to do the same.

Anonymous said...

Thank you for your conscientious and faithful work on behalf of writers.

Sparks of Ember said...

Has anyone taken this to court yet? Any big publishers? Why not?

Victoria Strauss said...

As far as I know, there've been no court cases related to this. I would guess it's more likely now that there will be one. Some people think that the IA is inviting lawsuits in hopes of getting a precedent-setting ruling (much like those states that are passing draconian abortion laws).

Anonymous said...

I think "inviting lawsuits" is not likely, in the face of this:

"More to the point, even if the dissent is correct that some authors, in the long run, are helped, not hurt, by Database reproductions, the fact remains that the Authors who brought the case now before us have asserted their rights under § 201(c). We may not invoke our conception of their interests to diminish those rights."

New York Times Co. v. Tasini, 533 U.S. 483, 497–98 n.6 (2001). Past failures in the courts on parallel issues have, um, further impaired the Internet Archive's credibility. Kahle v. Gonzales, 487 F.3d 697 (9th Cir. 2007), cert. denied sub nom. Kahle v. Mukasey, 552 U.S. 1096 (2008). (Brewster Kahle is the founder and driving force behind the Internet Archive.)

Frances Grimble said...

I have long been wondering why the Internet Archive's "open library" hasn't been sued up the wazoo already. Why not? I am going to stop reading the media coverage because I find it appalling that the Internet Archive is praised for "generosity" for stealing books from (usually ill-paid) writers and giving them away to everyone else.

Frances Grimble said...

I think the Internet Archive is partly thinking this will slide by because everyone is so *busy* with Covid-19. People are ill. Their loved ones are ill. Some are dying. Health care workers are overwhelmed. People are trying to simultaneously work from home and deal with their kids. They are suddenly preoccupied with writing wills and advanced care directives. Setting up a space in their home in case anyone falls ill--or already did. Even getting groceries is amazingly challenging and time consuming. Lawyers have probably been declared "inessential" in many shutdowns. The judicial system is almost shut down and when it opens up again, will be overwhelmed with backlogged cases. What better time to try to pull off a massive piracy act?

Anonymous said...

I should also add that the "concerned authors only need to e-mail" BS put forth by Kahle is not consistent with how such e-mails are actually being processed; it typically takes four or five responses along the lines of "Yes, I REALLY MEANT take it down." Which is completely consistent with the Internet Archive's passive-aggressive response to formal DMCA notices.

Michael Capobianco said...

SFWA sent a simple request to remove 28 anthologies copyrighted to the organization, leaving the DAISY versions for the print-impaired, and the IA complied in a week. After discovering one that we missed, we sent another email asking to remove one book, which they did in about a day. From our perspective, it appears that the IA is following through on its commitment, but we would love to hear of instances such as the ones mentioned above where they did not.

Joel M. Nelson said...

Some of my poetry was published in various anthologies, and I wonder how it would be possible to find out if any of those anthologies ended up in this "Library's" collection, and therefore contact the publishers to find out their position on the books being included there or not.

 