Category Archives: Internet Archive

Internet Archive Hacked, Data Breach Impacts 31 Million Users



Internet Archive’s “The Wayback Machine” has suffered a data breach after a threat actor compromised the website and stole a user authentication database containing 31 million unique records, BleepingComputer reported.

News of the breach began circulating Wednesday afternoon after visitors to archive.org began seeing a JavaScript alert created by the hacker, stating that the Internet Archive was breached.

“Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of a catastrophic security breach? It just happened. See 31 million of you on HIBP!,” reads a JavaScript alert shown on the compromised archive.org site.

The text “HIBP” refers to the Have I Been Pwned data breach notification service created by Troy Hunt, with whom threat actors commonly share stolen data to be added to the service.

Hunt told BleepingComputer that the threat actor shared the Internet Archive’s authentication database nine days ago and it is a 6.4GB SQL file named “ia_users.sql.” The database contains authentication information for registered members, including their email addresses, screen names, password change timestamps, Bcrypt-hashed passwords, and other internal data.

The Verge reported: When visiting The Internet Archive (www.archive.org) on Wednesday afternoon, The Verge was greeted with a pop-up claiming the site had been hacked. Just after 9PM ET, Internet Archive founder Brewster Kahle confirmed the breach and said the website had been defaced with the notification via a JavaScript library.

According to The Verge, a tweet from HIBP said 54 percent of the accounts were already in its database from previous breaches. In posts on his account, Hunt gave further details on the timeline, including contacting the Internet Archive about the breach on October 6th and moving forward with the disclosure process today, when the site was defaced and DDoS’d at the same time they were loading the data into HIBP to begin notifying affected users.

TechCrunch reported: Have I Been Pwned (HIBP), a data breach notification site, later confirmed the breach, saying that 31 million unique email addresses and usernames were stolen; so did Brewster Kahle, the self-described digital librarian who founded the Internet Archive in 1996.

Indeed, after what may or may not be a related distributed denial-of-service attack on the service (a hacktivist group claimed responsibility for one but not the other), Kahle on Wednesday night suggested there could be more to come. The organization has “fended off” the DDoS attack “for now,” scrubbed its systems, and upgraded its security, he wrote on X. “Will share more as we know it.”

In my opinion, there is no good reason to collect user data from the Internet Archive, no matter what. The hacker is either having a laugh at being able to steal other people’s data, or simply wants attention.


Internet Archive Loses Its Appeal Of A Major Copyright Case



The Internet Archive has lost a major legal battle, in a decision that could have a significant impact on the future of internet history.

Today, the US Court of Appeals for the Second Circuit ruled against the long-running digital archive, upholding an earlier ruling in Hachette v. Internet Archive that found that one of the Internet Archive’s book digitization projects violated copyright law, Wired reported.

Notably, the appeals court’s ruling rejects the Internet Archive’s argument that its lending practices were shielded by the fair use doctrine, which permits copyright infringement in certain circumstances, calling it “unpersuasive.”

In March 2020, the Internet Archive, a San Francisco-based nonprofit, launched a program called the National Emergency Library, or NEL. Library closures caused by the pandemic had left students, researchers, and readers unable to access millions of books, and the Internet Archive has said it was responding to calls from regular people and other librarians to help those at home get access to the books they needed.

The NEL was the subject of backlash soon after its launch, with some authors arguing it was tantamount to piracy. In response, the Internet Archive within two months scuttled its emergency approach and reinstated the lending caps. But the damage was done. In June 2020, major publishing houses, including Hachette, HarperCollins, Penguin Random House, and Wiley, filed the lawsuit.

Reuters reported a U.S. appeals court sided with four major book publishers that accused the nonprofit Internet Archive of illegally scanning copyrighted works and lending them to the public online for free and without permission.

The 2nd U.S. Circuit Court of Appeals in Manhattan agreed with Hachette Book Group, HarperCollins, John Wiley & Sons, and Penguin Random House that the archive’s “large scale” copying and distribution of entire books did not amount to “fair use.”

Publishers accused the nonprofit of infringing copyrights in 127 books from authors like Malcolm Gladwell, C.S. Lewis, Toni Morrison, J.D. Salinger, and Elie Wiesel, by making the books freely available through its Free Digital Library.

But in a 59-page decision on Wednesday, Circuit Court Judge Beth Robinson said the archive merely supplanted the original books rather than transform them into “something new.”

She said making books available for free harmed publishers and would “undoubtedly negatively impact the public,” by taking away the incentive for many consumers and libraries to pay for books and for many authors to produce new works.

Gizmodo reported: For years, the IA scanned physical copies of library books and allowed people to check out digital versions through its Open Library project. It did so on a one-to-one basis.

That meant checking out a digital copy would pull it from the “shelf” until someone returned it. In 2020, as the pandemic shut down libraries across the planet, it expanded its effort with the National Emergency Library program. Under the NEL, books were rented indefinitely.

The publishing world didn’t react well to the NEL and the IA shut down the program two months after it launched. Then the publishers, including Hachette, HarperCollins, Penguin Random House, and Wiley, sued. The court ruled in favor of the publishers in 2023 and the IA appealed.

In my opinion, this does not sound good for the Internet Archive. It is unclear to me why these huge publishers are so upset about the Internet Archive functioning like a regular library.