
"In 2023, a new meme began spreading among authors on Instagram. A screenshot of a search box holds an author's name; below it, there's a list of book titles. The author is relaying that each displayed title has been used without permission by a tech company to teach a large language model-or LLM-to speak and think. The captions contain expletives, or paragraphs about corporate greed, or AI's exorbitant water use."
"The first wave of these posts came in September of that year. The Atlantic scrutinized an AI training data set called Books3, composed of "large, unlabeled blocks of text" and extracted ISBNs. This determined that Books3 was composed of 191,000 books, and the magazine identified author information for 183,000 of them. The Atlantic released a searchable index of these titles, with the news that Books3 had been used by Meta and Bloomberg to train their AI."
"One result of "the infrastructure of collective life" (to quote academic and writer Rinaldo Walcott) being trimmed to a few demonic platforms is that you must use your enemy's tools to sound the alarm about their incursions. In the platform's grammar, declarations your work has been stolen by an LLM can't help but sound like boasting. If any author felt secretly pleased to be selected for Books3, it's evidence of how valueless we feel."
In 2023, Instagram users began posting screenshots showing search results that listed book titles tied to individual names, claiming those titles had been used without permission to train large language models. An analysis of the Books3 dataset found 191,000 books, with identifying information for 183,000, and evidence that Meta and Bloomberg used Books3 to train models. A later inventory showed Library Genesis (LibGen) books being downloaded by Meta and Anthropic to train Llama and Claude. Platform dynamics force affected creators to use corporate platforms to announce alleged misuse, producing performative and sentimental responses. The practice raises legal, ethical, infrastructural, and emotional concerns, including the amplification of personal trauma when memoirs are ingested.
Read at The Walrus