Apple’s OpenELM model, one of several that make up Apple Intelligence, was trained on datasets including Books3, a pirated database that contains their works, professors Susana Martinez-Conde and Stephen L. Macknik said in a proposed class action filed Thursday in the US District Court for the Northern District of California.
Apple “continues to retain private AI training-data, including pirated books,” to train future models without compensating rightsholders or seeking their consent, the professors said. The licensing market for AI training data is valued at roughly $2.5 billion and may reach nearly $30 billion within a decade, they added.
Claims that downloading materials from online “shadow libraries” infringes copyrights have cropped up across lawsuits on model training against a range of AI companies—including another proposed class action filed against Apple last month. Plaintiffs have specifically alleged infringement using the Books3 dataset in lawsuits against
Anthropic recently agreed to pay $1.5 billion to settle a class action from authors that would’ve otherwise gone to trial over the downloading of pirated works. The settlement came, in part, because the sheer number of works included in shadow libraries Anthropic downloaded could’ve led to as much as $1 trillion in damages. The judge in that case excluded the Books3 dataset from his class certification order.
Martinez-Conde and Macknik assert a single claim of direct copyright infringement in their suit.
Apple didn’t immediately respond to a request for comment.
Joseph Saveri Law Firm represents the professors.
The case is Martinez-Conde et al v. Apple Inc., N.D. Cal., No. 5:25-cv-08695, complaint filed 10/9/25.