Meta used an 800 GB open-source dataset called The Pile, which contains a database of pirated books called Books3, to train its AI models, according to the complaint filed Tuesday by author Christopher Farnsworth in the US District Court for the Northern District of California. The Books3 dataset is at the heart of other lawsuits brought by authors against tech companies that they say used the trove of nearly 200,000 copied books to ...