OpenAI Balks at Order to Produce 20 Million ChatGPT Logs

Nov. 12, 2025, 4:53 PM UTC. Updated: Nov. 12, 2025, 7:02 PM UTC

OpenAI Inc. is asking a court to reconsider its order forcing the company to produce 20 million ChatGPT user conversations in a landmark copyright lawsuit it’s fighting against newspapers including New York Times Co.

Magistrate Judge Ona T. Wang’s order “overruled” concerns associated with producing troves of user chats and didn’t discuss the relevance or proportionality of the request, OpenAI wrote in a letter asking to vacate the order filed Wednesday in the US District Court for the Southern District of New York.

“OpenAI is unaware of any court ordering wholesale production of personal information at this scale,” the letter said. “This sets a dangerous precedent: it suggests that anyone who files a lawsuit against an AI company can demand production of tens of millions of conversations without first narrowing for relevance.”

The request for Wang to reconsider her order amplifies a monthslong discovery fight over the relevance of user data to the New York Times’ suit accusing OpenAI of copyright infringement. The news organizations sought the production of ChatGPT outputs to conduct expert analysis on topics such as how it pulled news content or how often it hallucinated and generated false outputs attributed to the news outlet.

OpenAI in January opposed the Times’ request for a preservation order regarding user chat logs, arguing it was disproportionate to the needs of the case and ran counter to the company’s data retention policies. In May, Wang issued the order after the Times’ case was merged with others, and days later denied OpenAI’s request to reconsider it. Judge Sidney H. Stein also denied OpenAI’s request to overrule Wang’s order. The preservation order was eventually lifted in September.

OpenAI’s Wednesday letter disputed Wang’s reference to a discovery order in a similar case brought by music publishers against Anthropic PBC, in which the court forced the company to produce 5 million chatbot conversation samples. OpenAI said it didn’t have the opportunity to explain why that order wasn’t instructive to this dispute.

For example, the Anthropic order covered prompt-output pairs, not complete user conversations containing multiple questions and answers, OpenAI said, calling that a “critical reason” the two orders were distinct.

“Disclosure of those logs is thus much more likely to expose private information, in the same way that eavesdropping on an entire conversation reveals more private information than a 5-second conversation fragment,” OpenAI said.

Lawyers for the news plaintiffs didn’t immediately respond to a request for comment.

Privacy

OpenAI released a blog post Wednesday from Dane Stuckey, its chief information security officer, calling the New York Times’ request an “invasion of user privacy.”

The data includes a random sampling of consumer conversations on ChatGPT from December 2022 through November 2024, which will be de-identified to scrub personal identifying information and sensitive details, OpenAI said in the blog post. OpenAI also said it proposed several “privacy-preserving” options to the New York Times, such as a more targeted search over the sample to pull only conversations relevant to the publications’ claims, and high-level data classifying how ChatGPT was used in the sample, but that the news organization rejected those options.

“The Times would be legally obligated at this time to not make any data public outside the court process,” the post said. “That said, if the Times continues to push to access it in any way that will make the conversations public, we will fight to protect your privacy at every step.”

The newspapers said in an Oct. 30 letter to Wang that OpenAI was presupposing “without justification” that the publications would violate the protective order governing the data.

The news plaintiffs cited OpenAI’s privacy policy, which says the company may use personal data to “comply with legal obligations.” The publications added they had been “waiting for months” for OpenAI to de-identify the data to resolve user privacy concerns and that “OpenAI should not be heard to claim that its own lengthy anonymization process is insufficient to mitigate the privacy concerns that it was designed to address.”

Stuckey said OpenAI is “accelerating” its security and privacy roadmap to protect user data, including by building systems to prevent unauthorized access by adversaries. His post didn’t specify how the demand affects its data protection plans.

“We pushed back, and we’re pushing back again now,” Stuckey said, referencing the May preservation order. “Your private conversations are yours—and they should not become collateral in a dispute over online content access.”

Keker Van Nest & Peters, Latham & Watkins, and Morrison & Foerster represent OpenAI. Susman Godfrey, Rothwell Figg Ernst & Manbeck, Loevy & Loevy, and Klaris Law PLLC represent the news plaintiffs.

The case is In Re: OpenAI Inc. Copyright Infringement Litigation, S.D.N.Y., No. 1:25-md-03143, motion filed 11/12/25.

To contact the reporters on this story: Aruni Soni in Washington at asoni@bloombergindustry.com; Cassandre Coyer in Washington at ccoyer@bloombergindustry.com

To contact the editor responsible for this story: James Arkin at jarkin@bloombergindustry.com
