The open-weight model, dubbed OpenAI Privacy Filter, is part of what the company calls a broader effort to give developers tools for building AI safely and for making privacy and security protections easier to include from the start.
Wednesday’s release is the latest in a series of announcements from tech giants promising new tools to better protect data in the AI era.
OpenAI Privacy Filter seeks to solve a different problem: redacting sensitive information from unstructured text, such as posts on an online forum. It can spot names, dates, account or credit card numbers, and email addresses, according to the company. Users can also fine-tune the model to their own needs and privacy policies, and to catch other types of information, OpenAI said.
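To make the task concrete, here is a minimal Python sketch of what redaction looks like without a learned model, using only regular expressions. The patterns and labels are illustrative and are not part of the Privacy Filter release: rule-based matching can handle well-formed identifiers like email addresses and card numbers, while names and dates embedded in free-form text are the cases a trained model is meant to catch.

```python
import re

# Illustrative patterns only -- not OpenAI's model or label set.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "CARD_NUMBER": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text: str) -> str:
    """Replace each matched span with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach me at jane.doe@example.com; card 4111 1111 1111 1111."))
# Reach me at [EMAIL]; card [CARD_NUMBER].
```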
“We think a strong ecosystem is one where more builders have usable tools and clear guidance and the ability to improve protections in their own environments,” Charles de Bourcy, an OpenAI privacy engineer working on Privacy Filter, told Bloomberg Law. “And this spans across both privacy and also security. So we wanted to give developers practical tools that they can run, inspect and improve on their own environments to improve privacy protections.”
To build its latest model, OpenAI said it first defined the types of information Privacy Filter should detect, including contact details and passwords. It then adapted an already-trained language model to the new task, training it on a mix of publicly available and synthetic data.
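OpenAI has not published the details of that pipeline, but the conversion step it describes maps onto a standard pattern: attaching a fresh token-classification head to a pretrained encoder, then fine-tuning it on labeled text. Below is a hedged sketch using the Hugging Face transformers library; the "distilbert-base-uncased" base model and the label set are assumptions standing in for whatever OpenAI actually used.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical label set for the entity types the article mentions.
LABELS = ["O", "B-NAME", "I-NAME", "B-DATE", "B-EMAIL", "B-CARD", "B-PASSWORD"]

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "distilbert-base-uncased",        # stand-in for the actual base model
    num_labels=len(LABELS),
    id2label=dict(enumerate(LABELS)),
    label2id={label: i for i, label in enumerate(LABELS)},
)
# From here, the new classification head is trained on labeled examples --
# per the article, a mix of publicly available and synthetic data -- with a
# standard token-classification fine-tuning loop (e.g., transformers' Trainer).
```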
The model is small enough to run locally, the company said, meaning unredacted data stays on the user's device. The AI giant said it uses a fine-tuned version of Privacy Filter for its own work, including data minimization.
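In practice, local execution means the weights are downloaded once and inference happens on the machine holding the raw text. A sketch of what that could look like with the transformers pipeline API; the checkpoint id "openai/privacy-filter" is a placeholder, since the article does not give a published model name.

```python
from transformers import pipeline

# "openai/privacy-filter" is a hypothetical checkpoint id, used only to
# illustrate on-device inference; no text leaves the machine at runtime.
ner = pipeline(
    "token-classification",
    model="openai/privacy-filter",
    aggregation_strategy="simple",  # merge subword tokens into entity spans
)

def redact_with_model(text: str) -> str:
    """Replace detected entity spans with their labels, working right to
    left so earlier character offsets stay valid."""
    for ent in sorted(ner(text), key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text
```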
Still, the company in a blog post said its model isn’t “an anonymization tool, a compliance certification, or a substitute for policy review in high-stakes settings,” but rather “one component in a broader privacy-by-design system.”
OpenAI noted that, like all models, Privacy Filter can make mistakes and miss uncommon identifiers. Users in heavily regulated industries, like the legal, medical, and financial sectors, should still rely on humans for review, it said.
“Part of what we look forward to is receiving feedback from the community,” de Bourcy said. “We are open to being surprised by which types of companies use it because I really think it is a technology that can be applied very broadly.”