IBM Synthetic Data Tool Targets Fraud on Instant Pay Platforms

IBM on Thursday is rolling out synthetic data sets that can be used by financial institutions like banks to combat fraud on instant pay platforms like Venmo, PayPal and Zelle.

Synthetic data is artificially generated to mimic real-world data. In this case, the data sets are meant to help financial institutions train their artificial intelligence models to detect fraud and flag transactions through instant-payment platforms in real time.

The effort comes as more consumers are lured by the ease of the quick-payment platforms. Zelle, for example, crossed $1 trillion in total payments in 2024 with 151 million accounts, according to reports. Venmo has 95.4 million active US accounts. Venmo didn’t respond to a request for comment on the new product, and Zelle declined to comment.

The sheer popularity of the platforms—whether for sharing the cost of a meal with friends or quickly sending money to a relative—is attracting fraudsters. A 2024 AARP report, for instance, found that nearly one in five adults surveyed had been targeted for fraud on such platforms.

Because of that, banks are facing increasing political pressure in the US and elsewhere to bear the costs of this fraud, similar to the responsibility they have under debit and credit card laws, said Elpida Tzortzatos, chief technology officer for AI at the company’s IBM Z and LinuxONE units.

“It’s more difficult to trace where the money goes with this instant payment platforms and accounts,” she said. “And hardly ever is that money recovered, because that’s also due to this untraceability.”

IBM’s synthetic data sets for instant payment transactions were built using information on fraud rates from public sources, such as the Federal Reserve, Tzortzatos said. “We also looked at the FTC to get statistics around instant payments, and also to get the categories and the type of instant payments that individuals are making,” she said.
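
As a rough illustration of the general idea, and not of IBM's actual method, the sketch below generates labeled synthetic payment records from an assumed overall fraud rate and an assumed list of payment categories. The fraud rate, category names, amount distributions and the `generate_transactions` helper are all hypothetical placeholders chosen for the example; none of them comes from IBM, the Federal Reserve or the FTC.

```python
import random
from dataclasses import dataclass

# Hypothetical values for illustration only; not taken from any public source.
FRAUD_RATE = 0.02
CATEGORIES = ["split_bill", "family_transfer", "rent", "marketplace_purchase"]


@dataclass
class Transaction:
    txn_id: int
    category: str
    amount: float
    is_fraud: bool  # label a fraud-detection model would be trained to predict


def generate_transactions(n, fraud_rate=FRAUD_RATE, seed=0):
    """Return n synthetic transactions with roughly `fraud_rate` labeled as fraud."""
    rng = random.Random(seed)  # seeded so the data set is reproducible
    rows = []
    for i in range(n):
        is_fraud = rng.random() < fraud_rate
        # Assumption for the sketch: fraudulent payments skew larger.
        mu = 6.0 if is_fraud else 3.5
        amount = round(rng.lognormvariate(mu, 1.0), 2)
        rows.append(Transaction(i, rng.choice(CATEGORIES), amount, is_fraud))
    return rows


if __name__ == "__main__":
    data = generate_transactions(10_000)
    share = sum(t.is_fraud for t in data) / len(data)
    print(f"{len(data)} synthetic transactions, {share:.2%} labeled as fraud")
```

Because every record is drawn from summary statistics rather than copied from a customer account, the output contains no names, account numbers or other personal details. A production generator would need far richer structure, such as timing, counterparties and account history, than this sketch attempts.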

IBM didn’t disclose which companies would be using the new product but said it was targeting banks that need data on fraud trends on these types of platforms.

Data With Benefits

“One of the great benefits of using the synthetic data sets, too, is that if there are malicious actors trying to craft very sophisticated prompts to extract the training data from these models, there’s no personally identifiable information and sensitive information in that data,” Tzortzatos said.

Synthetic data is not without risks, however. Repeated use of synthetic data can cause “model collapse,” meaning outputs might not be useful, a 2024 study in the science journal Nature found. Still, the synthetic data generation market is expected to grow from $313.5 million in 2024 to $6.6 billion over the next 10 years, according to market.us, a business research company.

IBM is expanding an effort it began earlier this year to roll out synthetic data for fraud detection at financial institutions that employ AI models.

Worldline Financial Services, a payment technology services provider, said that IBM’s synthetic data generator produced large-scale lifelike datasets that mimicked real-world conditions without exposing customer information.

“This approach allowed us to train and test our AI models safely and efficiently, without ever using actual customer data,” Noel Chow, CEO of Worldline FS Asia Pacific, said in an email. “The resulting models significantly reduced false positives in fraud detection, demonstrating the value of synthetic data in improving reliability and speed of development.”

Ultimately, in the arms race against fraudsters, synthetic data will still have to be combined with institutional expertise.

Kalyan Veeramachaneni, principal research scientist at MIT’s Schwarzman College of Computing, wondered how effective a one-size-fits-all approach would be because each financial institution might have unique product offerings, such as different types of credit cards, and different ways to respond to fraud. That means each bank’s data has unique intricacies and patterns, he said.

“Fraudsters I am sure are also aware and respond to each bank’s fraud prevention system in a customized way as well to bypass it. A bank’s fraud prediction models can benefit from the patterns that are captured from their data,” he said.

To contact the reporter on this story: Kaustuv Basu in Washington at kbasu@bloombergindustry.com

To contact the editors responsible for this story: Jeff Harrington at jharrington@bloombergindustry.com; David Jolly at djolly@bloombergindustry.com
