Rehmat Alam operates from the mountains of northern Pakistan, according to one of his online profiles. There, he flaunts his talent for harvesting LinkedIn data and advises YouTube viewers on how to earn money off the internet.
His company, ProAPIs, allegedly boasted in marketing materials that its software can handle hundreds of requests per second to scrape profiles, selling the underlying data for thousands of dollars a month.
In October, the social media giant owned by
The first ended in July, when Proxycurl abruptly shut down instead of fighting LinkedIn in court. The lawsuit against ProAPIs could be headed in a similar direction; both parties are exploring an early resolution, according to a Dec. 8 filing.
Together, the cases paint a portrait of an online data landscape being reshaped by technology, increasingly at speeds that challenge companies to keep up without clear legal guardrails.
Five years ago, a scraping operation like the ones targeting LinkedIn would have needed a team of engineers to build such a sophisticated system, said Sotiris Spyrou, an AI implementation consultant in London. Now it can be managed by a single person with advanced AI tools, he said.
“LinkedIn isn’t just fighting automation anymore,” Spyrou said.
The platform has more than 1 billion members worldwide who offer an incredibly rich trove of information: personal details, job networks, strategy discussions and workplace changes, all of which could be invaluable for recruiters, sales teams and even for training AI models.
LinkedIn has already claimed the prerogative to repackage or resell information from its clients and users. Its privacy policy says that personal data from profiles may be used to provide products and services, and to develop and train AI models.
But unauthorized scraped information could end up in spam databases or phishing campaigns, or be used to train AI models without consent.
“You can shut down the US entity, but the data is gone, and the next one will pop up, and if they’re doing this in jurisdictions where, fundamentally, it’s very hard to reach them, there’s not much you can do,” said Robert Mahari, the associate director of the Stanford Codex Center, which focuses on the intersection of law and AI. “And we’ve seen this kind of data laundry issue throughout the AI journey.”
Scraping ‘Day After Day’
LinkedIn declined to discuss its campaign to identify, block and ultimately shut down the data scrapers. Its filings, however, offer a window into the battle.
“These fake accounts generally are detected by LinkedIn’s technical defenses and restricted within about a day. But in that day, they are able to scrape dozens of LinkedIn profiles each,” the company’s lawyers wrote this year in its lawsuit against Proxycurl. “Undeterred, Defendants continue their scheme day after day, setting up new accounts faster than LinkedIn can restrict those that it detects.”
Gauging the market for such data is difficult. In a 2023 online post, Proxycurl CEO Steven Goh described its clients as venture capitalists, US banks and startups. It was a $10 million revenue business when it was forced to shut down this July, Goh claimed at the time.
“If I were sappy and any less of an optimist than I am, I’d be dismissive of the fact that LinkedIn had to resort to lawfare to shut us down,” he wrote in a post this summer. “Instead, I have now learned a lesson: if I were to build a great and big business, lawfare had better be one of the weapons in my arsenal.”
Goh — who resides in Singapore, according to the complaint — declined a request for comment.
As a career-focused social media site, LinkedIn is nearly unrivaled in scope and members. A Pew survey from 2024 found that Americans who were more educated were particularly likely to use the site. Among those polled, 53% with at least a bachelor’s degree said they used LinkedIn, compared with 10% of those with a high school degree or less. It was the largest educational gap for any social media platform cited in the report, including X, Facebook and TikTok.
ProAPIs boasted that the data it helps extract can improve market research, competitive analysis, content aggregation, real-time monitoring and AI model training.
“Feed your machine learning models with large volumes of high-quality data to enhance their performance and accuracy,” it touted on its website this fall.
Alam, of ProAPIs, did not respond to emailed requests for comment. The latest version of the company’s website now includes a Delaware address but no longer mentions prices for various plans.
Attorneys at Quinn Emanuel, the firm that has represented Proxycurl and ProAPIs in their cases, did not respond to requests for comment.
Corynne McSherry, legal director at the Electronic Frontier Foundation, a nonprofit focused on digital rights, wondered what LinkedIn’s members know about how their information is used.
“Do LinkedIn users understand how their data is getting used and what’s going to happen with it, and what protections they do and do not have under LinkedIn’s own terms of service?” she asked. “I suspect they don’t.”
As AI tools grow stronger each month, they also make it easier for scrapers to fool bot detectors into thinking they are real people, at least for a while.
“The fundamental outcome is that any sort of human detection— like making sure that the person on the other side of the screen is actually a real human— has become really challenging,” Mahari said. “You can spot they’re fake accounts based on usage patterns, but at that point, and LinkedIn says this in a complaint, you’ve already lost tons of data.”
No Clear Resolution
LinkedIn’s legal war against scrapers has a lengthy history.
In 2017, it sent a cease-and-desist letter to hiQ Labs, a data analytics company it accused of scraping and reselling its data for “business intelligence”—including, for example, marketing a product that predicted which employees would be likely to leave a company or what skills a company’s employees have.
Then, hiQ sued LinkedIn. The case dragged on for almost six years, at one point landing before the Supreme Court.
In the end, the courts found that hiQ had violated LinkedIn’s terms of service for account holders, but there was no clear resolution on what kind of scraping is illegal. By then, hiQ had gone out of business.
In a similar case, Bright Data, an Israeli company that offers web data collection services, last year beat back a lawsuit from
More recently, the cases that have hit scraping companies the hardest involve claims that they accessed information by breaching username and password walls, said Alex Reese, the chair of litigation for Farella Braun + Martel, a firm that represented hiQ at one stage against LinkedIn.
“The law is stronger for the social media companies when they can prove that has happened,” Reese said.
Scraping has become an important way to train large language models in recent years.
This leaves the act of scraping ubiquitous but not clearly legal, said Eric Goldman, a law professor at Santa Clara University.
“The cat-and-mouse game is really a microcosm of this ambiguity by the law, because it’s not clear when scrapers are allowed and when they aren’t,” said Goldman, the co-director of the university’s High Tech Law Institute. “There’s this constant vacillation between websites trying to control activity that they’re not permitted to control, and scrapers scraping data they’re not allowed to get.”
It’s also not clear, he said, if LinkedIn’s greater concern is the privacy of its users, or the business advantage it holds by amassing the data. The company does offer premium tools targeted to sales and recruiting teams.
Reese, the Farella Braun + Martel attorney, said that the law in the area is still evolving. One unresolved question for LinkedIn account holders, he said, is whether its policies apply to users only when they are logged in.
“Do the terms of service kind of tie your hands for all time?” he asked.
He wondered whether “a creative plaintiff” would come up with an antitrust or unfair competition theory about why large social media platforms shouldn’t be able to cordon off sections of the internet and, essentially, make them private.
Thanks in part to quick settlements, LinkedIn hasn’t addressed, or been forced to address, such weighty issues just yet.
“There is no finish line in this fight to protect our members’ information,” Sarah Wight, a LinkedIn vice president of legal, posted after the company sued ProAPIs in October. “We will continue to build new safeguards to stop fake accounts and prevent scraping, and we will continue to take legal action when needed.”
The case is LinkedIn Corporation v. ProAPIs Inc, N.D. Cal., No. 3:25-cv-8393.