Recent data scraping incidents at Facebook Inc. and LinkedIn Corp. highlight an ongoing debate over whether companies can invoke an anti-hacking law to restrict rivals or other actors from harvesting information from people’s online profiles.
The issue could reach the U.S. Supreme Court in a data-scraping dispute between LinkedIn and workforce analytics startup hiQ Labs Inc. The court is being asked to review hiQ’s ability to gather user information from the job search site, testing the applicability of the Computer Fraud and Abuse Act to data that’s publicly available online.
It’s unclear whether the practice is illegal under the decades-old law, which has since been applied to scenarios beyond hacking. Regulators in the U.S. and Europe are also grappling with how to hold companies accountable for protecting personal information from being scraped and then leaked.
The LinkedIn-hiQ case gets at a tension for privacy in the digital age, as more widely shared consumer data becomes more valuable to advertisers or analysts like hiQ, according to Thomas Kadri, a law professor at the University of Georgia who researches cybercrime and privacy.
“We’ve in many ways thought of privacy as keeping things hidden,” Kadri said. “But it’s not really about secrecy because a lot of the data that has value in the digital age is publicly accessible.”
Data scraping involves gathering information in an automated way on a mass scale. The same technology that allows for personal profiles to be scooped up also underlies Google searches, airline ticket trackers, and other kinds of research.
Scraping can lead to a leak of information when the collected data is put up for sale or otherwise posted online. Facebook and LinkedIn face probes from data protection authorities in the European Union after recent reports that data on about 500 million users, scraped from each company’s platform, had been leaked.
The EU’s General Data Protection Regulation sets obligations for companies like Facebook and LinkedIn that handle personal information and allows for hefty fines for violations. Web scrapers that collect Europeans’ personal data are also considered subject to the regime.
In the U.S., the Facebook leak could also bring scrutiny from the Federal Trade Commission.
Companies like LinkedIn that have been targeted by scraping have gone to court to stop it, but it remains to be seen whether they can pursue scrapers under the Computer Fraud and Abuse Act or through other legal avenues like breach of contract. The Supreme Court hasn’t yet decided whether to take up the LinkedIn-hiQ case. LinkedIn petitioned for the justices’ review in March 2020.
“It’s not at all clear what is or isn’t legal in the world of data scraping,” said Alan Butler, executive director of the nonprofit Electronic Privacy Information Center.
LinkedIn is seeking to overturn a ruling by the U.S. Court of Appeals for the Ninth Circuit that found the anti-hacking law can’t be used against scraping of publicly available data.
The ruling clashed with earlier court decisions that were more receptive to company attempts to block scrapers, including one involving Southwest Airlines Co. and Roundpipe LLC, a company that used scraping to notify consumers if airline ticket prices dropped after they made a purchase. Roundpipe was barred from scraping airfares from Southwest’s site as part of an agreement to settle the suit.
Legal thinking is evolving as tech companies have become more dependent on their control over data, according to Kieran McCarthy, who represents startups, tech companies, and other businesses as founder of McCarthy Garber Law. At issue in the LinkedIn case is an industry incumbent’s ability to ward off a would-be rival, McCarthy said.
“Lowering access barriers to big data resources for non-incumbents—that’s a lot of what the hiQ case is about,” he said.
Facebook recently sued BrandTotal Ltd. and Unimania Inc. for violating its terms of service by using browser extensions to collect data on social media users in order to sell so-called “marketing intelligence.” The social media platform has brought similar legal action against other scrapers, including a network of clone sites.
In February 2020, Facebook also sent a cease-and-desist letter to Clearview AI Inc., a company that developed facial recognition technology using billions of photos scraped from social media.
Facebook, which uses technology to detect unauthorized scraping, says its work in this area is part of efforts to protect people’s privacy. LinkedIn has similarly argued in its petition to the Supreme Court that scraping violates its pledge to protect user privacy.
“That’s one of the points LinkedIn is making in its case,” said Jeffrey Neuburger, a partner at Proskauer Rose LLP who focuses on technology, media, and intellectual property. “If scraping isn’t stopped, privacy isn’t protected.”
Neuburger added that the problem with this kind of privacy argument is that hiQ is said to have scraped data that LinkedIn users made public. HiQ also claims that LinkedIn sought to block the startup’s data access around the same time LinkedIn announced a similar analytics service offering recruiting and retention insights.
HiQ’s allegation is “untrue,” according to a LinkedIn spokeswoman.
“This case is about protecting our members’ ability to control the information they make available on LinkedIn,” the spokeswoman said. “When third parties scrape personal data without member knowledge or consent and use it for purposes our members didn’t agree to, we take action.”
HiQ didn’t respond to a request for comment.
Academics and journalists have likewise used scraping to study trends and patterns on Facebook. New York University researchers say the company pressured them to shut down a tool that scrapes data about political ad targeting on its platform, on the grounds that it violates Facebook’s terms.
First Amendment lawyers have urged Facebook to provide a so-called safe harbor for such research. The safe harbor, proposed by the Knight First Amendment Institute at Columbia University, would shield researchers and journalists from the threat of legal action, as long as their work is in the public interest and user privacy is protected.
“We believe strongly that Facebook shouldn’t be the gatekeeper of information about it,” said Ramya Krishnan, a staff attorney at the Knight Institute and a lecturer at Columbia Law School.
A Facebook spokesman didn’t comment on the Knight Institute’s proposal. Earlier this year, the platform shared information with researchers on how political ads are targeted.
“We’re committed to providing privacy-safe ways for researchers, academics and journalists to conduct research on our platform,” the Facebook spokesman said.