The 9th Circuit Weighs in on the Legality of Data Mining Algorithms
Data Mining versus Hacking
Data scraping is a common programming technique. When programmers build algorithms, they want to base them on concrete data available online. Scraping is one way to get that data: a programmer writes code that visits a company’s website, searches it, and extracts specific information.
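To make the idea concrete, here is a minimal sketch of a scraper in Python, using only the standard library. The sample page, the `profile`, `name`, and `title` class names, and the `ProfileScraper` and `scrape` helpers are all hypothetical and exist only for illustration. A real scraper would first download the HTML from a public URL rather than read it from a string.

```python
from html.parser import HTMLParser

# Hypothetical sample page standing in for HTML downloaded from a public URL.
SAMPLE_PAGE = """
<html><body>
  <div class="profile"><span class="name">Ada Example</span><span class="title">Data Analyst</span></div>
  <div class="profile"><span class="name">Sam Sample</span><span class="title">Engineer</span></div>
</body></html>
"""


class ProfileScraper(HTMLParser):
    """Collects the text of <span class="name"> and <span class="title"> elements."""

    FIELDS = ("name", "title")  # assumed class names, for illustration only

    def __init__(self):
        super().__init__()
        self.records = []   # one dict per profile block found
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "profile":
            # A new profile block starts: open a fresh record.
            self.records.append({})
        elif tag == "span" and css_class in self.FIELDS and self.records:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the fields we care about.
        if self._field is not None:
            record = self.records[-1]
            record[self._field] = record.get(self._field, "") + data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape(html):
    """Search the page for profile blocks and return the extracted records."""
    parser = ProfileScraper()
    parser.feed(html)
    parser.close()
    return parser.records


records = scrape(SAMPLE_PAGE)
for record in records:
    print(record["name"], "-", record["title"])
```

The script only reads markup that any visitor's browser would receive. It does not log in, guess passwords, or bypass any protection, which is the distinction the rest of this article turns on.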
While data scraping may sound like hacking, the two are different. Hacking deploys code meant to break through security barriers such as firewalls and reach information that is meant to be private. It is often done maliciously, for example to obtain sensitive information and hold it for ransom; attacks of that kind are called ransomware attacks. If an outside individual wrote code that compromised the digital protections around information, accessed the protected information, and then held it for ransom, that would be a hack.
Data scraping differs from hacking in two main ways. First, its goal typically isn’t malicious; it is a way to improve algorithms. In the information age, it’s important to quantify data, and algorithms fed with scraped data can analyze complex phenomena and offer solutions to nuanced problems. Second, the technique is different. Data scraping, as discussed here, only views publicly accessible information on company websites and does not break through any protection to reach it.
The Conundrum for Business Owners
Could a business nonetheless stop data scraping algorithms from visiting its website and gathering public information, as a matter of privacy? Divulging public information isn’t necessarily harmful, but what if a complex algorithm can take large amounts of innocuous data and paint a picture that isn’t desirable, or reveal a truth that you’d rather keep hidden? This is the issue that LinkedIn recently faced when hiQ Labs, Inc., a data science company, deployed data scraping algorithms on LinkedIn’s website to gather more data about employment.
LinkedIn, Data Science, the 9th Circuit, and the Computer Fraud and Abuse Act
The dispute began when LinkedIn sent hiQ Labs a cease-and-desist letter asserting that the data scraping algorithms violated the Computer Fraud and Abuse Act (CFAA), which has operated as the main federal standard for scrutinizing hacking attempts. hiQ responded by suing LinkedIn to preserve its access. LinkedIn’s position was that once it had told hiQ to stop, any further scraping of LinkedIn’s site was access “without authorization” under the Act.
The case made its way to the 9th Circuit. Drawing on the United States Supreme Court’s reasoning in Van Buren v. United States (2021), which interpreted the Act’s “exceeds authorized access” language, the 9th Circuit determined that the CFAA likely does not prohibit data scraping algorithms from obtaining publicly available information.
Conclusion
As a business, you can rely on the Computer Fraud and Abuse Act for protection against hackers who access private information without authorization. But, at least for now and at least in the 9th Circuit, using data scraping algorithms to access public information does not by itself violate the Act.