Web-Scraping and Data Mining Counseling
Web scraping is the automated process of obtaining information from a website, computer, or server, culling through it to find relevant information, and copying that information while dumping the rest. It is a method that is used by thousands of companies with bots that crawl websites looking for all types of information.
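As a rough illustration of the "cull and copy" step described above, the following Python sketch keeps only the link targets from a page and discards everything else. The sample page, the class name LinkExtractor, and the function name extract_links are hypothetical and chosen only for this example. The sketch works on a hard-coded string and deliberately omits the step of fetching pages from a live site.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags and ignores all other content."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags are relevant here; every other tag is skipped.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the link targets found in an HTML string, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content, used only for illustration.
SAMPLE_PAGE = """
<html><body>
<h1>Example listing</h1>
<p>Some text that is not needed.</p>
<a href="/item/1">First item</a>
<a href="/item/2">Second item</a>
</body></html>
"""

if __name__ == "__main__":
    print(extract_links(SAMPLE_PAGE))
```

Even in a toy example like this, the scraper copies part of the page's content, which is the kind of fact the legal analysis has to account for.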
Web-scraping is not, in and of itself, legal or illegal. Because web-scraping can involve many kinds of actions, performed in different ways and for different reasons, it is impossible to say simply that it is either allowed or prohibited. Instead, each of the constituent activities that make up web-scraping raises its own legal issues, and those issues must be addressed one by one to determine whether the activity as a whole is legal.
Web-scraping typically involves copyright, trademark, trespass, breach of contract, and Digital Millennium Copyright Act (DMCA) issues. It can involve CAN-SPAM Act liability, Computer Fraud and Abuse Act (CFAA) problems, patent rights, and state law claims such as tortious interference or privacy-rights violations.
Copyright issues frequently arise because scraping generally involves some copying, reproducing, or displaying of original information. The penalties for copyright infringement can be stiff, and liability can extend from the individual who is directly infringing the copyright to the person who supplies products or services that make infringement possible. Vicarious and contributory theories of copyright infringement can impose liability on an entity even when that entity has not directly infringed. Trademark issues can arise when information is framed within a web page, presented with original logos, or used with old logos. Trespass and CFAA issues arise where the scraping generates a large or cumulative volume of traffic to a site, or where it consumes a great deal of bandwidth.
Tom Galvani performs web-scraping analyses for clients with the goal of rendering an opinion on whether the activity is allowed, prohibited, or in a gray area. He works with clients to develop solutions so that all laws are carefully followed, exposure to liability is minimized, and the rights of others are preserved.
The analysis is typically extensive, beginning with an investigation into how the scraping activity works and what purpose it serves. A detailed understanding of how the software or bot operates is essential to a rigorous and accurate analysis. The legal issues involved in web-scraping are extremely fact-dependent, and the outcome can turn on minor differences. Contact Tom Galvani at 602-281-6481 to learn how to begin the process of determining whether a specific type of web-scraping activity risks liability.