When it comes to “web scraping”, in which individuals and companies ‘scrape’ other people’s content from the web to reuse for their own purposes, the real estate industry is the biggest victim, according to new research from Distil Networks.
The report found that agents’ listing data is the number one item on the web that’s regularly scraped, and warned of the risks associated with the practice.
“Not only does web scraping pose a critical challenge to company branding, it can also threaten sales and conversions, lower SEO rankings, or undermine the integrity of content that took considerable time and resources to produce,” the report stated.
Real estate was ranked as the number one web scraping victim in 2015, followed by digital publishing, e-commerce, directories and classified listings. Real estate accounted for almost 32 percent of all web content being scraped.
“Many of these industries are being targeted by an influx in startups that are scraping information from industry leaders in order to compete,” the report continued.
The report also looked at why companies scrape content from other sites, finding that 38 percent of those that do so want content they can use for research, price comparison and weather data monitoring.
The report also found that an entire industry has emerged around web scraping, with companies offering scraping services for as little as $3.33 an hour and the average project costing just $135. Despite the apparent affordability of such “services”, the average web scraper makes a very good living, earning approximately $58,000 per year.