Gathering web data at scale is currently at the centre of several legal cases, with ongoing lawsuits against Google, Midjourney, OpenAI, and other tech giants. These legal battles have led people to question the legal status of web scraping and have strengthened misconceptions surrounding this relatively new industry.
According to Oxylabs, this negative attention now risks overshadowing the benefits web scraping can bring to organisations and society at large.
Denas Grybauskas, Head of Legal at Oxylabs, commented: “Many have been quick to pounce on the negativity surrounding web data collection, clouding the good examples of its use. Gathering public web intelligence can benefit many projects, including investigative journalism and scientific research. For example, public data from social media sites and forums has been widely used in different sociology and psychology projects and even helped to predict COVID-19 outbreaks.”
“Web intelligence is used by travel fare aggregators and price comparison sites that help millions of people make better-informed decisions when shopping online. Web scraping is also vital for cybersecurity companies that monitor the activities of cybercriminals. It wouldn’t be an overstatement to say that without web intelligence, a lot of use cases we rely on daily would be impossible. However, as AI technology continues to evolve, consuming an ever-growing amount of public data, raising awareness about ethical web scraping has become especially important.”
To combat illegal data gathering, promote common standards, and share know-how about ethical practices, leading web intelligence organizations formed the Ethical Web Data Collection Initiative. The consortium aims to build trust around web data collection and educate industry players and the general public about its possibilities. Additionally, Oxylabs is spreading its expertise and ethical practices through such pro bono initiatives as Project 4β, which specifically targets universities and NGOs.
“Through 4β, we aim to transfer technological knowledge and support scientific research on big data”, Grybauskas added. “For example, we partnered with The Communications Regulatory Authority of the Republic of Lithuania to battle against child endangerment by deploying web scraping technology and AI-driven recognition tools that can detect harmful digital content units.”
According to Grybauskas, web scraping is a young industry, so it naturally involves legal grey areas and can be tricky to navigate. Because of this complexity, it is often portrayed unfairly, with the many benefits it brings overlooked.
“The most frequent mistake people make when scraping is failing to evaluate the nature of the data they plan to fetch and to adhere to the terms set by the website. Legitimate scrapers focus on collecting public data that is open to everyone. However, even publicly available data can sometimes contain personal information or content subject to copyright laws. It is vital to encourage anyone gathering web data to consult legal practitioners before scraping.
“On the other hand, ongoing legal cases may bring more clarity to different aspects of online data gathering at scale, which would be beneficial not only to data-as-a-service companies and web intelligence providers but also to further AI research and development.”
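One concrete ethical practice implied above is checking a site's published terms before fetching anything. A minimal sketch of that idea, using Python's standard-library `urllib.robotparser` to test URLs against a site's robots.txt rules (the rules and the "oxybot" user-agent string here are illustrative, not from the article):

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "oxybot") -> bool:
    # Parse the site's robots.txt rules and check whether the
    # given URL may be fetched by this (hypothetical) user agent.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Example rules: everything under /private/ is off-limits to all agents.
rules = """\
User-agent: *
Disallow: /private/
"""

print(allowed_by_robots(rules, "https://example.com/products"))   # True
print(allowed_by_robots(rules, "https://example.com/private/x"))  # False
```

In practice a scraper would fetch the live robots.txt with `RobotFileParser.set_url()` and `read()`; robots.txt is only one signal, and it does not replace reviewing a site's terms of service or the legal status of the data itself.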