Common Crawl has ceased crawling Danish news media’s websites

Common Crawl has ceased crawling Danish news media’s websites, thanks to the Danish Rights Alliance and its close co-operation with Danske Medier.

Common Crawl had been scraping the websites without consent or compensation. This has now ended, and the news media have gained some control over their own content. There is still a lot of work to be done, as datasets derived from Common Crawl’s web scrapes have been used by Google, Meta, and OpenAI.

Big tech must stop taking news media content via a back door. Big tech must use the front door and negotiate directly with content creators and collective management organisations such as DPCMO. There is no excuse for their mass exploitation.

It’s time for a transparent and fair approach, where big tech companies engage directly with creators and negotiate terms that are mutually beneficial, instead of taking advantage of legal loopholes or using content without proper agreements.

It is cool to respect intellectual property rights, even in the AI age, especially if your corporate mission is to benefit all of humanity.