NEWS

Big Tech companies score early wins in court battles over the use of copyrighted content to train their AI

JOSÉ M. RODRÍGUEZ SILVA

Updated 06/28/2025 - 06:24 ET

Anthropic and Meta have prevailed over several writers in US court cases that could transform the industry

This week, Meta and Anthropic secured two victories that could prove very significant for the future of artificial intelligence. Both tech companies prevailed in separate lawsuits filed by groups of authors seeking compensation for the use of their works to train the companies' AI language models.

The rulings matter because of what they could mean for copyrighted content, which has been used systematically, without permission or compensation, to make the generative AI boom possible.

The Meta case is significant because leaked company emails showed that it had downloaded thousands of books via BitTorrent to train its Llama models. In this instance, the San Francisco court handling the case was examining a related issue: the company's use of LibGen, a search engine that provides access to copyrighted books and scientific articles. Among the plaintiffs was the writer Ta-Nehisi Coates, author of Between the World and Me.

The judge, however, was not categorical and indicated that the ruling stems in part from the plaintiffs' failure to present their case well, as reported by the Financial Times.

Before that, also in San Francisco, Anthropic won a more significant legal victory in a federal court. The tech company did not admit to the kind of piracy that Meta engaged in. Instead, it argued that, because it had purchased the books and scanned them for later use, its use of the content qualified as fair use.

Fair use is the defense tech companies are relying on in the dozens of lawsuits pending against them throughout the United States, with the dispute between The New York Times and OpenAI being the most high-profile case. It is the same argument that applies, for example, when a journalist or author quotes a passage from another book to illustrate a point of view.

Since entering the market, tech companies have somewhat moderated their tactics for using third-party content (in part because almost the entire web has already been used to train AI models) and have opted, for example, to sign agreements with major media outlets for access to their archives and content.

Another approach, taken by social media companies such as Meta and X, is to notify users that their posts will be used as a new data source for future AI systems. Toward the end of last year, YouTube introduced an option that lets creators allow or disallow AI companies' use of their videos for these purposes.

In Spain, the Ministry of Culture tried to introduce a third path, known as extended collective licenses. These would have allowed copyright management societies to authorize platforms to use the material they manage in exchange for payment to those organizations, without having to seek each author's approval. The project was launched last year but has stalled because of opposition from certain sectors of the cultural industry.