WASHINGTON — The New York Times sued ChatGPT-maker OpenAI and Microsoft in a US court on recently, alleging that the companies' powerful AI models used millions of articles for training without permission.
Through their AI chatbots, the companies "seek to free-ride on The Times' massive investment in its journalism by using it to build substitutive products without permission or payment", the lawsuit said.
Copyright is becoming a major battleground for the much-hyped generative AI sector, with publishers, musicians and artists increasingly lawyering up to get paid for technology that is being built with their content.
With the suit, The New York Times also chose the more confrontational response to the sudden rise of AI chatbots, in contrast to other media groups such as Germany's Axel Springer or the Associated Press that have entered content deals with OpenAI.
"If The Times and other news organisations cannot produce and protect their independent journalism, there will be a vacuum that no computer or artificial intelligence can fill," said the Times' complaint.
"Less journalism will be produced, and the cost to society will be enormous."
The Times, one of the most respected news organisations in the United States, is seeking damages, as well as an order that the companies stop using its content for the training of AI models — and destroy data already harvested.
While no sum is specifically requested, the Times alleges that the infringement could have cost "billions of dollars in statutory and actual damages".
Microsoft, the world's second biggest company by market capitalisation, is a major investor in OpenAI, and swiftly implemented the powers of AI in its own products after the release of ChatGPT last year.
The AI models that power ChatGPT and Microsoft's Copilot (formerly Bing) were trained for years on content available on the Internet, under the assumption that it was fair to be used without compensation.
But the lawsuit, filed in a federal court in New York, argued that the use of the Times' work was unlawful notably because the new products created a potential rival to news publishers.
OpenAI said it was "surprised and disappointed" by the lawsuit given that it was in talks with the Times over the issue that were "moving forward constructively".
"We're hopeful that we will find a mutually beneficial way to work together, as we are doing with many other publishers," an OpenAI spokesperson added.
Not 'transformative'
But the Times said that in its attempts to seal a content deal with OpenAI, it was told that the technology was "transformative" and therefore did not require a commercial arrangement.
"There is nothing 'transformative' about using The Times' content without payment to create products that substitute for The Times and steal audiences away from it," the lawsuit alleged.
The lawsuit also said that content generated by ChatGPT and Copilot closely mimicked New York Times style, at times falsely citing the paper as a source, and that its output was given a privileged status by OpenAI because of its reliability.
The emerging AI giants are facing a wave of lawsuits over their use of internet content to build their AI systems that create content on simple prompts.
Last year, "Game of Thrones" author George RR Martin and other best-selling fiction writers filed a class-action lawsuit against OpenAI, accusing the startup of violating their copyrights to fuel ChatGPT.
Universal and other music publishers have sued AI company Anthropic in a US court for using copyrighted lyrics to train its systems and in generating answers to user queries.
US photo distributor Getty Images has accused Stability AI of profiting from its pictures and those of its partners in order to make visual AI that creates original images.
Hundreds of news publishers meanwhile have used programming code to block OpenAI, Google and others from scanning their websites for training data.
With lawsuits piling up, Microsoft and AI player Google have announced they would cover the legal fees of corporate customers sued for copyright infringement over content generated by their AI.