The global legal battle over training large language models ("LLMs") on copyrighted content recently reached India, with Asian News International ("ANI"), an Indian news agency, filing a copyright infringement suit in the Delhi High Court (the "Court") against Open AI, an artificial intelligence ("AI") research organization, alleging unauthorized use of its publicly available copyrighted content to train Open AI's chatbots. ANI has sought INR 20 million (approximately US$236,053) in damages and an interim injunction to restrain Open AI from storing, publishing, reproducing, or using ANI's copyrighted works.
This article discusses the outlook for the case and examines the issue of copyright infringement by generative AI in India.
Background
ANI has alleged that Open AI has: (i) used ANI's publicly available copyrighted content without authorization; (ii) generated responses that are verbatim or substantially the same as ANI's copyrighted content; and (iii) generated hallucinated responses that falsely attribute fabricated interviews or news stories to ANI. In response, Open AI has contended that copyright law protects only the expression of ideas, and not facts or the ideas themselves. Open AI has also questioned the jurisdiction of the Indian courts, as its servers are located outside India and no training of LLMs has taken place in India. Further, Open AI has informed the Court that it has blocked ANI's domain and is no longer using ANI's material to train its LLMs. The Court has refused the interim injunction sought by ANI but has directed that summons be issued to Open AI to appear and present its case at the next hearing, scheduled for January 28, 2025.
While Open AI has based its challenge on jurisdictional grounds, the core legal issue for the Court to decide, and for all technology and AI lawyers to keep track of, will be whether the unauthorized use of copyrighted content by AI companies to train their LLMs can be construed as "fair dealing" and, therefore, be exempt from infringement claims. The concept of fair dealing is similar to the "fair use" doctrine in the US. In cases filed against Open AI in the US, Open AI has raised the defenses of "fair use" and "transformative use," contending that AI outputs are not exact reproductions of original works and often add new meaning or expression to them.
Our comments
The issue of copyright infringement in training LLMs is being contested in jurisdictions worldwide. Open AI has adopted a dual strategy: while maintaining that its use of copyrighted materials for training purposes falls within the "fair use" exception, it has simultaneously started entering into license agreements with prominent news publishers to gain access to their content. Separately and notably, Open AI has introduced an opt-out mechanism under which a publisher can fill out a form to opt out, following which Open AI will stop using that publisher's material to train its chatbot.
From an Indian standpoint, where the LLM industry is still nascent, Indian courts will have to carefully balance innovation against copyright protection. If the Court rules in favor of Open AI, news publishers in India will face challenges in the long run and risk having user traffic diverted from their news portals to Open AI. On the flip side, if the Court holds that ANI's copyright has been infringed, the ruling could pose a problem for the budding Indian AI industry, which is trying hard to compete with well-funded giants like Open AI that have already accessed and trained their LLMs on swathes of information. Smaller AI developers would then need to negotiate licensing agreements with numerous copyright holders, which can be a very costly exercise.