The Digital News Publishers Association (DNPA) has written to and made representations before the ministries of electronics and information technology and of information and broadcasting, seeking protection from likely copyright violations by AI models.
“Now that we know the positive opportunity and impact of generative AI and its implications on content creators and publishers…there is an opportunity to ensure that any company or LLM (large language model) uses data in a fair and transparent way while compensating the sources from where it takes the content or data to train its models,” Sujata Gupta, secretary general of DNPA, told ET.
This comes even as the New York Times and some Pulitzer-winning authors in the US have sued ChatGPT-maker OpenAI and Microsoft for copyright infringement for training AI on their proprietary works.
Also, Indian gen AI models such as Ola founder Bhavish Aggarwal-backed Krutrim AI and Sarvam AI’s OpenHathi have been launched in recent months.
DNPA, which represents 17 top media publishers in the country including Times Group that publishes ET, is seeking amendments to the IT Rules until the Digital India Act – which is expected to replace the over 24-year-old IT Act, 2000, and regulate artificial intelligence – comes into force.
The association has been in discussions with the ministries for the past three months to highlight industry concerns, Gupta said.
The advent of Indian LLMs alongside the global ones raises challenges and concerns here similar to those the New York Times has raised, as it could affect publishers’ business models, she said.
What Legal Experts Say
The concept of ‘fair dealing’ in Indian copyright law, which is narrower than ‘fair use’ in the US, may not be an adequate defence for gen AI companies here if they derive commercial gains from their LLMs, legal experts said.
“With the increase in use of AI and the large amount of data required to train an AI, the question of copyright infringement is bound to come up for decision before an Indian Court as well,” said JV Abhay, partner at law firm Shardul Amarchand Mangaldas & Co.
He noted that given that the Copyright Act, 1957 does not discuss the concept of training of AI explicitly, the present legislation may not be sufficient to deal with this issue.
The courts may need to reevaluate the concept of ‘originality’ in the context of AI-generated works, Abhay said.
Swati Sharma, partner and head – intellectual property at law firm Cyril Amarchand Mangaldas, said AI companies should take a conscientious, ethical approach to copyrightable material at the data feeding stage of LLM training and give due credit to the original authors.
They can protect themselves from liability by being more careful about the data used and incorporating copyright notices and disclaimers, she said.
“Whether or not authors can enjoy royalties from what the AI companies earn by virtue of commercialisation of the AI is yet to be decided by courts in India,” Sharma said.
Abhay said AI companies could safeguard themselves if they are able to demonstrate originality in the datasets created for training the AI, or by separating the ‘creative expression’ from the copyrighted works and relying solely on the non-copyrightable ideas and information in the works.
Experts said copyright contracts could get more complex as owners may want to explicitly include provisions on whether their works can be used for AI model training and around royalties and license fees.
NYT Lawsuit Against OpenAI
In the US, the New York Times, in its December 27 lawsuit filed in federal district court in Manhattan, said millions of its articles were used illegally to train Microsoft-backed OpenAI’s AI chatbots that now compete with the news outlet as a source of reliable information.
Though it did not seek monetary compensation, NYT said in the lawsuit that the companies should be held accountable for “billions of dollars in statutory and actual damages” for unlawful copying of its uniquely valuable works.
It demanded that the companies destroy any chatbot models and training data that made use of its copyrighted materials. It added that it had approached OpenAI with its concerns in April to seek a commercial agreement or the incorporation of ‘technological guardrails’ in gen AI products. But these efforts did not come to fruition.
OpenAI in a January 8 blog post said the NYT lawsuit is “without merit” and that it supports journalism with the “transformative potential” of AI. It argued that training AI models using publicly available internet materials is “fair use” under US copyright law.
The ChatGPT-maker had in November said it would cover its customers’ legal costs were they to face any copyright infringement lawsuits.