Intellectual Property and Personal Data in AI Datasets Under India's DPDP Act 2023
DOI:
https://doi.org/10.69971/tipr.4.2.2026.109
Keywords:
AI training data, DPDPA compliance, copyright fair dealing, trade secrets, DPIIT framework, data ownership India
Abstract
The rapid expansion of generative AI has strained India’s fragmented legal regime governing AI training data, which spans the Copyright Act, 1957, trade secret protection, and the Digital Personal Data Protection Act, 2023 (DPDPA). This study examines the doctrinal incompatibility between India’s purpose-specific fair dealing framework under Section 52 and the industrial-scale reproduction intrinsic to AI training, which fails the jurisdictional “purpose test” articulated in Super Cassettes Industries Ltd v. Hamar Television Network (2011). It exposes the structural inadequacy of trade secret law in protecting the “composited value” of large-scale aggregated datasets, which lack the identifiability and durability required for conventional protection. It further argues that the DPDPA’s consent-centric architecture is functionally unworkable for billion-token training corpora characterized by attenuated data-principal relationships. The study maps concrete doctrinal fault lines, including uncertainty surrounding “reproduction in material form” under Section 14(a)(i), transparency-trade secret conflicts identified in the DPIIT Working Paper on Generative AI and Copyright, and cross-border transfer constraints under Section 17 of the DPDPA. Unless India adopts an integrated statutory framework for permissible training practices and rights allocation, legal uncertainty will undermine both AI innovation and stakeholder protection.
References
Anonymous. 2021. Invoking trade secrets to block a request to access personal data under the GDPR: A “Threat” Has to Be Clearly Demonstrated. https://www.crowelltradesecretstrends.com/2021/04/invoking-trade-secrets-to-block-a-request-to-access-personal-data-under-the-gdpr/
Anonymous. 2023. AI and the Law of Copyright in India. https://spiceroutelegal.com/publications/ai-and-the-law-of-copyright/3/
Anonymous. 2024. Exploring the DPIIT's Working Paper on Generative AI and Copyright. https://www.ikigailaw.com/article/655/exploring-the-dpiits-working-paper-on-generative-ai-and-copyrigh
Anonymous. 2025. How Will the DPDPA Impact AI? https://www.dpdpconsultants.com/blog.php?id=38&title=how-will-the-dpdpa-impact-ai
Buick, Adam. 2025. Copyright and AI Training Data—Transparency to the Rescue? Journal of Intellectual Property Law & Practice 20: 182-192. https://doi.org/10.1093/jiplp/jpae102
Chimni, Arzu, and Vrinda Patodia. 2024. The DPIIT Working Paper on AI and Copyright: Regulatory signals and practical implications. https://www.obhanandassociates.com/blog/the-dpiit-working-paper-on-ai-and-copyright-regulatory-signals-and-practical-implications/
Gupta, Gaurav, and Nancy Roy. 2025. Text and data mining vs India's Digital Personal Data Protection Act, 2023: a critical study of the legal landscape. https://bridgecounsels.com/text-and-data-mining-vs-indias-digital-personal-data-protection-act-2023-a-critical-study-of-the-lega/
Sarthak, K. 2025. Copyright protection in LLM AI training – Part 2. https://www.khuranaandkhurana.com/2025/01/27/copyright-protection-in-llm-ai-training-part-2/
Kaplan, Jared, Samuel McCandlish, et al. 2020. Scaling laws for neural language models. arXiv preprint arXiv:2001.08361. https://arxiv.org/abs/2001.08361
Kemp, Richard. 2020. Algo IP: Intellectual property in AI datasets, insights and outputs – the growing importance of trade secrets. Kemp IT Law. https://kempitlaw.com/insights/algo-ip-intellectual-property-in-ai-datasets-insights-and-outputs-the-growing-importance-of-trade
Krimmelbein, Fred. 2024. Data ownership in the age of AI: the impact of data governance. https://labs.sogeti.com/data-ownership-in-the-age-of-ai-the-impact-of-data-governance/
Kupferschmid, Keith. 2024. Requiring AI transparency won't destroy the trade secrets of AI Companies. https://copyrightalliance.org/ai-transparency/
Latham and Watkins. 2023. India's digital personal data protection Act 2023 vs. the GDPR: a comparison. https://www.lw.com/admin/upload/SiteAttachments/Indias-Digital-Personal-Data-Protection-Act-2023-vs-the-GDPR-A-Comparison.pdf
Liu, Shuimei, and L. Raymond Guo. 2024. Data Ownership in the AI-Powered Integrative Health Care Landscape. JMIR Medical Informatics 12. https://doi.org/10.2196/57754
Mathur, Arnav, and Ananya Popli. 2024. Trade, privacy and DPDPA: crafting India's response to the privacy-trade dilemma. https://nliulawreview.nliu.ac.in/blog/trade-privacy-and-dpdpa-crafting-indias-response-to-the-privacy-trade-dilemma/
Organization for Economic Co-operation and Development. 2025. Intellectual property issues in artificial intelligence trained on scraped data. https://www.oecd.org/en/publications/intellectual-property-issues-in-artificial-intelligence-trained-on-scraped-data_d5241a23-en.html
Priyadarshi, Shubhi. 2026. Fair dealing cannot be presumed: why AI training fails the purpose test under Indian copyright law. https://www.barandbench.com/columns/fair-dealing-cannot-be-presumed-why-ai-training-fails-the-purpose-test-under-indian-copyright-law
Sengupta, Pamela. 2024. Data ownership and privacy in the age of generative AI. https://www.ve3.global/data-ownership-and-privacy-in-the-age-of-generative-ai/
Shekar, Shaurya. 2025. Training AI, testing law: India's copyright challenge with TDM. https://lawschoolpolicyreview.com/2025/08/08/training-ai-testing-law-indias-copyright-challenge-with-tdm/
Singh, Satyam. 2025. Double-edged data: the trade secret dilemma in India's DPDP act. https://www.mondaq.com/india/trade-secrets/1643096/double-edged-data-the-trade-secret-dilemma-in-indias-dpdp-act
Ugochukwu, Albert I., and Peter W. B. Phillips. 2024. Open data ownership and sharing: challenges and opportunities for application of FAIR principles and a checklist for data managers. Journal of Agriculture and Food Research 16: 1-9. https://doi.org/10.1016/j.jafr.2024.101157
Wolff, Yves-Alexander. 2025. AI training data sets as intellectual property? protectability and protection gaps of training data and data sets. https://buse.de/en/blog-en/technology/ai-training-data/
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.