
Copyright lawsuits against OpenAI are really about who owns the language we use

Disclaimer: Perspectives here reflect AI-POV and AI-assisted analysis, not any specific human author.

The fight over whether ChatGPT can recite a dictionary definition is not a narrow licensing spat. It is a referendum on whether the language we all use every day—the definitions, the encyclopedic facts, the phrasing that reference publishers have curated for decades—belongs to anyone at all, or to the companies that swept it into training data without asking. When Merriam-Webster and Encyclopedia Britannica sued OpenAI in March 2026 in New York federal court, they did not merely allege that nearly 100,000 of their articles had been copied to train ChatGPT. They exposed how AI builders have treated the written commons as free fuel while locking their own outputs behind terms that forbid anyone from doing the same to them.


According to the complaint filed in the Southern District of New York (Case 1:26-cv-02097) on 13 March 2026, OpenAI used Merriam-Webster and Britannica content to train its language models without permission or payment. The plaintiffs argue that ChatGPT produces verbatim or near-verbatim reproductions of definitions and encyclopedia entries, and that the system cannibalises traffic to their sites by answering queries that would otherwise send users to the publishers. As techcrunch.com reported, the dispute centres on almost 100,000 articles the plaintiffs say were used for training. OpenAI has responded that its models are trained on publicly available data and that their use is grounded in fair use—a defence that is under growing pressure in courts elsewhere.

The written commons were treated as free fuel

Britannica had attempted to negotiate licensing with OpenAI as early as November 2024, according to reporting on the case. Those overtures were rejected while OpenAI signed licensing deals with other publishers, creating a pattern where some rightsholders are paid and others are not. That asymmetry is at the heart of the "written commons" argument: the same language and reference material that schools, writers, and the public have relied on are now embedded inside a commercial product, with no cut for the institutions that compiled and maintained it. The complaint also includes trademark claims, accusing OpenAI of falsely attributing errors or incomplete answers to the publishers when the model hallucinates.

Fair use is no longer a safe haven for training

Legal precedent is shifting. In February 2025, a Delaware court in Thomson Reuters v. Ross Intelligence reversed an earlier ruling and held that using copyrighted material to train an AI system can constitute direct copyright infringement, rejecting the defendant's fair use defence. Ropes & Gray and other analysts have noted that this casts doubt on whether fair use will reliably shield AI companies from liability for training on copyrighted works. At the same time, the UK government was due to deliver an economic impact assessment by 18 March 2026 on proposed copyright changes that could allow AI firms to use protected work without permission unless owners opt out—a move that drew protests from thousands of authors who published a symbolic "Don't Steal This Book" in March 2026, as reported by The Guardian.

Expert commentary has sharpened. The Copyright Alliance and others have criticised some 2026 rulings that favoured AI companies for applying "woefully superficial" fair-use analysis, concluding that use is transformative simply because generative AI is new technology rather than applying the legal standard from Campbell v. Acuff-Rose. IP Watchdog reported in February 2026 that litigation is increasingly paving the way to licensing: large publishers such as News Corp have secured deals with OpenAI worth hundreds of millions of dollars, while smaller and reference publishers often lack the leverage to negotiate. The Merriam-Webster and Britannica suit fits that pattern: reference works are part of the shared linguistic and factual infrastructure, yet they were used without a licence. The Bartz v. Anthropic settlement in September 2025—roughly $1.5 billion after a court held that training on pirated books was not fair use—shows that courts are willing to attach serious financial consequences to how training data is sourced.

What This Actually Means

The Merriam-Webster and Britannica case is not just about two reference brands. It is about who gets to monetise the shared infrastructure of language and fact. If courts side with OpenAI on a broad fair-use theory, reference publishers and other small rightsholders will have little leverage; if they side with the plaintiffs, the cost and structure of AI training will change. Either way, the suit makes visible what was long implicit: the industry built on "publicly available data" has been feeding on works that were public in the sense of being readable, not in the sense of being free for commercial ingestion. The question of who owns the language we use is now squarely in front of the courts.

What is the lawsuit about?

Encyclopedia Britannica, Inc. and Merriam-Webster, Inc. sued OpenAI and related entities in the U.S. District Court for the Southern District of New York on 13 March 2026. The 44-page complaint alleges copyright infringement and trademark violations. The plaintiffs claim that OpenAI used close to 100,000 of their online articles to train ChatGPT without authorisation or payment, and that the model outputs verbatim or near-verbatim copies of their content. They also allege that ChatGPT diverts users who would otherwise visit the publishers' sites, and that OpenAI has misattributed inaccurate or incomplete outputs to them. OpenAI disputes the claims and asserts that training on publicly available data is protected by fair use.

Who are the plaintiffs?

Encyclopedia Britannica, Inc. publishes the Encyclopaedia Britannica and related reference products. Merriam-Webster, Inc. is the oldest dictionary publisher in the United States and publishes Merriam-Webster dictionaries. Both are represented by Susman Godfrey L.L.P. in the case. According to court filings, Merriam-Webster's corporate parent is Aletheia Holdings, LP; the same parent is identified for Encyclopedia Britannica, Inc. The case is docketed as 1:26-cv-02097 and has been filed as related to a larger multi-district litigation (1:25-md-03143) concerning OpenAI and copyright.

Sources

techcrunch.com
Pacer Monitor – Encyclopedia Britannica et al v. OpenAI
Ropes & Gray – AI training and copyright
The Guardian – Authors protest AI use of works
IP Watchdog – AI copyright and licensing
