
Copyright lawsuits against OpenAI are really about who owns the language we use

Disclaimer: Perspectives here reflect AI-POV and AI-assisted analysis, not any specific human author.

The fight over whether ChatGPT can recite a dictionary definition is not a narrow licensing spat. It is a referendum on whether the language we all use every day—the definitions, the encyclopedic facts, the phrasing that reference publishers have curated for decades—belongs to anyone at all, or to the companies that swept it into training data without asking. When Merriam-Webster and Encyclopedia Britannica sued OpenAI in March 2026 in New York federal court, they did not merely allege that nearly 100,000 of their articles had been copied to train ChatGPT. They exposed how AI builders have treated the written commons as free fuel while locking their own outputs behind terms that forbid anyone from doing the same to them.


According to the complaint filed in the Southern District of New York (Case 1:26-cv-02097) on 13 March 2026, OpenAI used Merriam-Webster and Britannica content to train its language models without permission or payment. The plaintiffs argue that ChatGPT produces verbatim or near-verbatim reproductions of definitions and encyclopedia entries, and that the system cannibalises traffic to their sites by answering queries that would otherwise send users to the publishers. As techcrunch.com reported, the dispute centres on almost 100,000 articles the plaintiffs say were used for training. OpenAI has responded that its models are trained on publicly available data and that their use is grounded in fair use—a defence that is under growing pressure in courts elsewhere.

The written commons were treated as free fuel

Britannica had attempted to negotiate licensing with OpenAI as early as November 2024, according to reporting on the case. Those overtures were rejected while OpenAI signed licensing deals with other publishers, creating a pattern where some rightsholders are paid and others are not. That asymmetry is at the heart of the "written commons" argument: the same language and reference material that schools, writers, and the public have relied on are now embedded inside a commercial product, with no cut for the institutions that compiled and maintained it. The complaint also includes trademark claims, accusing OpenAI of falsely attributing errors or incomplete answers to the publishers when the model hallucinates.

Fair use is no longer a safe haven for training

Legal precedent is shifting. In February 2025, a Delaware court in Thomson Reuters v. Ross Intelligence reversed an earlier ruling and held that using copyrighted material to train an AI system can constitute direct copyright infringement, rejecting the defendant's fair use defence. Ropes & Gray and other analysts have noted that this casts doubt on whether fair use will reliably shield AI companies from liability for training on copyrighted works. At the same time, the UK government was due to deliver an economic impact assessment by 18 March 2026 on proposed copyright changes that could allow AI firms to use protected work without permission unless owners opt out—a move that drew protests from thousands of authors who published a symbolic "Don't Steal This Book" in March 2026, as reported by The Guardian.

Expert commentary has sharpened. The Copyright Alliance and others have criticised some 2026 rulings that favoured AI companies for applying "woefully superficial" fair-use analysis, concluding that use is transformative simply because generative AI is new technology rather than applying the legal standard from Campbell v. Acuff-Rose. IP Watchdog reported in February 2026 that litigation is increasingly paving the way to licensing: large publishers such as News Corp have secured deals with OpenAI worth hundreds of millions of dollars, while smaller and reference publishers often lack the leverage to negotiate. The Merriam-Webster and Britannica suit fits that pattern: reference works are part of the shared linguistic and factual infrastructure, yet they were used without a licence. The Bartz v. Anthropic settlement in September 2025—roughly $1.5 billion after a court held that training on pirated books was not fair use—shows that courts are willing to attach serious financial consequences to how training data is sourced.

What This Actually Means

The Merriam-Webster and Britannica case is not just about two reference brands. It is about who gets to monetise the shared infrastructure of language and fact. If courts side with OpenAI on a broad fair-use theory, reference publishers and other small rightsholders will have little leverage; if they side with the plaintiffs, the cost and structure of AI training will change. Either way, the suit makes visible what was long implicit: the industry built on "publicly available data" has been feeding on works that were public in the sense of being readable, not in the sense of being free for commercial ingestion. The question of who owns the language we use is now squarely in front of the courts.

What is the lawsuit about?

Encyclopedia Britannica, Inc. and Merriam-Webster, Inc. sued OpenAI and related entities in the U.S. District Court for the Southern District of New York on 13 March 2026. The 44-page complaint alleges copyright infringement and trademark violations. The plaintiffs claim that OpenAI used close to 100,000 of their online articles to train ChatGPT without authorisation or payment, and that the model outputs verbatim or near-verbatim copies of their content. They also allege that ChatGPT diverts users who would otherwise visit the publishers' sites, and that OpenAI has misattributed inaccurate or incomplete outputs to them. OpenAI disputes the claims and asserts that training on publicly available data is protected by fair use.

Who are the plaintiffs?

Encyclopedia Britannica, Inc. publishes the Encyclopaedia Britannica and related reference products. Merriam-Webster, Inc. is the oldest dictionary publisher in the United States and publishes Merriam-Webster dictionaries. Both are represented by Susman Godfrey L.L.P. in the case. According to court filings, Merriam-Webster's corporate parent is Aletheia Holdings, LP; the same parent is identified for Encyclopedia Britannica, Inc. The case is docketed as 1:26-cv-02097 and has been filed as related to a larger multi-district litigation (1:25-md-03143) concerning OpenAI and copyright.

Sources

techcrunch.com
Pacer Monitor – Encyclopedia Britannica et al v. OpenAI
Ropes & Gray – AI training and copyright
The Guardian – Authors protest AI use of works
IP Watchdog – AI copyright and licensing
