Copyright lawsuits against OpenAI are really about who owns the language we use
The fight over whether ChatGPT can recite a dictionary definition is not a narrow licensing spat. It is a referendum on whether the language we all use every day, from definitions and encyclopedic facts to the phrasing that reference publishers have curated for decades, belongs to anyone at all, or to the companies that swept it into training data without asking. When Merriam-Webster and Encyclopedia Britannica sued OpenAI in March 2026 in New York federal court, they did more than allege that nearly 100,000 of their articles had been copied to train ChatGPT. They exposed how AI builders have treated the written commons as free fuel while locking their own outputs behind terms that forbid anyone from doing the same to them.
According to the complaint filed in the Southern District of New York (Case 1:26-cv-02097) on 13 March 2026, OpenAI used Merriam-Webster and Britannica content to train its language models without permission or payment. The plaintiffs argue that ChatGPT produces verbatim or near-verbatim reproductions of definitions and encyclopedia entries, and that the system cannibalises traffic to their sites by answering queries that would otherwise send users to the publishers. As techcrunch.com reported, the dispute centres on almost 100,000 articles the plaintiffs say were used for training. OpenAI has responded that its models are trained on publicly available data and that this training is protected by fair use, a defence that is under growing pressure in courts elsewhere.
The written commons were treated as free fuel
Britannica had attempted to negotiate licensing with OpenAI as early as November 2024, according to reporting on the case. Those overtures were rejected while OpenAI signed licensing deals with other publishers, creating a pattern where some rightsholders are paid and others are not. That asymmetry is at the heart of the "written commons" argument: the same language and reference material that schools, writers, and the public have relied on are now embedded inside a commercial product, with no cut for the institutions that compiled and maintained it. The complaint also includes trademark claims, accusing OpenAI of falsely attributing errors or incomplete answers to the publishers when the model hallucinates.
Fair use is no longer a safe haven for training
Legal precedent is shifting. In February 2025, a Delaware court in Thomson Reuters v. Ross Intelligence reversed an earlier ruling and held that using copyrighted material to train an AI system can constitute direct copyright infringement, rejecting the defendant's fair use defence. Ropes & Gray and other analysts have noted that this casts doubt on whether fair use will reliably shield AI companies from liability for training on copyrighted works. At the same time, the UK government was due to deliver an economic impact assessment by 18 March 2026 on proposed copyright changes that could allow AI firms to use protected work without permission unless owners opt out. The proposal drew protests from thousands of authors, who published a symbolic "Don't Steal This Book" in March 2026, as reported by The Guardian.
Expert commentary has sharpened. The Copyright Alliance and others have criticised some 2026 rulings that favoured AI companies for applying "woefully superficial" fair-use analysis: in their view, those rulings treated a use as transformative simply because generative AI is new technology, rather than applying the legal standard from Campbell v. Acuff-Rose. IP Watchdog reported in February 2026 that litigation is increasingly paving the way to licensing: large publishers such as News Corp have secured deals with OpenAI worth hundreds of millions of dollars, while smaller and reference publishers often lack the leverage to negotiate. The Merriam-Webster and Britannica suit fits that pattern: reference works are part of the shared linguistic and factual infrastructure, yet they were used without a licence. The Bartz v. Anthropic settlement in September 2025, worth roughly $1.5 billion and reached after a court held that downloading and keeping pirated books was not fair use, shows that courts are willing to attach serious financial consequences to how training data is sourced.
What This Actually Means
The Merriam-Webster and Britannica case is not just about two reference brands. It is about who gets to monetise the shared infrastructure of language and fact. If courts side with OpenAI on a broad fair-use theory, reference publishers and other small rightsholders will have little leverage; if they side with the plaintiffs, the cost and structure of AI training will change. Either way, the suit makes visible what was long implicit: the industry built on "publicly available data" has been feeding on works that were public in the sense of being readable, not in the sense of being free for commercial ingestion. The question of who owns the language we use is now squarely in front of the courts.
What is the lawsuit about?
Encyclopedia Britannica, Inc. and Merriam-Webster, Inc. sued OpenAI and related entities in the U.S. District Court for the Southern District of New York on 13 March 2026. The 44-page complaint alleges copyright infringement and trademark violations. The plaintiffs claim that OpenAI used close to 100,000 of their online articles to train ChatGPT without authorisation or payment, and that the model outputs verbatim or near-verbatim copies of their content. They also allege that ChatGPT diverts users who would otherwise visit the publishers' sites, and that OpenAI has misattributed inaccurate or incomplete outputs to them. OpenAI disputes the claims and asserts that training on publicly available data is protected by fair use.
Who are the plaintiffs?
Encyclopedia Britannica, Inc. publishes the Encyclopaedia Britannica and related reference products. Merriam-Webster, Inc. is the oldest dictionary publisher in the United States and publishes Merriam-Webster dictionaries. Both are represented by Susman Godfrey L.L.P. in the case. According to court filings, Aletheia Holdings, LP is the corporate parent of both Merriam-Webster, Inc. and Encyclopedia Britannica, Inc. The case is docketed as 1:26-cv-02097 and has been filed as related to a larger multidistrict litigation (1:25-md-03143) concerning OpenAI and copyright.
Sources
techcrunch.com
Pacer Monitor – Encyclopedia Britannica et al v. OpenAI
Ropes & Gray – AI training and copyright
The Guardian – Authors protest AI use of works
IP Watchdog – AI copyright and licensing