Reference publishers spent years turning their content into a licensed product for tech platforms. Now that AI has changed who benefits from that content, Encyclopedia Britannica and Merriam-Webster are suing OpenAI for allegedly copying nearly 100,000 articles to train ChatGPT. The twist is not that they want to be paid; it is that they spent decades licensing to aggregators and search engines before deciding the line should be redrawn.
Publishers spent years licensing to tech; AI forced them to redraw the line
Encyclopedia Britannica and its subsidiary Merriam-Webster sued OpenAI in the U.S. District Court for the Southern District of New York late on Friday evening, March 13, 2026. According to Courthouse News Service, the 44-page complaint alleges that OpenAI illegally copied the publishers’ online articles at scale to train the large language models that power ChatGPT. The plaintiffs claim ChatGPT generates verbatim or near-verbatim reproductions of their work and cannibalizes their traffic with AI-generated summaries. Britannica states in the complaint that it approached OpenAI about licensing in November 2024; after that discussion, an OpenAI representative rebuffed further licensing outreach, and the company continued copying the publishers’ content even as it signed deals with other publishers. OpenAI has said its models are trained on publicly available data and that this training is grounded in fair use.
As TechCrunch reported on March 16, 2026, Britannica retains the copyright to nearly 100,000 online articles that it says were scraped and used to train OpenAI’s LLMs without permission. The lawsuit also accuses OpenAI of violating the Lanham Act when it attributes factually incorrect hallucinations to the publishers. The complaint includes four counts of copyright infringement and one of trademark dilution. Britannica joins The New York Times, Ziff Davis, and more than a dozen newspapers in pursuing OpenAI over copyright. A similar Britannica and Merriam-Webster suit against Perplexity, filed in September 2025, remains pending.
Licensing history shows publishers enabled the ecosystem they now oppose
For decades, publishers monetized content by licensing it to aggregators, databases, and tech platforms. According to industry analysis from Encypher and Digital Content Next, licensing has helped successive technologies develop, from recorded music and broadcast to the Internet. By 2024 and 2025, major publishers had already cut AI licensing deals: News Corp with OpenAI for more than $250 million over five years, Reddit with Google for about $60 million per year, and agreements on undisclosed terms between OpenAI and the Associated Press, Financial Times, Axel Springer, Vox Media, and The Atlantic. The reference giants, by contrast, are now litigating after their own November 2024 outreach did not lead to a deal.
Fair use is under pressure from new precedent
OpenAI’s fair use defense is under pressure from recent rulings. In Thomson Reuters v. ROSS Intelligence, Judge Stephanos Bibas reversed his earlier position and held that using copyrighted legal materials to train AI does not qualify as fair use; the court found that the training was not transformative and caused direct market harm. The court determined that thousands of training documents were directly copied from copyrighted headnotes. Lexology and Skadden have noted that this undercuts the argument that tech companies can freely use copyrighted works for AI training and supports content owners’ right to be paid when their work trains AI systems. TechCrunch has also reported that in one key case Anthropic convinced a federal judge that using content as training data could be transformative, but the court still found that the company had illegally downloaded millions of books, and the case ended in a $1.5 billion settlement for writers. Discovery in the New York Times case has shown instances of regurgitation, in which the AI outputs near-verbatim copies of protected works, which puts further pressure on the fair use defense.
What This Actually Means
Britannica and Merriam-Webster are not wrong to want compensation. The real story is that the industry spent years building a licensing economy that handed tech firms vast access to trusted content. AI did not create that habit; it changed who captures the value. The lawsuit is a belated attempt to reset the terms. Whether courts will treat reference works differently from news or books remains open, but the pattern is clear: publishers who licensed early are on one side, and those who did not cut deals are now trying to litigate their way to the same table. The outcome will shape how much AI firms pay for trusted reference content going forward.
What is the Britannica and Merriam-Webster lawsuit about?
Encyclopedia Britannica and Merriam-Webster sued OpenAI in the U.S. District Court for the Southern District of New York in March 2026, alleging that OpenAI copied nearly 100,000 of their online articles to train ChatGPT without permission or payment. They claim ChatGPT sometimes outputs verbatim or near-verbatim copies of their content and that OpenAI falsely attributes hallucinations to them, harming their reputation and traffic. The suit includes four copyright infringement counts and one trademark dilution count. Britannica and Merriam-Webster had previously sued Perplexity in September 2025 over similar scraping and copying of their content. OpenAI had not responded to TechCrunch’s request for comment before the March 16 publication.
How does this compare to other publisher lawsuits against OpenAI?
The publishing industry has split into two camps: those pursuing licensing deals with AI companies, such as the Associated Press and Axel Springer, and those litigating to protect their IP, such as The New York Times. According to NPR and the Economic Times, a federal judge ruled in March 2025 that The New York Times and other newspapers could proceed with a copyright lawsuit against OpenAI and Microsoft; core infringement claims survived. Reuters reported in March 2026 that the Big Five book publishers and academic publishers sued Anna’s Archive, a shadow library allegedly supplying pirated content to AI developers. Ziff Davis, which owns Mashable, CNET, and IGN, has also sued OpenAI; a December 2025 ruling confirmed that copyright claims based on AI-generated outputs are viable. The Britannica and Merriam-Webster case is part of this wave, but it targets reference and dictionary content rather than news or books.
Sources
Courthouse News Service, TechCrunch, Digital Content Next, Lexology