Jonathan Raa | Nurphoto | Getty Images
Anthropic, an AI startup supported by Amazon, faced a class-action lawsuit in California federal court on Monday for alleged copyright violations. Three authors claimed that Anthropic “built a multibillion-dollar business by stealing hundreds of thousands of copyrighted books,” including their own. Founded by former OpenAI research executives, Anthropic is also backed by Google and Salesforce.
Authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson accused Anthropic of basing its business model and its flagship “Claude” family of large language models (LLMs) on the large-scale theft of copyrighted works. They alleged that Anthropic downloaded pirated versions of their books, copied them, and used these copies to train its models. This lawsuit comes after Anthropic’s June release of its most advanced AI model, Claude 3.5 Sonnet, which has gained popularity alongside other chatbots like OpenAI’s ChatGPT and Google’s Gemini.
The lawsuit asserts that copyright law prohibits Anthropic’s actions of downloading and copying hundreds of thousands of copyrighted books from illegal websites. Anthropic has not yet responded to requests for comment.
This case follows another lawsuit from last October, where Universal Music sued Anthropic for “systematic and widespread infringement of their copyrighted song lyrics,” according to a filing in a Tennessee federal court. Other music publishers, including Concord and ABKCO, were also plaintiffs. Universal Music’s lawsuit cited an instance where Anthropic’s AI chatbot Claude generated an “almost identical copy” of the lyrics to Katy Perry’s song “Roar,” violating Concord’s copyright. The lawsuit also mentioned Gloria Gaynor’s “I Will Survive,” for which Universal owns the rights.
The lawsuit claims that Anthropic unlawfully copies and disseminates vast amounts of copyrighted works in building and operating its AI models. It argues that, like other technological developers before them, AI companies must adhere to the law.
As the news industry struggles to maintain revenue for its costly newsgathering operations, many media outlets are aggressively protecting their businesses against the rise of AI-generated content. The Center for Investigative Reporting, the oldest nonprofit newsroom in the U.S., sued OpenAI and its main backer Microsoft in June for alleged copyright infringement, following similar lawsuits from The New York Times, The Chicago Tribune, and The New York Daily News.
In December, The New York Times filed a lawsuit against Microsoft and OpenAI, alleging intellectual property violations related to its journalistic content being used in ChatGPT training data. The Times seeks “billions of dollars in statutory and actual damages” for the “unlawful copying and use of the Times’s uniquely valuable works,” according to a filing in the U.S. District Court for the Southern District of New York. OpenAI disagreed with the Times’ characterization of events. The Chicago Tribune and seven other newspapers filed a similar lawsuit in April.
Outside of news, prominent U.S. authors like Jonathan Franzen, John Grisham, George R.R. Martin, and Jodi Picoult sued OpenAI last year, alleging copyright infringement for using their work to train ChatGPT.
However, not all news organizations are preparing for legal battles; some are partnering with AI startups. On Tuesday, OpenAI announced a partnership with Condé Nast, where ChatGPT and SearchGPT will display content from Vogue, The New Yorker, Condé Nast Traveler, GQ, Architectural Digest, Vanity Fair, Wired, Bon Appétit, and other outlets.
In July, Perplexity AI introduced a revenue-sharing model for publishers after facing plagiarism accusations. Media outlets and content platforms like Fortune, Time, Entrepreneur, The Texas Tribune, Der Spiegel, and WordPress.com were the first to join the company’s “Publishers Program.”
In June, OpenAI and Time magazine announced a “multi-year content deal” allowing OpenAI to access current and archived articles from over 100 years of Time’s history. OpenAI will display Time’s content within its ChatGPT chatbot and use it to enhance its products, likely for training its AI models. OpenAI announced a similar partnership in May with News Corp., granting access to current and archived articles from The Wall Street Journal, MarketWatch, Barron’s, the New York Post, and other publications. Reddit also announced in May that it will partner with OpenAI, allowing the company to train its AI models on Reddit content.