From ChatGPT to Getty v. Stability AI: A Running List of Key AI Lawsuits

Image: OpenAI

August 19, 2024 - By TFL

The rising adoption of artificial intelligence (“AI”) across industries (including fashion, retail, luxury, etc.) that has come about in recent years is bringing with it no shortage of lawsuits, as parties look to navigate the budding issues that these relatively new models raise for companies and creators alike. A growing number of lawsuits focus on generative AI, in particular, which refers to models that use neural networks to identify the patterns and structures within existing data to generate new content. Lawsuits are being waged against the developers behind some of the biggest generative AI chatbots and text-to-image generators, such as OpenAI and Stability AI, and in many cases, they center on how the underlying models are trained, the data that is used to do so, and the nature of the user-prompted output (which is allegedly infringing in many cases), among other things.

In light of the onslaught of legal questions that has come about in connection with the rise of AI, we take a high-level look at some of the most striking lawsuits playing out in this space, along with corresponding developments. They are listed by filing date …

Sept. 11, 2024: Gemini Data v. Google

Google is being sued for trademark infringement for allegedly hijacking a smaller but older company’s name for a rebrand of its Bard chatbot. According to the complaint that it filed with the U.S. District Court for the Northern District of California on September 11, Gemini Data claims that in February 2024, “without any authorization by Gemini Data, Google publicly announced a re-branding of its BARD AI chatbot tool to ‘GEMINI.’” Gemini Data claims that, as a sophisticated company, Google “undoubtedly conducted a trademark clearance search prior to publicly re-branding its entire line of AI products, and thus was unequivocally aware of Gemini Data’s registered and exclusive rights to the ‘GEMINI’ brand.” Yet, Google “made the calculated decision to bulldoze over Gemini Data’s exclusive rights without hesitation,” it claims.

While Gemini Data says that it “does not hold a monopoly over the development of generative AI tools, it does have exclusive rights to the ‘GEMINI’ brand for AI tools,” noting that it “took all the steps to ensure it created a unique brand to identify its AI tools and to subsequently protect that brand.” That did not stop Google from “unabashedly wield[ing] its power to rob Gemini Data of its cultivated brand … assuming a small company like Gemini Data would not be in a position to challenge a corporate giant wielding overwhelming power.”

With the foregoing in mind, Gemini Data sets out claims of federal and state law trademark infringement, false designation of origin, and unfair competition, and is seeking monetary damages, as well as injunctive relief.

Aug. 19, 2024: Andrea Bartz, et al. v. Anthropic PBC

Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson are suing Anthropic for copyright infringement in connection with its generative AI product, Claude. In the complaint that they lodged with the U.S. District Court for the Northern District of California on August 19, the author plaintiffs allege that Anthropic has built “a multibillion-dollar business by stealing hundreds of thousands of copyrighted books.” Rather than “obtaining permission and paying a fair price for the creations it exploits, Anthropic pirated them.” While the Constitution “recognizes the fundamental principle that creators deserve compensation for their work, Anthropic ignored copyright protections,” the plaintiffs argue, claiming that “an essential component of Anthropic’s business model and its flagship ‘Claude’ family of large language models is the large-scale theft of copyrighted works.”

Delving into the damage created by Anthropic’s alleged infringement, Bartz, Graeber, and Johnson assert that as AI-powered models “have become more advanced and enabled to train on more and more copyrighted material, they are able to generate more content and more sophisticated content.” The result of this, they maintain, is that it is “easier than ever to generate rip-offs of copyrighted books that compete with the original, or at a minimum dilute the market for the original copyrighted work.” Anthropic’s Claude platform has been used to “generate cheap book content,” according to the plaintiffs, who claim that Claude “could not generate this kind of long-form content if it were not trained on a large quantity of books, books for which Anthropic paid authors nothing.”

In short: “The success and profitability of Anthropic is predicated on mass copyright infringement without a word of permission from or a nickel of compensation to copyright owners, including Plaintiffs here.” With the foregoing in mind, the plaintiffs set out a single claim of copyright infringement and are seeking certification of their class action, as well as monetary damages and injunctive relief.

Jun. 27, 2024: Center for Investigative Reporting, Inc. v. OpenAI, Inc., et al.

The oldest nonprofit newsroom in the country has filed suit against OpenAI and Microsoft, accusing the ChatGPT and Copilot creators of engaging in copyright infringement and violating the Digital Millennium Copyright Act (“DMCA”). In the complaint that it lodged with the U.S. District Court for the Southern District of New York on June 27, the Center for Investigative Reporting, Inc. (“CIR”) alleges that OpenAI and Microsoft (the “defendants”) are offering up AI products that “are built on uncompensated and unauthorized use of the creative works of humans.” Specifically, CIR claims (citing data from “award-winning website Copyleaks”) that “nearly 60% of the responses provided by the defendants’ GPT-3.5 product contained some form of plagiarized content, and over 45% contained text that was identical to pre-existing content.”

According to CIR, the defendants “copied, used, abridged, and displayed [its] valuable content without [its] permission or authorization, and without any compensation to CIR,” thereby “undermin[ing] and damag[ing] its relationship with potential readers, consumers, and partners, and depriv[ing] CIR of subscription, licensing, advertising, and affiliate revenue, as well as donations from readers.”

Setting out claims of direct and contributory copyright infringement, CIR argues that the defendants infringed its exclusive rights in its registered works by: “(1) downloading those works from the internet; (2) encoding the Registered Works in computer memory; (3) regurgitating those works verbatim or nearly verbatim in response to prompts by ChatGPT users; and (4) producing significant amounts of material from those works in response to prompts by ChatGPT users.” And in furtherance of its DMCA claims, CIR contends that the defendants “created copies of [its] works of journalism with copyright notice information removed.”

May 16, 2024: Lehrman, et al. v. LOVO, Inc.

Voice-over actors Paul Lehrman and Linnea Sage have filed a right of publicity and false advertising lawsuit against LOVO, Inc., a startup in the business of selling “a text-to-speech subscription service that allows its clients – typically companies – to generate voice-over narrations at a fraction of the cost of the traditional model.” According to Lehrman and Sage’s complaint, LOVO enables “subscribing customers to upload a script into its AI-driven software … and generate a professional-quality voice-over based on certain criteria,” and that it “promotes its service using barely-disguised images and names of celebrities and states on its website, ‘Clone any voice.'”

“Implicit in LOVO’s offerings to its customers is that each voice-over actor has agreed to LOVO’s terms and conditions for customers to be able to access that,” Lehrman and Sage assert. The problem with that, they claim, is that they (and other members of the class) “have not agreed to LOVO’s terms,” and that LOVO has “stolen and used” their “voices and/or identities to create millions of voice-over productions without permission or proper compensation, in violation of numerous state right of privacy laws, and the federal Lanham Act.”

Apr. 30, 2024: Daily News, LP, et al. v. Microsoft Corp., et al.

A group of eight news publications have filed a copyright infringement and trademark dilution lawsuit against Microsoft and OpenAI in a New York federal court, accusing the generative AI pioneer and its partner of “purloining millions of [their] copyrighted articles without permission and without payment to fuel the commercialization of their generative artificial intelligence products, including ChatGPT and Copilot.” The plaintiffs – which include Chicago Tribune Company, Orlando Sentinel Communications Company, and San Jose Mercury-News, among other newspapers – argue that while OpenAI and Microsoft pay for the other elements of their businesses, such as computers, specialized chips, electricity, and programmers and other technical employees, they have opted not to pay for the “high quality content” that they need “to make their GenAI products successful.”

“Despite admitting that they need copyrighted content to produce a commercially viable GenAI product,” the plaintiffs claim that OpenAI and Microsoft “contend that they can fuel the creation and operation of these products with the [plaintiffs]’ content without permission and without paying for the privilege.” But “they are wrong on both counts,” according to the plaintiffs, who set out claims of direct, vicarious, and contributory copyright infringement, violations of the DMCA, common law unfair competition by misappropriation, federal trademark dilution, and dilution and injury to business reputation under New York General Business Law.

Apr. 26, 2024: Zhang et al. v. Google LLC and Alphabet Inc.

A group of visual artists have filed suit against Google LLC and its owner Alphabet Inc. in a federal court in Northern California, alleging that the tech titans made unauthorized use of their copyright-protected artworks to train Google’s AI-powered image generator, Imagen. Neither the plaintiffs nor any of the proposed class members ever authorized Google to use their copyrighted works as training material, according to the complaint, which states that “these copyrighted training images were copied multiple times by Google during the training process for Imagen.” And because Imagen “contains weights that represent a transformation of the protected expression in the training dataset, Imagen is, itself, an infringing derivative work.”

Meanwhile, the plaintiffs – who set out claims of direct copyright infringement against Google and vicarious copyright infringement against Alphabet – further assert that Alphabet, “as the corporate parent of Google, also commercially benefits from these acts of massive copyright infringement.”

Mar. 8, 2024: Nazemian, et al. v. NVIDIA Corp.

NVIDIA Corp. has landed on the receiving end of a copyright infringement complaint filed with the N.D. Cal. on March 8, with author-plaintiffs Abdi Nazemian, Brian Keene, and Stewart O’Nan (collectively, the “plaintiffs”) alleging that their copyright-protected books “were included in the training dataset that NVIDIA has admitted copying to train its NeMo Megatron models.” In their brief complaint, in which they set out a single claim of direct copyright infringement, the plaintiffs assert that NVIDIA “has admitted training its NeMo Megatron models” on a copy of a dataset called The Pile, and therefore, “necessarily also trained its NeMo Megatron models on a copy of Books3, because Books3 is part of The Pile.”

Since “certain books written by the plaintiffs are part of Books3, including the infringed works and NVIDIA necessarily trained its NeMo Megatron models on one or more copies of the infringed works,” they claim that NVIDIA is directly infringing their copyrights.

Feb. 28, 2024: Raw Story Media, et al. v. OpenAI, Inc., et al.

The latest lawsuit to be filed against OpenAI comes by way of news outlets Raw Story Media, Inc. and AlterNet Media, Inc. (the “plaintiffs”), which accuse the generative AI giant of “repackag[ing]” their “copyrighted journalism work product” by way of the outputs from its popular ChatGPT platform. Setting the stage in their complaint, the plaintiffs claim that “at least some of the time, ChatGPT provides or has provided responses to users that regurgitate verbatim or nearly verbatim copyright-protected works of journalism without providing any author, title, or copyright information contained in those works,” while other times, it “provides or has provided responses to users that mimic significant amounts of material from copyright-protected works of journalism without providing any author, title, or copyright information contained in those works.”

Part of the problem here, according to the plaintiffs, stems from how OpenAI trains the models that power ChatGPT: “When they populated their training sets with works of journalism, [OpenAI] had a choice: they could train ChatGPT using works of journalism with the copyright management information protected by the Digital Millennium Copyright Act (‘DMCA’) intact, or they could strip it away.” OpenAI “chose the latter,” the plaintiffs assert, and “in the process, trained ChatGPT not to acknowledge or respect copyright, not to notify ChatGPT users when the responses they received were protected by journalists’ copyrights, and not to provide attribution when using the works of human journalists.”

As such, when ChatGPT provides outputs in response to user prompts, it “gives the impression that it is an all-knowing, ‘intelligent’ source of the information being provided, when in reality, the responses are frequently based on copyrighted works of journalism that ChatGPT simply mimics,” the plaintiffs maintain.

With the foregoing in mind, the plaintiffs set out a single claim under 17 U.S.C. § 1202(b)(1) of the DMCA, on the basis that OpenAI “created copies of [their] works of journalism with author information removed and included them in training sets used to train ChatGPT.”

Feb. 28, 2024: The Intercept Media, Inc. v. OpenAI, Inc.

The Intercept Media has similarly waged DMCA claims against OpenAI and Microsoft, accusing the two companies, as well as a number of OpenAI affiliates, of violating the DMCA by creating and using copies of its works of journalism “with author information removed and included them in training sets used to train ChatGPT.” Among other things, the Intercept claims in the complaint that it lodged with the U.S. District Court for the Southern District of New York that OpenAI and co. “had reason to know that ChatGPT would be less popular and would generate less revenue if users believed that ChatGPT responses violated third-party copyrights or if users were otherwise concerned about further distributing ChatGPT responses.”

This is at least because the defendants “were aware that they derive revenue from user subscriptions, that at least some likely users of ChatGPT respect the copyrights of others or fear liability for copyright infringement, and that such users would not pay to use a product that might result in copyright liability or did not respect the copyrights of others.”

Like Raw Story Media and AlterNet Media, the Intercept accuses OpenAI and co. of violating 17 U.S.C. § 1202(b)(1) of the DMCA by “creat[ing] copies of [its] works of journalism with author information removed and included them in training sets used to train ChatGPT.” The Intercept goes further, though, and sets out claims under 17 U.S.C. § 1202(b)(3) on the basis that the defendants “shared copies of [its] works without author, title, copyright, and terms of use information” with each other “in connection with the development of ChatGPT.”

Jan. 25, 2024: Main Sequence, et al. v. Dudesy LLC, et al.

After Dudesy, a media company in the business of creating AI-generated works, released an hour-long special featuring an AI-generated imitation of George Carlin’s voice on the Dudesy podcast’s YouTube channel on January 9, the late comedian’s estate lodged right of publicity and copyright infringement claims in a federal court in California. According to the complaint, dated January 25, more than 15 years after Carlin’s death, Dudesy and its founders, comedian Will Sasso and writer Chad Kultgen, “took it upon themselves to ‘resurrect’ Carlin with the aid of AI.”

“Using Carlin’s original copyrighted works,” Dudesy LLC, Sasso, and Kultgen (collectively, “Dudesy” and/or “defendants”) “created a script for a fake George Carlin comedy special and generated a sound-alike of George Carlin to ‘perform’ the generated script,” according to Main Sequence, Ltd., Jerold Hamza as executor for the Estate of George Carlin, and Jerold Hamza in his individual capacity (collectively, “Carlin’s estate” and/or the “plaintiffs”). The plaintiffs assert that “none of the defendants had permission to use Carlin’s likeness for the AI-generated ‘George Carlin Special,’ nor did they have a license to use any of the late comedian’s copyrighted materials.”

Against that background, they set out claims of violation of rights of publicity under California common law and deprivation of rights of publicity under Cal. Civ. Code § 3344.1; they are taking issue with Dudesy’s use of Carlin’s “name, reputation, and likeness,” namely, their use of “generated images of Carlin, Carlin’s voice, and images designed to evoke Carlin’s presence on a stage.” The plaintiffs also set out a claim of federal copyright infringement, arguing that the defendants have “unlawfully used [the] plaintiffs’ copyrighted works for building and training a dataset for purposes of generating an output intended to mimic the plaintiffs’ copyrighted work (i.e., Carlin’s stand-up comedy).”

With the foregoing in mind, the plaintiffs are seeking monetary damages, as well as preliminary and permanent injunctive relief to bar Dudesy and co. “from directly committing, aiding, encouraging, enabling, inducing, causing, materially contributing to, or otherwise facilitating use of George Carlin’s copyrighted works to generate Dudesy Specials and any other contents created or disseminated by Dudesy, LLC relating to those Dudesy Specials.” Additionally, they want the court to order Dudesy to “immediately remove, take down, and destroy any video or audio copies (including partial copies) of the ‘George Carlin Special,’ wherever they may be located.”

UPDATED (Apr. 2, 2024): Main Sequence and Dudesy have settled the suit, as indicated by their filing of a joint stipulation with the court consenting to a judgment and permanent injunction barring Dudesy and co. from “uploading, posting or broadcasting the [‘George Carlin: I’m Glad I’m Dead (2024) – Full Special’] on the Dudesy Podcast, or in any content posted to any website, account or platform (including, without limitation, YouTube and social media websites) controlled by [them].” The defendants are also barred from “using George Carlin’s image, voice or likeness on the Dudesy Podcast, or in any content posted to any website, account or platform … controlled by [them] without the express written approval of the plaintiffs.”

Jan. 5, 2024: Basbanes v. Microsoft Corp. and OpenAI, et al.

Journalists Nicholas Basbanes and Nicholas Ngagoyeanes (professionally known as “Nicholas Gage”) have filed a direct, vicarious, and contributory copyright infringement suit against Microsoft Corporation and OpenAI, Inc., along with an array of affiliated OpenAI entities, arguing that the defendants copied their work “to build a massive commercial enterprise that is now valued at billions of dollars.” In particular, the plaintiffs assert that Microsoft and OpenAI, “as sophisticated commercial entities, clearly decided upon a deliberate strategy to steal the plaintiffs’ copyrighted works to power their massive commercial enterprise … [without] paying for the inputs that make their LLMs, and which are thus plainly derivative works [that] result in an even higher profit margin for the defendants.”

Pointing to the AI-focused case waged against Microsoft and OpenAI as impetus for the lawsuit at hand, Basbanes and Ngagoyeanes assert that “shortly after The New York Times filed suit against these same defendants in this court, the defendants publicly acknowledged that copyright owners like the plaintiffs must be compensated for the defendants’ use of their work: ‘We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models.’”

Against that background, the plaintiffs state that they “seek to represent a class of writers whose copyrighted work has been systematically pilfered by the defendants,” seeking damages for “copyright infringement, the lost opportunity to license their works, and for the destruction of the market the defendants have caused and continue to cause to writers.” The plaintiffs are also seeking a permanent injunction “to prevent these harms from recurring.”

2023

Dec. 27, 2023: New York Times Company v. Microsoft Corp. and OpenAI, et al.

The New York Times is accusing OpenAI and partner Microsoft of copyright infringement, violations of the Digital Millennium Copyright Act, unfair competition by misappropriation, and trademark dilution in a new lawsuit. In the complaint that it filed with the U.S. District Court for the Southern District of New York on December 27, the Times alleges that the defendants are on the hook for making “unlawful use of The Times’s work to create artificial intelligence products that compete with it [and that] threatens The Times’s ability to provide [trustworthy information, news analysis, and commentary].” The defendants’ generative AI tools “rely on large-language models that were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more,” the paper claims.

The Times contends that the defendants “insist that their conduct is protected as ‘fair use’ because their unlicensed use of copyrighted content to train GenAI models serves a new ‘transformative’ purpose,” but the paper argues that “there is nothing ‘transformative’ about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.” Moreover, it maintains that “because the outputs of the defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use.”

In addition to its copyright-centric claims, the Times sets out a trademark dilution cause of action, asserting that the defendants “have, in connection with the commerce of producing GenAI to users for profit throughout the United States, including in New York, engaged in the unauthorized use of The Times’s trademarks in outputs generated by [their] GPT-based products.” In particular, it alleges that the defendants’ “unauthorized use of The Times’s marks on lower quality and inaccurate writing dilutes the quality of The Times’s trademarks by tarnishment in violation of 15 U.S.C. § 1125(c).” On this front, the New York Times claims that “at the same time as the defendants’ models are copying, reproducing, and paraphrasing [its] content without consent or compensation, they are also causing The Times commercial and competitive injury by misattributing content to The Times that it did not publish,” thereby giving rise to “misinformation.”

While The Times “has attempted to reach a negotiated agreement with the defendants … to permit the use of its content in new digital products,” it claims that “the negotiations have not led to a resolution.”

Nov. 21, 2023: Sancton v. OpenAI Inc., Microsoft Corporation, et al.

Reporter Julian Sancton has filed suit against OpenAI and Microsoft, alleging that the tech titans “have built a business valued into the tens of billions of dollars by taking the combined works of humanity without permission.” In the complaint that he lodged with the U.S. District Court for the Southern District of New York on November 21, Sancton claims that “rather than pay for intellectual property,” the defendants “pretend as if the laws protecting copyright do not exist.” Among such IP? Sancton’s book, Madhouse at the End of the Earth, along with “thousands, maybe more, [of other] copyrighted works – including nonfiction books,” which the defendants allegedly used to train their AI models. The problem, per Sancton, is that “the U.S. Constitution protects the fundamental principle that creators” – including “nonfiction authors, [who] often spend years conceiving, researching, and writing their creations” – “deserve compensation for their works.”

The bottom line in the complaint, in which Sancton sets out claims of direct and contributory copyright infringement, is that “the basis of the OpenAI platform is nothing less than the rampant theft of copyrighted works.”



Updated: April 2, 2024

This article was initially published on June 5, 2023, and has been updated to reflect newly filed lawsuits and updates in previously reported cases.
