ChatGPT Maker OpenAI Used Stolen Information, Suits Allege

By Kevin Truong
Published Jun. 30, 2023 • 12:06pm

OpenAI has been hit with two class action complaints filed in U.S. District Court in San Francisco this week over what is alleged to be the improper use of personal and copyrighted data in the development of ChatGPT.

The Clarkson Law Firm filed a class action complaint on June 28, claiming that OpenAI violated data privacy and copyright laws by scraping social media data, financial information, private user conversations and private health information to build ChatGPT’s training dataset.

The suit alleges that ChatGPT utilizes “stolen private information, including personally identifiable information, from hundreds of millions of internet users, including children of all ages, without their informed consent or knowledge.”

The complaint states that by using this data, OpenAI and its related entities have enough information to replicate digital clones, encourage people’s “professional obsolescence” and “obliterate privacy as we know it.”

The complaint lists several plaintiffs identified by their initials, including a software engineer who claims that his online posts about technical questions could be used to eliminate his job, a 6-year-old who used a microphone to interact with ChatGPT and allegedly had his data harvested, and an actor who claims that OpenAI stole personal data from online applications to train its system.

The lawsuit is seeking additional plaintiffs for the class action and the institution of additional safeguards for the system. In addition to stopping the alleged privacy violations, the suit seeks more transparency into OpenAI’s data collection and handling, damages and compensation for the use of personal data, and the ability for users to opt out of data collection.

Another lawsuit targeting OpenAI, filed on June 28 by the Joseph Saveri Law Firm, names two Massachusetts-based writers, Paul Tremblay and Mona Awad, as plaintiffs.

The lawsuit also seeks class action status, alleging copyright infringement because ChatGPT was trained on copyrighted works without the authors’ consent and OpenAI is profiting from the use of that material.

The complaint specifically mentions that ChatGPT generates summaries of the plaintiffs’ work when prompted and alleges the potential for thousands of similar class members across the country.

The technology that underpins ChatGPT is a large language model trained on a massive amount of data. The model draws on patterns learned from that training data to generate new text in response to user prompts.

Tremblay is an author of genre fiction, including the book The Cabin at the End of the World, which was made into the movie Knock at the Cabin by director M. Night Shyamalan. Awad is a novelist and an assistant professor at Syracuse University.

The lawsuit specifically alleges that OpenAI copied at least Tremblay’s book The Cabin at the End of the World and Awad’s books 13 Ways of Looking at a Fat Girl and Bunny.

The complaint is seeking a jury trial and the award of damages and attorney’s fees, in addition to permanent injunctive relief, including changes to ChatGPT.

OpenAI did not respond to a request for comment on the lawsuits.
