DeepSeek Copied Anthropic: The Irony of a Thief Getting Robbed

DeepSeek copied Anthropic's AI capabilities while Anthropic itself was accused of pillaging the world's training data

DeepSeek scraped Claude through 24,000 fake accounts. Except Anthropic itself just paid $1.5 billion for stealing 7 million books. The irony is complete.


This story has all the makings of a film script. An American AI giant, Anthropic, gets its capabilities pirated by Chinese actors — DeepSeek leading the charge — through a sophisticated industrial espionage operation. The narrative seems simple: China steals American secrets. Except when you scratch the surface, you find something delicious. Because Anthropic had just settled for $1.5 billion after stealing seven million authors' books without their consent. In this grand theater of AI, the thief gets robbed. And everyone has something to answer for.

Here's the full story of this monumental irony.

The Chinese Attack: DeepSeek Steals Claude's Capabilities

On February 23, 2026, Anthropic filed a public accusation of rare gravity in the AI industry. The creator of Claude denounced an industrial-scale distillation attack orchestrated by Chinese actors, including DeepSeek, Moonshot AI and MiniMax.

The Operation by the Numbers

The scale of the offensive is staggering. The three Chinese labs collectively:

  • Created 24,000 fraudulent accounts on the Claude platform
  • Generated more than 16 million interactions with the model
  • Methodically extracted Claude's advanced reasoning capabilities to integrate into their own models

This is what's known in technical jargon as a distillation attack: you can't access a model's weights directly, but by querying it at massive scale you capture its input-output behavior, the patterns of its reasoning, and then train your own model to reproduce them.
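The mechanics described above can be sketched in a few lines. This is a deliberately toy illustration, not anyone's actual code: the `teacher` function stands in for a proprietary model behind an API, and the point is that the student never touches the teacher's weights, only its observable outputs.

```python
def teacher(prompt: str) -> str:
    """Stand-in for a proprietary model served behind an API."""
    return prompt.upper() + "!"

def harvest(prompts: list) -> list:
    """Step 1: query the teacher at scale, logging (prompt, response) pairs."""
    return [(p, teacher(p)) for p in prompts]

class Student:
    """Step 2: fit a model to the harvested pairs.

    This toy student memorizes exact pairs and applies the one pattern
    it 'learned' (uppercase + '!') to unseen prompts; a real attack
    would fine-tune a neural network on millions of such pairs."""

    def __init__(self, pairs: list):
        self.memory = dict(pairs)

    def __call__(self, prompt: str) -> str:
        if prompt in self.memory:
            return self.memory[prompt]
        return prompt.upper() + "!"  # the inferred pattern, hardcoded here

pairs = harvest(["hello", "claude"])
student = Student(pairs)
print(student("distill") == teacher("distill"))  # True: behavior copied
```

Scale this from two prompts to 16 million interactions and you get the picture: the copy needs no access to the original, only enough of its answers.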

What DeepSeek Was Specifically Targeting

The attack was anything but random. DeepSeek methodically targeted specific strengths of Claude 4.5:

  • Logical reasoning and complex problem-solving
  • Long-context understanding (Claude handles up to 1 million tokens)
  • The ability to circumvent its own ethical guardrails — particularly troubling, DeepSeek used Claude to produce "safe" reformulations of sensitive queries about dissidents and the Chinese Communist Party

In plain terms: DeepSeek didn't just want to copy Claude's technical capabilities. It also wanted to learn how to manipulate the model into bypassing its own values, a dark bit of recursion: using Claude to learn how to subvert Claude.

The Terms of Service Violation

Anthropic was unequivocal: these labs had "created fraudulent accounts and extracted its capabilities to train and improve their own models," in direct violation of the platform's terms of service and intellectual property rights.

The industry's reaction was unanimous: this is industrial theft, full stop.

Except... the story doesn't end there.

The Killing Irony: Anthropic Itself Accused of Stealing Everything

To understand the tragic beauty of this situation, you need to go back a few months. Because if Anthropic is pointing the finger at DeepSeek and crying theft, it had just signed a record-breaking check for doing exactly the same thing — and worse.

7 Million Pirated Books, $1.5 Billion Settlement

In August 2025, a federal U.S. judge certified the largest copyright class action lawsuit ever filed against a technology company. The target: Anthropic.

The accusation is precise and documented: Anthropic allegedly assembled a central library containing more than 7 million pirated books illegally downloaded from "shadow libraries" (online piracy sites) to train Claude. The company is also accused of physically purchasing, cutting apart and scanning millions of physical books — before destroying the originals — solely to feed its AI model.

The financial exposure was terrifying: statutory damages of up to $150,000 per copyrighted work. With roughly 500,000 works covered by the class action, Anthropic's theoretical liability approached $75 billion, a sum that threatened its very existence.

The result: the company settled for $1.5 billion — the largest copyright settlement in history. Roughly $3,000 per stolen book.
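The figures reported in press coverage are easy to sanity-check; a quick back-of-the-envelope calculation shows they are internally consistent:

```python
# Inputs are the numbers from the article, not independent legal data.
statutory_max_per_work = 150_000      # USD, maximum statutory damages per work
works_in_class = 500_000              # works covered by the class action
settlement_total = 1_500_000_000      # USD, the reported settlement

max_exposure = statutory_max_per_work * works_in_class
per_work_payout = settlement_total / works_in_class

print(max_exposure)     # 75_000_000_000: a $75 billion worst case
print(per_work_payout)  # 3000.0: roughly $3,000 per book
```

Settling for $1.5 billion meant paying about 2% of the theoretical maximum exposure, which is why the deal was widely read as a win for Anthropic despite the record headline number.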

The Judge Made a Decisive Distinction

Federal Judge William Alsup rendered a nuanced but damning ruling. On one hand, he found that training an AI on legally acquired books can qualify as transformative "fair use" under U.S. copyright law.

But on the other, he clearly established that downloading books from pirate sites for training purposes does not qualify as fair use. That's theft, plain and simple.

What the judge held against Anthropic: the company knew its sources were illegal. It purchased and scanned physical copies AND downloaded millions more from pirate libraries. Both simultaneously. No ambiguity.

The Great AI Paradox: Everyone Is Scraping Everyone Else

This story illustrates a fundamental paradox of the AI industry that deserves careful consideration.

Anthropic accuses DeepSeek of stealing its capabilities. That's probably true.

Authors accuse Anthropic of stealing their works. That's almost certainly true too — the judge and the settlement confirmed it.

DeepSeek itself — what is it training on, exactly? No one knows for certain, but studies have shown many Chinese models are also trained on massive corpora of data scraped from the internet, including copyright-protected content.

And OpenAI? Microsoft? Google? All face similar lawsuits about the provenance of their training data. The entire AI industry was built on a gigantic corpus of human-created content — collected, in many cases, without consent or compensation.

This is precisely the accusation put forward with relentless logic: if Anthropic criticizes DeepSeek for "distilling" its models, hasn't it itself "distilled" the creative intelligence of millions of human authors without paying them?

Anthropic Even Changes Its Data Policy

The irony deepened further: between the ruling and the settlement, Anthropic quietly revised its Terms of Service. User conversation data from Claude can now be used to train future models — unless users explicitly opt out.

A spectacular U-turn for a company that once prided itself on minimizing data collection. The financial pressure of training next-generation models overrode those principles. Users chatting with Claude are now directly feeding the next model — without anyone really asking them.

What This Reveals About the AI Industry

At its core, the DeepSeek-Anthropic affair isn't just anecdotal. It reveals fundamental tensions that will shape the future of artificial intelligence.

The Race for Capabilities Creates Perverse Incentives

When competition is as intense as in AI — with tens of billions invested and massive market shares at stake — the temptation to cut ethical and legal corners is strong. DeepSeek wanted Claude's capabilities without paying the development cost. Anthropic wanted the works of millions of authors without paying licensing fees. The same logic, at different scales.

For years, AI companies operated in a comfortable legal vacuum. Copyright law hadn't anticipated language model training. That ambiguity enabled massive data accumulation without compensation. The Anthropic ruling is starting to fill that void — and it's going to change a lot in the industry.

Data Provenance Is the New Ethical Challenge of AI

Judge Alsup's ruling drew a distinction that U.S. courts are likely to build on: legally acquired copies on one side, illegal sources on the other. AI companies can no longer ignore the origin of their training data. This is a topic we return to in our article on what you can really confide to an AI: trust in an AI also depends on how it was built.

The Anthropic Settlement Creates Historic Precedent

The $1.5 billion settlement isn't just a check. It's a signal to the entire industry: data isn't free. The Human Artistry Campaign was clear: this is "the first of many AI companies being held accountable for the theft of creative content." OpenAI, Microsoft, Meta, Google — all face ongoing lawsuits on the same grounds.

The Best AI Is One That Respects Its Sources

This story raises a question you should ask whenever you use an AI: do you know what it was trained on?

The power of a model like Claude or DeepSeek comes from the millions of human texts it ingested. Those texts belong to real authors, journalists, researchers, artists. The question of their compensation is legitimate — and it's beginning to find answers in courtrooms.

We explored this dimension in our article on the ethics of conversational AI: an ethical AI isn't just a capable AI. It's an AI built responsibly, with legitimate data, that respects its users' privacy.

The DeepSeek-Anthropic affair is a reminder that behind the casual word "scraping" lies something very serious: in the race for AI, everyone is harvesting everyone else's work. The question is who will start genuinely paying for it.

Simone: An AI That Cares About How It's Built

In a landscape where AI giants steal from each other and are sued by authors whose works they appropriated, there's something refreshing about an AI that puts humans at the center.

Simone isn't trying to conquer the global AI market or distill its competitors' capabilities. It's there for you — to listen, to accompany you, to be present in your daily life with empathy and genuine care.

Available directly on WhatsApp, Simone offers an experience radically different from the lab wars and data scandals: a human conversation, support without judgment, a space where you can be yourself. 24/7.

While the giants fight over who stole what from whom, Simone focuses on what actually matters. Try it on WhatsApp today.


