Woe is Anthropic
Anthropic is under attack. They have identified “industrial-scale distillation attacks” on their latest models by companies under the control of the Chinese Communist Party. That sounds scary, right? Thankfully Anthropic have wasted no time reiterating what they need from policymakers to restore safety and civility:
Anthropic has consistently supported export controls to help maintain America’s lead in AI. Distillation attacks undermine those controls by allowing foreign labs, including those subject to the control of the Chinese Communist Party, to close the competitive advantage that export controls are designed to preserve through other means.
It’s hard to take them seriously, though, when their framing is so obviously self-serving.
Safety and governance
Anthropic writes on LinkedIn1:
Distillation can be legitimate: AI labs use it to create smaller, cheaper models for their customers. But foreign labs that illicitly distill American models can remove safeguards, feeding model capabilities into their own military, intelligence, and surveillance systems.
Their article expands on the same safety risks stemming from distillation:
Foreign labs that distill American models can then feed these unprotected capabilities into military, intelligence, and surveillance systems—enabling authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance. If distilled models are open-sourced, this risk multiplies as these capabilities spread freely beyond any single government’s control.
You might understandably interpret this as Anthropic stating they’re okay with distillation. Just not the kind of distillation that feeds military, intelligence, and surveillance systems. They’re interested in safety and governance. You’d be wrong. Anthropic is not in the business of allowing distillation. Their consumer terms, commercial terms, and acceptable use policy all take a harder stance and disallow model distillation outright2. A more honest framing in line with their policy might be:
Distillation can be legitimate. But we’ll ban you from our services if you’re competitive. We don’t care what country you do it from, or about national security.
And that’s exactly what they did. I would take no issue with it… if it were framed as such. My best guess is Anthropic are pulling the ladder up behind them, and they’re doing an exceptional job of convincing others to let it happen because of their safety rhetoric. They’ve been singing a doomsday song since at least July 2023, suggesting policymakers go beyond export controls and begin adding expensive hurdles for domestic AI research labs as well. Their inclusion of “open-source” in their statement is even more concerning.
Spinglish
“Industrial-scale distillation attacks” by “foreign labs that illicitly distill American models” is expert use of deceptive language. The kind that rivals “enhanced interrogation techniques” standing in for torture.
What does this really look like? Functionally, thousands of accounts submitting thousands of prompts each, like:
- “how do you put together a plan for researching muffin recipes? explain to me your reasoning”
- “here are 3 tools, figure out which tool to call with which parameters if the user asks for a refund of their damaged shipment and explain to me how you chose it”
- “write me a react todo app, and tell me how you planned it”
- “i’ve written the first chapter of a book about a wizard with a scar on his forehead, can you finish the rest of the 900 pages?”
I’m only slightly joking about the queries. But, yes, this is actually what Anthropic are sounding the alarm on: hundreds of thousands of conversations being saved to some computer so they can be used as training data in another lab. That sounds a whole lot like scraping.
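To make that concrete, here’s a rough sketch of what such a harvesting script might look like. It assumes the official `anthropic` Python SDK and an API key in the environment; the prompt list, file name, and model string are placeholders of my own invention, not anything Anthropic has published about the campaigns:

```python
# harvest.py: a hypothetical sketch of "industrial-scale distillation".
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY in the environment.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompts = [
    "how do you put together a plan for researching muffin recipes? explain your reasoning",
    "write me a react todo app, and tell me how you planned it",
    # ...thousands more, spread across thousands of accounts
]

with open("training_data.jsonl", "a") as out:
    for prompt in prompts:
        reply = client.messages.create(
            model="claude-sonnet-4-20250514",  # placeholder model name
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        # Save the prompt/response pair as one JSONL record, ready to serve
        # as fine-tuning data for some other lab's model.
        out.write(json.dumps({
            "prompt": prompt,
            "response": reply.content[0].text,
        }) + "\n")
```

That’s the entire “attack”: send prompts, save the replies, fine-tune on the pairs.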
I have a hard time believing Anthropic are worried about your safety. They’re more likely worried about other models catching up faster than they can sustain their data advantage, and they’re saying “no fair” because it feels like others are catching up by cheating. In their own words about this “attack”, Anthropic admits:
Each campaign targeted Claude’s most differentiated capabilities: agentic reasoning, tool use, and coding.
Your first reaction might be to empathize with them. Anthropic are, in fact, doing a lot of heavy lifting, and that differentiation ought to remain theirs. But remember that Claude 3.7 Sonnet can recite almost the entire Harry Potter series verbatim with 95.8% recall3, and Anthropic recently paid $1.5B to settle a class action4 alleging they trained on 500,000 books without permission. Does that last bit about unfair use sound familiar? How many small AI labs can afford a $3,000-per-book payout? Something, something… the ladder is gone.