Schrödinger’s Training Clause: How Platforms Like WeTransfer Say They’re Not Using Your Files for AI—Until They Are

Tech companies want your content. Not just to host it, but for their training pipeline—to train models, refine algorithms, and “improve services” in ways that just happen to lead to new commercial AI products. But as public awareness catches up, we’ve entered a new phase: deniable ingestion.

Welcome to the world of the Schrödinger’s training clause—a legal paradox where your data is simultaneously not being used to train AI and fully licensed in case they decide to do so.

The Door That’s Always Open

Let’s take the WeTransfer case. For a brief period this month (in July 2025), their Terms of Service included an unmistakable clause: users granted them rights to use uploaded content to “improve the performance of machine learning models.” That language was direct. It caused backlash. And it disappeared.

Many mea culpas later, their TOS has been scrubbed clean of AI references. I appreciate the sentiment, really I do. But—and there’s always a but–the core license hasn’t changed. It’s still:

– Perpetual

– Worldwide

– Royalty-free

– Transferable

– Sub-licensable

They’ve simply returned the problem clause to its quantum box. No machine learning references. But nothing that stops it either.

 A Clause in Superposition

Platforms like WeTransfer—and others—have figured out the magic words: Don’t say you’re using data to train AI. Don’t say you’re not using it either. Instead, claim a sweeping license to do anything necessary to “develop or improve the service.”

That vague phrasing allows future pivots. It’s not a denial. It’s a delay. And to delay is to deny.

That’s what makes it Schrödinger’s training clause: Your content isn’t being used for AI. Unless it is. And you won’t know until someone leaks it, or a lawsuit makes discovery public.

The Scrape-Then-Scrub Scenario

Let’s reconstruct what could have happened–not saying it did happen, just could have–following the timeline in The Register:

1. Early July 2025: WeTransfer silently updates its Terms of Service to include AI training rights.

2. Users continue uploading sensitive or valuable content.

3. [Somebody’s] AI systems quickly ingest that data under the granted license.

4. Public backlash erupts mid-July.

5. WeTransfer removes the clause—but to my knowledge never revokes the license retroactively or promises to delete what was scraped. In fact, here’s their statement which includes this non-denial denial: “We don’t use machine learning or any form of AI to process content shared via WeTransfer.” OK, that’s nice but that wasn’t the question. And if their TOS was so clear, then why the amendment in the first place?

Here’s the Potential Legal Catch

Even if WeTransfer removed the clause later, any ingestion that occurred during the ‘AI clause window’ is arguably still valid under the terms then in force. As far as I know, they haven’t promised:

– To destroy any trained models

– To purge training data caches

– Or to prevent third-party partners from retaining data accessed lawfully at the time

What Would ‘Undoing’ Scraping Require?

– Audit logs to track what content was ingested and when

– Reversion of any models trained on user data

– Retroactive license revocation and sub-license termination

None of this has been offered that I have seen.

What ‘We Don’t Train on Your Data’ Actually Means

When companies say, “we don’t use your data to train AI,” ask:

– Do you have the technical means to prevent that?

– Is it contractually prohibited?

– Do you prohibit future sublicensing?

– Can I audit or opt out at the file level?

If the answer to those is “no,” then the denial is toothless.

How Creators Can Fight Back

1. Use platforms that require active opt-in for AI training.

2. Encrypt files before uploading.

3. Include counter-language in contracts or submission terms:

   “No content provided may be used, directly or indirectly, to train or fine-tune machine learning or artificial intelligence systems, unless separately and explicitly licensed for that purpose in writing” or something along those lines.

4. Call it out. If a platform uses Schrödinger’s language, name it. The only thing tech companies fear more than litigation is transparency.

What is to Be Done?

The most dangerous clauses aren’t the ones that scream “AI training.” They’re the ones that whisper, “We’re just improving the service.”

If you’re a creative, legal advisor, or rights advocate, remember: the future isn’t being stolen with force. It’s being licensed away in advance, one unchecked checkbox at a time.

And if a platform’s only defense is “we’re not doing that right now”—that’s not a commitment. That’s a pause.

That’s Schrödinger’s training clause.

David Sacks Is Learning That the States Still Matter

For a moment, it looked like the tech world’s powerbrokers had pulled it off. Buried deep in a Republican infrastructure and tax package was a sleeper provision — the so-called AI moratorium — that would have blocked states from passing their own AI laws for up to a decade. It was an audacious move: centralize control over one of the most consequential technologies in history, bypass 50 state legislatures, and hand the reins to a small circle of federal agencies and especially to tech industry insiders.

But then it collapsed.

The Senate voted 99–1 to strike the moratorium. Governors rebelled. Attorneys general sounded the alarm. Artists, parents, workers, and privacy advocates from across the political spectrum said “no.” Even hardline conservatives like Ted Cruz eventually reversed course when it came down to the final vote. The message to Big Tech or the famous “Little Tech” was clear: the states still matter — and America’s tech elite ignore that at their peril.  (“Little Tech” is the latest rhetorical deflection promoted by Big Tech aka propaganda.)

The old Google crowd pushed the moratorium–their fingerprints were obvious. Having gotten fabulously rich off of their two favorites: The DMCA farce and the Section 230 shakedown. But there’s increasing speculation that White House AI Czar and Silicon Valley Viceroy David Sacks, PayPal alum and vocal MAGA-world player, was calling the ball. If true, that makes this defeat even more revealing.

Sacks represents something of a new breed of power-hungry tech-right influencer — part of the emerging “Red Tech” movement that claims to reject woke capitalism and coastal elitism but still wants experts to shape national policy from Silicon Valley, a chapter straight out of Philip Dru: Administrator. Sacks is tied to figures like Peter Thiel, Elon Musk, and a growing network of Trump-aligned venture capitalists. But even that alignment couldn’t save the moratorium.

Why? Because the core problem wasn’t left vs. right. It was top vs. bottom.

In 1964, Ronald Reagan’s classic speech called A Time for Choosing warned about “a little intellectual elite in a far-distant capitol” deciding what’s best for everyone else. That warning still rings true — except now the “capitol” might just be a server farm in Menlo Park or a podcast studio in LA.

The AI moratorium was an attempt to govern by preemption and fiat, not by consent. And the backlash wasn’t partisan. It came from red states and blue ones alike — places where elected leaders still think they have the right to protect their citizens from unregulated surveillance, deepfakes, data scraping, and economic disruption.

So yes, the defeat of the moratorium was a blow to Google’s strategy of soft-power dominance. But it was also a shot across the bow for David Sacks and the would-be masters of tech populism. You can’t have populism without the people.

If Sacks and his cohort want to play a long game in AI policy, they’ll have to do more than drop ideas into the policy laundry of think tank white papers and Beltway briefings. They’ll need to win public trust, respect state sovereignty, and remember that governing by sneaky safe harbors is no substitute for legitimacy.  

The moratorium failed because it presumed America could be governed like a tech startup — from the top, at speed, with no dissent. Turns out the country is still under the impression they have something to say about how they are governed, especially by Big Tech.

The Patchwork They Fear Is Accountability: Why Big AI Wants a Moratorium on State Laws

Why Big Tech’s Push for a Federal AI Moratorium Is Really About Avoiding State Investigations, Liability, and Transparency

As Congress debates the so-called “One Big Beautiful Bill Act,” one of its most explosive provisions has stayed largely below the radar: a 10-year or 5-year or any-year federal moratorium on state and local regulation of artificial intelligence. Supporters frame it as a common sense way to prevent a “patchwork” of conflicting state laws. But the real reason for the moratorium may be more self-serving—and more ominous.

The truth is, the patchwork they fear is not complexity. It’s accountability.

Liability Landmines Beneath the Surface

As has been well-documented by the New York Times and others, generative AI platforms have likely ingested and processed staggering volumes of data that implicate state-level consumer protections. This includes biometric data (like voiceprints and faces), personal communications, educational records, and sensitive metadata—all of which are protected under laws in states like Illinois (BIPA), California (CCPA/CPRA), and Texas.

If these platforms scraped and trained on such data without notice or consent, they are sitting on massive latent liability. Unlike federal laws, which are often narrow or toothless, many state statutes allow private lawsuits and statutory damages. Class action risk is not hypothetical—it is systemic.  It is crucial for policymakers to have a clear understanding of where we are today with respect to the collision between AI and consumer rights, including copyright.  The corrosion of consumer rights by the richest corporations in commercial history is not something that may happen in the future.  Massive violations have  already occurred, are occurring this minute, and will continue to occur into the future at an increasing rate.  

The Quiet Race to Avoid Discovery

State laws don’t just authorize penalties; they open the door to discovery. Once an investigation or civil case proceeds, AI platforms could be forced to disclose exactly what data they trained on, how it was retained, and whether any red flags were ignored.

This mirrors the arc of the social media addiction lawsuits now consolidated in multidistrict litigation. Platforms denied culpability for years—until internal documents showed what they knew and when. The same thing could happen here, but on a far larger scale.

Preemption as Shield and Sword

The proposed AI moratorium isn’t a regulatory timeout. It’s a firewall. By halting enforcement of state AI laws, the moratorium could prevent lawsuits, derail investigations, and shield past conduct from scrutiny.

Even worse, the Senate version conditions broadband infrastructure funding (BEAD) on states agreeing to the moratorium—an unconstitutional act of coercion that trades state police powers for federal dollars. The legal implications are staggering, especially under the anti-commandeering doctrine of Murphy v. NCAA and Printz v. United States.

This Isn’t About Clarity. It’s About Control.

Supporters of the moratorium, including senior federal officials and lobbying arms of Big Tech, claim that a single federal standard is needed to avoid chaos. But the evidence tells a different story.

States are acting precisely because Congress hasn’t. Illinois’ BIPA led to real enforcement. California’s privacy framework has teeth. Dozens of other states are pursuing legislation to respond to harms AI is already causing.

In this light, the moratorium is not a policy solution. It’s a preemptive strike.

Who Gets Hurt?
– Consumers, whose biometric data may have been ingested without consent
– Parents and students, whose educational data may now be part of generative models
– Artists, writers, and journalists, whose copyrighted work has been scraped and reused
– State AGs and legislatures, who lose the ability to investigate and enforce

Google Is an Example of Potential Exposure

Google’s former executive chairman Eric Schmidt has seemed very, very interested in writing the law for AI.  For example, Schmidt worked behind the scenes for the two years at least to establish US artificial intelligence policy under President Biden. Those efforts produced the “Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence“, the longest executive order in history. That EO was signed into effect by President Biden on October 30.  In his own words during an Axios interview with Mike Allen, the Biden AI EO was signed just in time for Mr. Schmidt to present that EO as what Mr. Schmidt calls “bait” to the UK government–which convened a global AI safety conference at Bletchley Park in the UK convened by His Excellency Rishi Sunak (the UK’s tech bro Prime Minister) that just happened to start on November 1, the day after President Biden signed the EO.  And now look at the disaster that the UK AI proposal would be.  

As Mr. Schmidt told Axios:

So far we are on a win, the taste of winning is there.  If you look at the UK event which I was part of, the UK government took the bait, took the ideas, decided to lead, they’re very good at this,  and they came out with very sensible guidelines.  Because the US and UK have worked really well together—there’s a group within the National Security Council here that is particularly good at this, and they got it right, and that produced this EO which is I think is the longest EO in history, that says all aspects of our government are to be organized around this.

Apparently, Mr. Schmidt hasn’t gotten tired of winning.  Of course, President Trump rescinded the Biden AI EO which may explain why we are now talking about a total moratorium on state enforcement which percolated at a very pro-Google shillery called R Street Institute, apparently by one Adam Thierer .  But why might Google be so interested in this idea?

Google may face exponentially acute liability under state laws if it turns out that biometric or behavioral data from platforms like YouTube Kids or Google for Education were ingested into AI training sets. 

These services, marketed to families and schools, collect sensitive information from minors—potentially implicating both federal protections like COPPA and more expansive state statutes. As far back as 2015, Senator Ben Nelson raised alarms about YouTube Kids, calling it “ridiculously porous” in terms of oversight and lack of safeguards. If any of that youth-targeted data has been harvested by generative AI tools, the resulting exposure is not just a regulatory lapse—it’s a landmine. 

The moratorium could be seen as an attempt to preempt the very investigations that might uncover how far that exposure goes.

What is to be Done?

Instead of smuggling this moratorium into a must-pass bill, Congress should strip it out and hold open hearings. If there’s merit to federal preemption, let it be debated on its own. But do not allow one of the most sweeping power grabs in modern tech policy to go unchallenged.

The public deserves better. Our children deserve better.  And the states have every right to defend their people. Because the patchwork they fear isn’t legal confusion.

It’s accountability.