The Data Center Backlash Has Arrived

For years, the political conversation around AI data centers followed a familiar script that was straight out of the Chamber of Commerce. Governors competed to announce the next hyperscale campus. Counties rezoned farmland and conservation land into heavy industrial corridors. Legislatures approved enormous tax abatements with little debate. Utilities promised “economic development.” And local officials were told that if they moved too slowly, some other state would take the project instead. Kind of like because China.

Residents in Crowell, Texas are being forced to live with constant artificial daylight because of the Google AI data center being built right next to them. They report severe 24/7 light pollution that turns night into day.

Why? Because even 10 years ago it was self-evidently true that there was no political opposition to Big Tech, and nobody looked too hard at the reality of data centers in places where we had observable data, like Oregon. If they had, they would have known one thing was absolutely true: data centers were not factories, and they produced higher electric bills and fewer jobs. At least once the sugar high of construction had passed.

And speaking of jobs, in a November 2025 difference-in-differences study, economist Michael J. Hicks examined every data center opened in Texas and found zero statistically significant net employment effect — job gains in the data center sector were fully offset by losses in other industries, yielding an average treatment effect of roughly 46 workers per facility that the author concludes is “correctly interpreted as zero,” less than one-tenth the jobs generated by a single Walmart Supercenter. 
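For readers who have not met the method, here is a minimal sketch of the difference-in-differences idea in its simplest two-group, two-period form. It is not the specification Hicks used, and every number in it is hypothetical, picked only to show how a gain in one sector can be offset elsewhere. The function name `did_estimate` is ours as well.

```python
def did_estimate(treated_before, treated_after, control_before, control_after):
    """Two-group, two-period difference-in-differences.

    Returns the change in the treated group minus the change in the
    comparison group over the same period.
    """
    return (treated_after - treated_before) - (control_after - control_before)


# Hypothetical employment counts (NOT taken from the Hicks study).
# "Treated" counties got a data center; "control" counties did not.
data_center_sector = did_estimate(
    treated_before=200, treated_after=500,
    control_before=200, control_after=210,
)
other_industries = did_estimate(
    treated_before=40_000, treated_after=40_150,
    control_before=40_000, control_after=40_400,
)

# The net effect is the sector gain plus the (negative) effect elsewhere.
net_effect = data_center_sector + other_industries

print(f"data center sector: {data_center_sector:+d}")
print(f"other industries:   {other_industries:+d}")
print(f"net effect:         {net_effect:+d}")
```

In this made-up example the sector gains 290 jobs relative to the comparison counties, other industries lose 250, and the net is 40. A real study would then ask whether a net figure that small can be told apart from zero given its statistical uncertainty.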

Good Jobs First has found that the three states that have measured their data center return on investment lose 52 to 91 cents on the dollar, and in Virginia alone, the sales and use tax exemption for data centers consumed 81.3% of the state’s entire economic development incentives budget in FY 2024.

But it’s not just light pollution. Even though it was patently obvious that the massive data centers that were getting built in Louisiana, Georgia, Utah and Nevada were vastly larger than the already operating data centers in Oregon and were guaranteed to chew up the environment way more, nobody bothered to put 2 and 2 together and check how deep the foundations were compared to local aquifers.

Just because she’s a socialist, doesn’t mean she’s wrong.

That script is now breaking down. I’m shocked, said no one.

As we told the UK Intellectual Property Office:

We call the IPO’s attention to the real-world example of the U.S. State of Oregon, a state that is roughly the geographical size of the UK.  Google built the first Oregon data centre in The Dalles, Oregon in 2006.  Oregon now has 125 of the very data centres that Big Tech will necessarily need to build in the UK to implement AI.  In other words, Oregon was sold much the same story that Big Tech is selling you today.

The rapid growth of Oregon data centres driven by the same tech giants like Amazon, Apple, Google, Oracle, and Meta, has significantly increased Oregon’s demand for electricity. This surge in demand has led to higher power costs, which are often passed on to local rate payers while data centre owners receive tax benefits.  This increase in price foreshadows the market effect of crowding out local rate payers in the rush for electricity to run AI—demand will only increase and increase substantially as we enter what the International Energy Agency has called “the age of electricity”.

Portland General Electric, a local power operator, has faced increasing criticism for raising rates to accommodate the encroaching electrical power needs of these data centers. Local residents argue that they unfairly bear the increased electrical costs while data centers benefit from tax incentives and other advantages granted by government. 

This is particularly galling in that the hydroelectric power in Oregon is largely produced by massive taxpayer-funded hydroelectric and other power projects built long ago. The relatively recent 125 Oregon data centres received significant tax incentives during their construction to be offset by a promise of future jobs.  While there were new temporary jobs created during the construction phase of the data centres, there are relatively few permanent jobs required to operate them long term as one would expect from digitized assets owned by AI platforms.

Of course, the UK has approximately 16 times the population of Oregon.  Given this disparity, it seems plausible that whatever problems that Oregon has with the concentration of data centers, the UK will have those same problems many times over due to the concentration of populations.

This message is getting through to elected officials around the world because citizens are freaking out.

Quietly at first, and then all at once, states and local governments across the country began pushing back. Some are freezing approvals entirely. Others are reconsidering billions in tax incentives. Some are demanding that data centers pay the real cost of the transmission infrastructure they require instead of socializing those costs onto ordinary ratepayers and anyone else who drinks water and breathes air.

This is no longer a niche zoning issue in Northern Virginia or some European bureaucratic nonsense. It is becoming a national political movement that has some real populist overtones worthy of a Brexiteer. According to the National Conference of State Legislatures (NCSL), at least 11 states have introduced statewide moratorium or ban legislation targeting data centers. Meanwhile, Good Jobs First reports more than 60 local moratorium efforts nationwide, and that at least 14 states and scores of localities are failing to disclose the tax abatement revenue losses they are suffering to data centers, even though they have been required to do so under Generally Accepted Accounting Principles (GAAP) since FY 2017.

The reasons vary by region as you’d suspect, but the themes are becoming remarkably consistent, many of which Artist Rights Institute raised in our comments on the US AI Action Plan and the UK IPO AI consultation:

• massive electricity demand;
• water consumption;
• transmission line expansion;
• opaque tax subsidies;
• industrialization of rural communities;
• secrecy surrounding the ultimate hyperscale users;
• and growing fear that ordinary households will subsidize AI infrastructure through higher utility bills.

What is striking is not merely the existence of resistance. It is the geographic breadth of it.

In Texas, lawmakers enacted new large-load interconnection rules while Hill County adopted a temporary construction pause and Agriculture Commissioner Sid Miller publicly called for broader scrutiny of data centers. In Virginia, long considered the unquestioned capital of the data center industry, legislators are openly debating whether to scale back tax exemptions that helped fuel “Data Center Alley.” In Illinois, Governor Pritzker proposed suspending new tax incentives entirely for two years.

Even places that aggressively courted data centers are beginning to hesitate.

In Reno, Nevada, officials adopted a pause on approving new data centers while they reevaluate land-use and infrastructure impacts. Duh. Ya think?

The Reno–Tahoe industrial corridor became a symbol of how quickly hyperscale development can transform an entire region once incentives and transmission infrastructure align. Nevada approved hundreds of millions in projected abatements over the last decade. Now local officials are asking whether the public actually understood the scale of what was being built. If you build it they will come, and they will take a huge dump in your backyard.

The same questions are emerging everywhere else: Who is the real end user? Who pays for the substations and 765-kV transmission lines? What happens if AI demand projections collapse halfway through construction? And why are local taxpayers subsidizing facilities that often employ surprisingly few permanent workers once operational? Well…not really surprisingly, but surprisingly if you believed the Chamber of Commerce hoorah.

The politics are changing because the physical footprint of AI is no longer abstract. The cloud is becoming visible. And you cannot bribe your way out of that one.

Pour some Sucre on them….

Residents now see the cooling towers. They see the transmission corridors. They hear the backup generators. In some communities they are learning about low-frequency industrial noise and infrasound issues that do not show up on ordinary decibel measurements. They see conservation land rezoned into industrial districts almost overnight. They see shell companies quietly assembling land while refusing to identify the ultimate hyperscale beneficiary.

Most importantly, they are beginning to understand that these projects are not temporary construction booms. They are permanent industrialization decisions. A 765-kV transmission corridor is not a pop-up startup. Neither is a hyperscale campus consuming as much electricity as a mid-sized city. And once the infrastructure is built, communities live with the consequences for generations.

The result is a new kind of political coalition that cuts across ideological lines. Environmental advocates, fiscal conservatives, rural landowners, grid-reliability hawks, and anti-subsidy activists are increasingly finding themselves on the same side of the debate. That does not mean the data center industry is stopping. Far from it. Billions are still flowing into AI infrastructure. Utilities continue planning enormous generation and transmission expansions. States remain eager for construction spending and property tax growth.

But the era of automatic approval is ending. The central political question is no longer whether AI infrastructure will expand. It is who bears the cost.

And there is another revealing development occurring at the federal level. What does it tell you that President Trump reportedly pulled back an executive-order framework that would have required certain AI labs to obtain government cybersecurity approval or clearance before launching advanced systems?

Whatever one thinks of the policy itself, the episode suggests intense behind-the-scenes conflict inside the administration and the AI industry over whether any meaningful federal guardrails should exist at all. Sources around Washington describe the push as a last-ditch effort by what critics derisively call the “Zombie AI Viceroy” David Sacks, the lobbyist who seemingly cannot be fired because the entire AI infrastructure race has become too politically and financially entangled. We will see whether federal safeguards reappear in another form. But at this moment, the practical reality is striking: the only governments actively imposing meaningful friction on AI infrastructure expansion are states, counties, and local municipalities.

State and Local Data Center Restriction / Tax Rollback Tracker (May 2026)

Alabama — Considering rules requiring data centers to bear infrastructure/grid costs

Arizona — Chandler pause; grid-cost proposals under consideration

California — Bills addressing ratepayer and environmental protections

Colorado — Denver moratorium; Larimer County pause; Logan County restrictions

Connecticut — Morris moratorium; Groton zoning restrictions

Florida — Enacted protections for local zoning authority and ratepayer safeguards

Georgia — HB 1059 introduced forbidding local permitting until December 2028; local pauses; estimated $2.5 billion per year in tax abatement revenue losses (highest in nation)

Illinois — Governor called for two-year pause of data center tax incentives

Indiana — Considering restructuring of tax incentive revenue sharing; fails to disclose data center costs despite ranking fifth-best in subsidy transparency nationally

Louisiana — New Orleans temporary moratorium

Maine — LD 307 moratorium on data centers over 20 MW (vetoed by Governor); local moratoria

Maryland — Proposed statewide approval restrictions (SB 931 / HB 1369)

Massachusetts — Lowell moratorium

Michigan — State moratorium proposals; Ypsilanti pause

Minnesota — Removed electricity sales tax exemption; created new annual energy-use fee; Minneapolis moratorium discussions

Nevada — Reno approval pause; growing tax-abatement controversy; Controller issues exemplary annual report of local revenue losses from state-awarded abatements

New Hampshire — HB 1265 one-year moratorium on data center construction (failed)

New Jersey — Millville ban/restrictions; prevailing wage requirement for data center construction (enacted February 2026)

New York — AB 10141 / SB 9144 statewide moratorium and Public Utility Commission rulemaking (introduced); Athens/Dryden/Mount Morris local restrictions

North Carolina — Chatham County moratorium; additional local reviews

North Dakota — Oliver County temporary moratorium activity

Ohio — Numerous local pauses; growing subsidy backlash

Oklahoma — SB 1488 moratorium until November 2029 (introduced); incentive rollback proposals

Oregon — Affordability/reliability proposals tied to large-load users

Pennsylvania — Moratorium discussions underway (HB 1370 introduced per NCSL)

South Carolina — SB 567 proposal to restrict approvals pending oversight framework (introduced)

South Dakota — SB 232 one-year statewide moratorium (introduced); local-control protections enacted

Texas — Large-load legislation; local moratoria and review fights; estimated $1 billion or more per year in tax abatement revenue losses; Hicks (2025) causal study found zero net job growth from data centers statewide

Vermont — S 205 proposed moratorium through 2030 with impact study requirement (introduced)

Virginia — HB 1515 prohibiting new approvals until interconnection requests fulfilled or July 2028 (continued); major debate over scaling back tax exemptions; estimated $1.94 billion per year in revenue losses; data center exemptions consumed 81.3% of state’s entire incentive budget in FY 2024

Washington — Restrictions tied to emissions-credit eligibility

Wisconsin — Moratorium proposal (status unverified; not listed in NCSL tracker)

The important point is not that every proposal will pass, which it may or may not. The important point is that resistance is no longer isolated. The backlash has become national. And resistance is not futile.

The AI Subsidy Is Over. Or Maybe It’s Just Beginning.


The current narrative says the “AI subsidy era” is ending. Prices are rising. Rate limits are tightening. Ads are creeping in. Enterprise tiers are replacing all-you-can-eat plans. In short: users will finally start paying what AI actually costs.

Hayden Field, writing in The Verge, tells us:

Earlier this month, millions of OpenClaw users woke up to a sweeping mandate: The viral AI agent tool, which this year took the worldwide tech industry by storm, had been severely restricted by Anthropic.

Anthropic, like other leading AI labs, was under immense pressure to lessen the strain on its systems and start turning a profit. So if the users wanted its Claude AI to power their popular agents, they’d have to start paying handsomely for the privilege.

“Our subscriptions weren’t built for the usage patterns of these third-party tools,” wrote Boris Cherny, head of Claude Code, on X. “We want to be intentional in managing our growth to continue to serve our customers sustainably long-term. This change is a step toward that.”

The announcement was a sign of the times. Investors have poured hundreds of billions of dollars into companies like OpenAI and Anthropic to help them scale and build out their compute. Now, they’re expecting returns. After years of offering cheap or totally free access to advanced AI systems, the bill is starting to come due — and downstream, users are beginning to feel the pinch.

That’s true but it’s leaving out a lot.

Yes, the consumer subsidy—venture-backed underpricing of inference—may be winding down. But the broader subsidy system that made AI possible isn’t going away. It’s expanding. Just ask President Trump.

To understand why, you have to go back to the last great digital disruption.

From P2P to Streaming to AI

Start with Napster.

P2P didn’t just enable infringement. It rewired expectations. It taught users that all music should be available, instantly, for free. Why? Because there was gold in them long tails. Forget about supply and demand, we had infinite supply so demand would take care of itself.

It’s for sale

No artist, songwriter, label or publisher in the history of recorded music was compensated for this shift. They were its involuntary financiers. Their catalogs created the demand, the network effects, and the user adoption that built the early internet music economy.

Streaming—think Spotify—didn’t reverse that logic. It formalized it. (Remember, streaming saved us from piracy and we should all be so grateful.) It actually transferred that involuntary financing from the p2p balance sheet to Spotify’s, and took it public.


Streaming platforms accepted a new baseline: the entire world’s repertoire must be available at all times, regardless of demand. That is a costly and structurally inefficient mandate, but it became the price of competing in a market shaped by P2P expectations. Licensing systems like the Mechanical Licensing Collective (MLC) were built to support that scale, but the underlying premise remained: total availability first, compensation second.

AI changes the game again.

AI Doesn’t Just Distribute Works. It Consumes Them.

P2P distributed music. Streaming licensed it. AI models ingest it.

That’s the critical difference.

Generative AI systems are trained on massive corpora that include copyrighted works, performances, and what we might call personhood signals—voice, style, tone, phrasing, and creative identity. These inputs are not just indexed or streamed. They are transmogrified (see what I did there) into model weights that can generate new outputs that compete with, mimic, or substitute for the originals.

So the role of the artist evolves:

    •    In P2P: unpaid distributor subsidy
    •    In streaming: underpaid inventory supplier
    •    In AI: uncompensated production input

That is not a marginal shift. It is a structural one.

The Real Subsidy Stack

When people say the “AI subsidy era is over,” they are usually talking about one thing: cheap access to compute.

But AI has always depended on a multi-layered subsidy stack:

    •    Creators – supply training data, cultural value, and identity signals without compensation or consent
    •    Users – supply prompts, feedback, and behavioral data that improve the models
    •    Communities – absorb land use, water consumption, and environmental costs
    •    Ratepayers – fund grid upgrades, transmission, and reliability for data center demand
    •    Venture capital – underwrites early losses to drive adoption and scale

The shift we are seeing now is not the end of subsidies. It’s a reallocation. Or as a cynic might say, it’s rearranging the deck chairs to hide the lifeboats.

Users may start paying more. But creators still aren’t being paid for training. Communities are still being asked to host infrastructure. And the physical footprint of AI is accelerating. Just ask President Trump.

The World Turned Upside Down

What makes this moment different is the scale of the buildout.

We are not just talking about apps anymore. We are talking about an industrial transformation:
    •    New data centers the size of small cities
    •    High-voltage transmission lines
    •    Water-intensive cooling systems
    •    Semiconductor supply chains
    •    And even discussions of new nuclear capacity to support compute demand

This is infrastructure on the scale of a national project, or more like national mobilization. But it is being built on top of a premise that has not been resolved: the uncompensated use of human creative work as training input.

That is the inversion: We are building power plants for systems that depend on not paying the people whose work makes those systems possible.

A Better Frame

The cleanest way to understand this is as a continuum:

P2P turned infringement into consumer expectation.
Streaming turned that expectation into platform infrastructure.
AI turns uncompensated authorship into industrial feedstock.

Or more bluntly:

The AI free ride is not ending. It is being re-invoiced. Users may now see higher prices. But the deeper subsidies (creative, environmental, and civic) remain off the books.

What Comes Next

If the industry is serious about “pricing AI correctly,” it cannot stop at compute.

It has to address:
    •    Compensation frameworks for training data
    •    Attribution and provenance standards
    •    Licensing models for style and voice
    •    Infrastructure cost allocation (who pays for the grid?)
    •    Governance of large-scale compute deployment

Otherwise, we are not exiting the subsidy era. We are doing what Big Tech lives for.

We are scaling it.

And this time, instead of a few server racks in a dorm room, we are building a global energy system around it.

Same Popcorn, Different Wrapper

In ancient Rome, Marcus Licinius Crassus was the wealthiest man alive. And he had a system. He owned real estate and he also owned the fire brigades. When a house caught fire, Crassus sent his men to the scene. But they didn’t rush in with water.

First, he made the owner an offer. Sell me your house for pennies. The house that is literally on fire. Agree to the price, and the fire would be put out. Refuse… and his fire brigade would simply watch it burn.

Some even whispered that Crassus’s men set fires themselves, just to create new ‘opportunities.’ Ya think?

It was ruthless. Ingenious. And it gave him his own kind of safe harbor. If you controlled the fire brigade… there was no liability. No regulator. No competition. Just profit. Because Crassus set the valuation.

Now—fast forward two thousand years. AI hyperscalers haven’t just rediscovered Crassus’s model. They’ve reimagined it.

The Valuation is the Thing

There is a moment in every cycle when the story stops even pretending to line up with the business. That moment usually shows up quietly at first, almost as a joke, and then all at once everyone realizes the joke is being taken seriously.

We may be there again.

Allbirds, a company that built its brand selling wool sneakers to a very specific kind of customer, is now pivoting into AI compute infrastructure. Not adjacent. Not evolutionary. Just a clean jump into GPUs and datacenters. The rebrand writes itself. NewBird AI.

If that sounds absurd, it should. But it should also feel familiar. The mistake is to focus on the technology. The technology is always real. The internet was real. AI is real. The mistake is to assume the valuation attached to that technology has anything to do with the underlying business. That part is almost always where things go sideways. The people. The ones who set the fires.

Fire Good, Valuations Bad

Look at the comps. Spotify sits around a one-hundred-billion-dollar market cap. Universal Music Group is closer to thirty-eight. Warner Music Group is around fifteen. The companies that own the music, the actual asset, the thing that endures, are worth a fraction of the company that packages and distributes it and will one day be replaced, just like streaming replaced CDs.

That is not a story about innovation. It is a story about what the market chooses to value.
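If you want the arithmetic spelled out, here is a quick sketch using only the rough market caps quoted above. They are approximations, not live quotes, and the variable names are ours.

```python
# Approximate market caps in billions of US dollars, as quoted in the text.
market_caps = {
    "Spotify": 100,
    "Universal Music Group": 38,
    "Warner Music Group": 15,
}

# Combine the two rights owners and compare them with the distributor.
owners_combined = (
    market_caps["Universal Music Group"] + market_caps["Warner Music Group"]
)
ratio = owners_combined / market_caps["Spotify"]

print(f"UMG + WMG combined: ${owners_combined}B")
print(f"Share of Spotify's market cap: {ratio:.0%}")
```

On those figures, two of the largest owners of recorded music add up to about $53 billion, or roughly 53% of the distributor's market cap.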

Once you see that, the Allbirds pivot stops looking irrational. It starts looking like one of the only logical moves available. If the market assigns higher multiples to infrastructure, to platforms, to anything that can be described as scalable, then the rational response is to become that thing. Not because the company has any particular advantage in doing so, but because the category itself carries the valuation.

We have seen this movie before. In the late nineties, companies selling ordinary products wrapped themselves in the language of the internet. They were not retailers. They were platforms. They were not losing money. Oh no, no, no. They were scaling. They could IPO with four quarters of top line revenue. The technology stack became the story. The story became the valuation. The underlying business became almost incidental. Larry Ellison’s famous spoof Internet company, HeyIdiot.com, was a “cash portal” that sold only one product: shares of HeyIdiot.com stock at incrementally higher prices to even greater fools.

The systems built around those businesses grew increasingly complex. Layers of software justified layers of capital. At the same time, the basic economics often made less and less sense. Somewhere outside the pitch decks, the vulnerabilities were obvious. The infrastructure was fragile. The incentives were misaligned. But the narrative carried everything forward until it didn’t.

This cycle has its own vocabulary. Instead of platforms and portals, we have models and compute. Instead of e-commerce infrastructure, we have GPU clusters. The words are different. The behavior is not.

But somebody’s AI is not in on the joke…

“Part of their exploration into new ideas within the tech industry?” Say what? Somebody’s not in on the joke.

The pattern is simple. Take something real and wrap it in something that can be described as infinite, like you know, shelf space for the long tail. The wrapper gets the multiple. The underlying asset becomes an input cost. Over time, the market forgets the difference. Particularly with help from Mary Meeker.

That is how you end up with a distributor valued above the content it distributes. It is how you end up with a sneaker company presenting itself as a datacenter operator. It is how each cycle convinces itself that it has broken from the last one when it is mostly repeating it with better branding.

Same popcorn. Different wrapper.

None of this requires believing that AI is not important. It is. None of this requires believing that compute does not matter. It does. The question is not whether the technology is real. The question is why the valuation attached to it keeps drifting so far from the businesses claiming it.

There is a point where companies stop explaining how they make money and start explaining what category they belong to. That is usually the point where the market has shifted from pricing businesses to pricing narratives.

When that happens, the incentives become clear. You do not need to build the best company. You need to be seen as the right kind of company. You need the HeyIdiot wrapper.

So no, this is not about the macro environment. It is not about timing the cycle or reading the tea leaves of innovation.

It is simpler than that.

It is the valuation, stupid.

And yes, it is still stupid. But as Crassus might tell you, the house is also still on fire, mofo. What do you want to do about it?

The Constitutional Shadow of the White House AI Framework: Law Without Law

One of the most important things about the White House AI framework released last week is what it is not.

It is not an executive order.

That may sound like a technical distinction, but it is doing an enormous amount of work here. Because by avoiding the form of an executive order, the framework avoids something even more important: Judicial review.

An executive order that attempted to declare AI training on copyrighted works lawful—or to constrain Congress from acting—would immediately invite challenge in the very judicial branch the framework also seeks to influence. Oh, that would be fun.

It would raise Administrative Procedure Act questions. It would trigger separation-of-powers scrutiny. It would likely be litigated within days.

This framework does none of that and is not susceptible to judicial challenge.

Instead, it achieves much of the same practical effect—shaping legal outcomes, constraining policy space, and signaling preferred doctrine—without creating a justiciable action. It is, in effect, law without law, and outcomes by positioning. Silicon Valley’s favorite.

Takings by Policy, Not Statute

Start with the most obvious constitutional issue: the Takings Clause of the Fifth Amendment to the U.S. Constitution, which states that “private property [cannot] be taken for public use, without just compensation.”

Copyright is a form of property. That is not controversial. It is a statutory property right grounded in the Constitution’s Intellectual Property Clause, and it carries exclusive rights that have long been understood as economically valuable.

Now consider what the White House framework does.

It declares AI training (the mass, indiscriminate ingestion of copyrighted works) lawful. It does so without requiring compensation. And it does so in a context where the resulting systems can substitute for, or diminish the market for, the original works.

If that official policy position of the Executive Branch were enacted into law, it would raise a straightforward question:

Has the government authorized the use of private property for public and commercial purposes without compensation? Or more directly, has the Executive Branch just announced that it will not prosecute that indiscriminate ingestion for any reason? Can we expect to see amicus briefs from the Solicitor General opposing copyright owners pursuing their rights in court?

That is sounding a lot like a taking.

But because the framework is not law, it avoids the moment where that question must be answered. It does not extinguish rights formally. It renders them economically hollow in practice, while leaving the formal structure intact.

That is the key move: functional elimination without formal abolition.

Ex Post Facto in Everything but Name

The framework also raises a second, less discussed issue: the logic of ex post facto lawmaking.

The Ex Post Facto Clause technically applies to criminal law. But the underlying principle is broader: the government should not change the legal consequences of past conduct to benefit favored actors or disadvantage others. Of course, copyright owners raising this argument will have the Spotify retroactive safe harbor in Title I of the Music Modernization Act thrown in their faces as rank hypocrisy, which they would richly deserve, although as any 10-year-old can tell you, two wrongs don’t make a right, at least in theory.

Here, the timeline matters.

  • Massive datasets have already been scraped.
  • Models have already been trained.
  • The conduct that enabled this may, in many instances, have been legally questionable—and in cases of willful infringement, potentially criminal under federal copyright law. Or if you listen to me, the largest case of criminal copyright infringement in history.

Now comes the policy, years after the fact and in the face of over 150 AI lawsuits, all based on copyright infringement to one degree or another:

Training is lawful.

That looks less like interpretation and more like retroactive validation.

Even if framed as civil doctrine, the effect is similar to retroactive decriminalization of conduct tied to vested rights. It sends a clear message: conduct that may have been unlawful when undertaken will be treated as lawful because it is now economically indispensable to the broligarchs.

That is not how the rule of law is supposed to work.

Separation of Powers by Suggestion

The framework’s treatment of Congress is equally striking. It does not say Congress lacks authority to legislate. The President cannot say that. Well…he can, but there’s no foundation for the statement. The Constitution is clear: Congress defines copyright.

Instead, the framework says Congress should not act in ways that would affect judicial resolution of the training question.

That is an unusual formulation. Congress legislates in areas under litigation all the time. Indeed, it is often expected to clarify statutory ambiguity.

What the framework is doing is more subtle: It is attempting to shape the legislative field without formally constraining it.

And it pairs that with an implicit second message:

  • Legislation that restricts training or mandates licensing is inconsistent with executive policy.
  • Such legislation is therefore unlikely to be signed by the President. So why bring it?

That is a veto signal—delivered without the political cost of an actual veto.

Judicial Signaling Without Command

The same dynamic applies to the courts.

The framework claims to “defer” to the judiciary. But it simultaneously declares a preferred outcome: training is lawful.

That is not deference. That is signaling.

Judges are, of course, independent. But they do not operate in a vacuum. They are aware of executive priorities, legislative inaction, and market realities. When all three align around a single policy direction, it creates an interpretive gravitational force that is difficult to ignore.

And the signal travels further.

To lawyers.
To regulators.
To anyone whose career may intersect with executive appointment.

It normalizes what counts as a “reasonable” position within the current policy environment.

Prosecutorial Silence as Policy

There is also a more immediate, practical consequence.

While the framework does not have the force of law, it functions as an indirect directive to the Department of Justice. By declaring training lawful as a matter of policy, it signals that federal enforcement resources should not be used to pursue cases premised on the opposite view.

In effect, it tells prosecutors:

Do not spend time considering criminal enforcement for large-scale copyright violations tied to AI training. Do not spend time considering antitrust enforcement against the broligarchs. In fact, don’t spend any time prosecuting anyone regarding AI.

That matters because willful copyright infringement at scale can, in certain circumstances, give rise to criminal liability. I mean, if that doesn’t, what does? Yet under this framework, even the possibility of such enforcement is quietly set aside.

This is not formal immunity. But in practice, it can look very similar.

Why “Not an Executive Order” Matters

If this were an executive order, all of these issues would be front and center:

  • Is this a taking?
  • Does it exceed executive authority?
  • Does it interfere with Congress?
  • Does it interfere with the Judiciary?

Because it is not an EO, these important issues remain in the background, present but untested.

That is the genius, and the danger, of the approach.

It allows the executive branch to:

  • Shape doctrine
  • Influence courts
  • Constrain Congress
  • Guide enforcement priorities
  • Normalize contested conduct

—all without triggering the mechanisms designed to check it.

The Constitutional Shadow

The AI framework does not violate the Constitution in any formal sense.

It does something more complicated.

It operates in the constitutional shadow—where policy can reshape rights, incentives, and expectations without ever crossing the line that would allow a court to say no.

But shadows matter.

Because by the time the law catches up—if it ever does—the world the Constitution was meant to govern and protect may already have changed.

Sony’s AI Music Attribution Tool: What It Actually Does (and What It Doesn’t)

As generative music systems like Suno and Udio move into the center of copyright debates, one question keeps coming up: Can we actually tell which songs influenced an AI-generated track? And then can we use that determination in a host of other processes like royalty payments?

Recently a number of people have pointed to research from Sony AI as evidence that the answer might be yes. Sony has publicly discussed work on tools designed to analyze the relationship between training data and AI-generated music outputs.

But the reality is a little more nuanced. Sony’s work is interesting and potentially important—but it is often misunderstood. What Sony has described is not a magic detector that can listen to a generated song and instantly reveal every recording the model trained on.

Instead, Sony is describing something more modest—and in some ways more useful.

Let’s unpack what the technology appears to do right now.

Two Problems Sony Is Trying to Solve

Sony AI has publicly discussed research in two related areas.

The first is training-data attribution. This means trying to estimate which recordings in a model’s training dataset influenced a generated output.

The second is musical similarity or version matching. This involves detecting when two pieces of music share meaningful musical material even if they are not exact copies of each other.

Sony has framed both efforts as research directions rather than a finished commercial product. In other words, this is still a developing technical approach, not a turnkey system that can produce definitive copyright answers.

Training Data Attribution in Plain English

The most relevant Sony work is a research project titled Large-Scale Training Data Attribution for Music Generative Models via Unlearning.

That title sounds intimidating, but the basic idea is fairly intuitive. The title also suggests the project is part of the broader academic discipline of machine unlearning.

The system does not operate like Shazam. It does not simply listen to an AI-generated song and say:

“This track was trained on Song X, Song Y, and Song Z.”

Instead, the approach works more like this.

Imagine you already know—or at least suspect—which recordings were used to train the model. You have a candidate set of training tracks.

The system then asks:

Among these training recordings, which ones seem most likely to have influenced this generated output?

In other words, the system ranks influence among known candidates.

The research approach borrows from an area of machine learning called machine unlearning, which studies how particular training examples affect a model’s behavior. In simplified terms, researchers can test how the model behaves when certain training examples are removed or adjusted. If the output changes meaningfully, that suggests those examples had measurable influence.

The important point is that this is an influence-ranking tool, not a forensic detector.

It tries to answer:

“Which of these known training tracks mattered most?”

Not:

“Tell me every song the model was trained on.”
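The remove-and-compare idea described above can be sketched in a few lines. To be clear, this is not Sony’s method or code. It is a toy stand-in, assuming a one-variable least-squares “model” and made-up track names and numbers, that refits the model with each candidate track removed and ranks the candidates by how far the output for one query moves.

```python
from typing import List, Tuple

# Hypothetical record: (track name, input feature, target feature).
Track = Tuple[str, float, float]


def fit_line(points: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = cov_xy / var_x if var_x else 0.0
    return slope, mean_y - slope * mean_x


def predict(model: Tuple[float, float], x: float) -> float:
    slope, intercept = model
    return slope * x + intercept


def rank_influence(candidates: List[Track], query_x: float) -> List[Tuple[str, float]]:
    """Rank KNOWN candidate tracks by how much removing each one shifts the output."""
    baseline = predict(fit_line([(x, y) for _, x, y in candidates]), query_x)
    scores = []
    for i, (name, _, _) in enumerate(candidates):
        # Refit the toy model without candidate i and compare outputs.
        reduced = [(x, y) for j, (_, x, y) in enumerate(candidates) if j != i]
        shifted = predict(fit_line(reduced), query_x)
        scores.append((name, abs(baseline - shifted)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up candidate set; track_d is deliberately an outlier near the query.
candidates = [
    ("track_a", 0.0, 0.0),
    ("track_b", 1.0, 1.0),
    ("track_c", 2.0, 2.0),
    ("track_d", 3.0, 10.0),
]
ranking = rank_influence(candidates, query_x=3.0)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy run the outlier track ranks first, because removing it changes the output the most. Note what the function takes as input: a list of candidates. Hand it nothing and it can rank nothing, which is the same limitation the real research faces when the training data is not known.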

Sony’s Other Idea: Smarter Music Comparison

Sony has also described work on musical similarity detection.

Traditional audio fingerprinting systems—like those used by Shazam or Audible Magic—are very good at identifying identical recordings. If you upload the same song or a slightly altered version, the system can match it.

But generative AI raises a different problem. An AI output might resemble a song musically without copying the recording itself.

Sony’s research tries to detect those kinds of relationships.

For example, a system might notice that two tracks share melodic fragments, rhythmic patterns, harmonic progressions, or musical phrases even if the arrangement, production, or instrumentation is different.

In plain English, this kind of tool tries to answer a different question:

“Are these two pieces of music related in substance?”

Not:

“Are they the exact same recording?”
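To make the difference between those two questions concrete, here is a minimal sketch. It is an illustration only, not Sony’s technique: it assumes melodies are given as lists of MIDI pitch numbers (all hypothetical) and compares the steps between notes rather than the notes themselves, so a tune moved to a different key still matches even though an exact comparison fails.

```python
from typing import List, Set, Tuple


def interval_ngrams(pitches: List[int], n: int = 3) -> Set[Tuple[int, ...]]:
    """Turn a melody into the set of n-step interval patterns it contains."""
    # Intervals (differences between neighboring notes) ignore the key.
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return {tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)}


def melodic_similarity(a: List[int], b: List[int], n: int = 3) -> float:
    """Jaccard overlap of interval patterns: 0.0 (nothing shared) to 1.0 (all shared)."""
    grams_a, grams_b = interval_ngrams(a, n), interval_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical melodies as MIDI pitch numbers.
original = [60, 62, 64, 65, 67, 65, 64, 62]
transposed = [p + 5 for p in original]           # same tune, different key
unrelated = [60, 60, 67, 67, 69, 69, 67, 60]     # a different tune

same_notes = original == transposed              # the "exact same" question
related = melodic_similarity(original, transposed)
not_related = melodic_similarity(original, unrelated)
print(same_notes, related, not_related)
```

A real system would work from audio rather than tidy note lists and would have to cope with changes in rhythm, arrangement, and production, so treat this as the shape of the question rather than an answer to it.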

The Big Limitation: You Still Need the Training Dataset

Here’s the key limitation that often gets overlooked.

Sony’s attribution approach appears to depend on having access to the candidate training dataset.

The system works by comparing a generated output against recordings that are already known or suspected to have been used during training. It estimates influence among those candidates.

That means the system answers the question:

“Which of these training tracks influenced the output?”

But it does not answer the question:

“What unknown recordings were used to train this model?”

If the training corpus is hidden or undisclosed, the attribution system has nothing to test against.

This makes the technology conceptually similar to many machine-learning research experiments, which measure influence using known datasets. Researchers can test influence among known training examples, but they cannot reconstruct an unknown dataset from outputs alone.

What This Could Look Like in the Real World

If the training corpus were known, a practical workflow might look like this.

First, the recordings in the training corpus would be identified. Audio fingerprinting systems could match those recordings to commercial releases.

That step answers the question:

What copyrighted recordings appear in the training data?

Then an attribution tool like the one Sony describes could be used to analyze generated outputs and estimate which of those known recordings appear to have influenced them.

This would not prove copying in every case. But it could dramatically narrow the analysis—from millions of possible influences to a smaller list of likely candidates.
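The two-step workflow above might be wired together roughly as follows. This is a hedged sketch under loud assumptions: the file names, catalog, and influence scores are invented, and an exact-content hash stands in for an acoustic fingerprint (a real fingerprinting service tolerates re-encoding and edits, which a hash does not).

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(audio_bytes: bytes) -> str:
    """Stand-in for an acoustic fingerprint: an exact-content hash (assumption)."""
    return hashlib.sha256(audio_bytes).hexdigest()


def identify_corpus(corpus: Dict[str, bytes], catalog: Dict[str, str]) -> Dict[str, str]:
    """Step 1: map files in a known training corpus to commercial releases."""
    matches = {}
    for file_name, audio in corpus.items():
        release = catalog.get(fingerprint(audio))
        if release is not None:
            matches[file_name] = release
    return matches


def shortlist(influence: Dict[str, float], matches: Dict[str, str],
              top_k: int) -> List[Tuple[str, float]]:
    """Step 2: keep the top_k identified releases by estimated influence."""
    scored = [(matches[f], s) for f, s in influence.items() if f in matches]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical corpus, catalog, and attribution scores for one generated output.
corpus = {"a.wav": b"AAA", "b.wav": b"BBB", "c.wav": b"CCC"}
catalog = {fingerprint(b"AAA"): "Release A", fingerprint(b"CCC"): "Release C"}
influence = {"a.wav": 0.2, "b.wav": 0.9, "c.wav": 0.7}

matches = identify_corpus(corpus, catalog)
likely = shortlist(influence, matches, top_k=1)
print(matches)
print(likely)
```

The file that matches no known release drops out at step 1 no matter how influential it looks, so the shortlist is only as good as the corpus disclosure and the catalog behind it.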

What Sony Has Not Claimed

Sony’s public statements do not suggest that the attribution problem is solved.

Sony has not announced a system that automatically calculates track-by-track royalty payments for AI-generated songs. Nor has it described a tool that conclusively proves copyright copying from an AI output alone.

Instead, the work is framed as research aimed at improving transparency and accountability in generative music systems.

Why Labels Might Still Be Interested

Even with these limitations, the idea could be attractive to rights holders.

If training datasets were known, attribution tools could theoretically support new ways of analyzing how music catalogs interact with generative AI systems.

For example, such tools might help support:

  • royalty allocation models
  • influence-weighted compensation frameworks
  • catalog analytics
  • AI audit trails showing how repertoire contributes to model behavior

In other words, the technology could potentially become a measurement tool for how music catalogs influence generative systems.

What Sony Did and Did Not Do (Yet)

Sony’s work does not magically reveal every song an AI model trained on. And it does not eliminate the need to know what is in the training dataset.

Instead, its value appears to lie after the training data is known.

Once you have a candidate training corpus, tools like the ones Sony describes may help analyze which recordings influenced particular outputs.

That makes the technology best understood as a post-disclosure attribution layer, not a substitute for knowing what recordings were used in training in the first place.

Back to Commandeering Again: David Sacks, the AI Moratorium, and the Executive Order Courts Will Hate

Why Silicon Valley’s in-network defenses can’t paper over federalism limits.

The old line attributed to music lawyer Allen Grubman is, “No conflict, no interest.” Conflicts are part of the music business. But the AI moratorium that David Sacks is pushing onto President Trump (the idea that Washington should freeze or preempt state AI protections in the absence of federal AI policy) takes that logic to a different altitude. It asks the public to accept not just conflicts of interest, but centralized control of AI governance built around the financial interests of a small advisory circle, including Mr. Sacks himself.

When the New York Times published its reporting on Sacks’s hundreds of AI investments and his role in shaping federal AI and chip policy, the reaction from Silicon Valley was immediate and predictable. What’s most notable is who didn’t show up. No broad political coalition. No bipartisan defense. Just a tight cluster of VC and AI-industry figures from the AI crypto–tech nexus, praising their friend Mr. Sacks and attacking the story.

And the pattern was unmistakable: a series of non-denial denials from people who it is fair to say are massively conflicted themselves.

No one said the Times lied.

No one refuted the documented conflicts.

Instead, Sacks’s tech-bro defenders attacked tone and implied bias, and suggested the article merely arranged “negative truths” in an unflattering narrative (although the Times did not even bring up Mr. Sacks’s moratorium scheme).

And you know who has yet to defend Mr. Sacks? Donald J. Trump. Which tells you all you need to know.

The Rumored AI Executive Order and Federal Lawsuits Against States

Behind the spectacle sits the most consequential part of the story: a rumored executive order that would direct the U.S. Department of Justice to sue states whose laws “interfere with AI development.” Reuters reports that “U.S. President Donald Trump is considering an executive order that would seek to preempt state laws on artificial intelligence through lawsuits and by withholding federal funding, according to a draft of the order seen by Reuters….”

That is not standard economic policy. That is not innovation strategy. That is commandeering — the same old unconstitutional move in shiny AI packaging that we’ve discussed many times starting with the One Big Beautiful Bill Act catastrophe.

The Supreme Court has been clear on this, as in Printz v. United States (521 U.S. 898, 925 (1997)): “[O]pinions of ours have made clear that the Federal Government may not compel the States to implement, by legislation or executive action, federal regulatory programs.”

Crucially, the Printz Court teaches us what I think is the key fact. Federal policy for all the United States is to be made by the legislative process in regular order subject to a vote of the people’s representatives, or by executive branch agencies that are led by Senate-confirmed officers of the United States appointed by the President and subject to public scrutiny under the Administrative Procedure Act. Period.

The federal government then implements its own policies directly. It cannot order states to implement federal policy, including in the negative by prohibiting states from exercising their Constitutional powers in the absence of federal policy. The Supreme Court crystallized this issue in a recent Congressional commandeering case, Murphy v. NCAA (138 S. Ct. 1461 (2018)), where the Court held “[t]he distinction between compelling a State to enact legislation and prohibiting a State from enacting new laws is an empty one. The basic principle—that Congress cannot issue direct orders to state legislatures—applies in either event.” Read together, Printz and Murphy extend this core principle of federalism to executive orders.

The “presumption against preemption” is a canon of statutory interpretation that the Supreme Court has repeatedly held to be a foundational principle of American federalism. It also has the benefit of common sense. The canon reflects the deep Constitutional understanding that, unless Congress clearly says otherwise—which implies Congress has spoken—states retain their traditional police powers over matters such as the health, safety, land use, consumer protection, labor, and property rights of their citizens. Courts begin with the assumption that federal law does not displace state law, especially in areas the states have regulated for generations, all of which are implicated in the AI “moratorium”.

The Supreme Court has repeatedly affirmed this principle. When Congress legislates in fields historically occupied by the states, courts require a clear and manifest purpose to preempt state authority. Ambiguous statutory language is interpreted against preemption. This is not a policy preference—it is a rule of interpretation rooted in constitutional structure and respect for state sovereignty that goes back to the Founders.

The presumption is strongest where federal action would displace general state laws rather than conflict with a specific federal command. Consumer protection statutes, zoning and land-use controls, tort law, data privacy, and child-safety laws fall squarely within this protected zone. Federal silence is not enough; nor is agency guidance or executive preference.

In practice, the presumption against preemption forces Congress to own the consequences of preemption. If lawmakers intend to strip states of enforcement authority, they must do so plainly and take political responsibility for that choice. This doctrine serves as a crucial brake on back-door federalization, preventing hidden preemption in technical provisions and preserving the ability of states to respond to emerging harms when federal action lags or stalls. Like in A.I.

Applied to an A.I. moratorium, the presumption against preemption cuts sharply against federal action. A moratorium that blocks states from legislating even where Congress has chosen not to act flips federalism on its head—turning federal inaction into total regulatory paralysis, precisely what the presumption against preemption forbids.

As the Congressional Research Service primer on preemption concludes:

The Constitution’s Supremacy Clause provides that federal law is “the supreme Law of the Land” notwithstanding any state law to the contrary. This language is the foundation for the doctrine of federal preemption, according to which federal law supersedes conflicting state laws. The Supreme Court has identified two general ways in which federal law can preempt state law. First, federal law can expressly preempt state law when a federal statute or regulation contains explicit preemptive language. Second, federal law can impliedly preempt state law when Congress’s preemptive intent is implicit in the relevant federal law’s structure and purpose.

In both express and implied preemption cases, the Supreme Court has made clear that Congress’s purpose is the “ultimate touchstone” of its statutory analysis. In analyzing congressional purpose, the Court has at times applied a canon of statutory construction known as the “presumption against preemption,” which instructs that federal law should not be read as superseding states’ historic police powers “unless that was the clear and manifest purpose of Congress.”

If there is no federal statute, no one has any idea what that purpose is, certainly no justiciable idea. Therefore, my bet is that the Court would hold that the Executive Branch cannot unilaterally create preemption, and neither can the DOJ sue states simply because the White House dislikes their AI, privacy, or biometric laws, much less their zoning laws applied to data centers.

Why David Sacks’s Involvement Raises the Political Temperature

As F. Scott Fitzgerald famously wrote, the very rich are different. But here’s what’s not different: David Sacks has something he’s not used to having. A boss. And that boss has polls. And those polls are not great at the moment. It’s pretty simple, really. When you work for a politician, your job is to make sure his polls go up, not down.

David Sacks is making his boss look bad. Presidents do not relish waking up to front-page stories that suggest their “A.I. czar” holds hundreds of investments directly affected by federal A.I. strategy, that major policy proposals track industry wish lists more closely than public safeguards, or that rumored executive orders could ignite fifty-state constitutional litigation led by their own supporters like Mike Davis and egged on by people like Steve Bannon.

Those stories don’t just land on the advisor; they land on the President’s desk, framed as questions of his judgment, control, and competence. And in politics, loyalty has a shelf life. The moment an advisor stops being an asset and starts becoming a daily distraction, much less a liability, the calculus changes fast. What matters then is not mansions, brilliance, ideology, or past service, but whether keeping that advisor costs more than cutting them loose. I give you Elon Musk.

AI Policy Cannot Be Built on Preemption-by-Advisor

At bottom, this is a bet. The question isn’t whether David Sacks is smart, well-connected, or persuasive inside the room. The real question is whether Donald Trump wants to stake his presidency on David Sacks being right—right about constitutional preemption, right about executive authority, right about federal power to block the states, and right about how courts will react.

Because if Sacks is wrong, the fallout doesn’t land on him. It lands on the President. A collapsed A.I. moratorium, fifty-state litigation, injunctions halting executive action, and judges citing basic federalism principles would all be framed as defeats for Trump, not for an advisor operating at arm’s length.

Betting the presidency on an untested legal theory pushed by a politically exposed “no conflict no interest” tech investor isn’t bold leadership. It’s unnecessary risk. When Trump’s second term is over in a few years, Trump will be in the history books for all time. No one will remember who David Sacks was.

Too Dynamic to Question, Too Dangerous to Ignore

When Ed Newton-Rex left Stability AI, he didn’t just make a career move — he issued a warning. His message was simple: we’ve built an industry that moves too fast to be honest.

AI’s defenders insist that regulation can’t keep up, that oversight will “stifle innovation.” But that speed isn’t a by-product; it’s the business model. The system is engineered for planned obsolescence of accountability — every time the public begins to understand one layer of technology, another version ships, invalidating the debate. The goal isn’t progress; it’s perpetual synthetic novelty, where nothing stays still long enough to be measured or governed, and “nothing says freedom like getting away with it.”

We’ve seen this play before. Car makers built expensive sensors we don’t want that fail on schedule; software platforms built policies that expire the moment they bite. In both cases, complexity became a shield and a racket: “too dynamic to question.” And yet, like those unasked-for, but paid-for, features in the cars we don’t want, AI’s design choices are too dangerous to ignore. (Like, what if your brakes really are going out, and it’s not just the sensor malfunctioning?)

Ed Newton-Rex’s point, echoed in his tweets and testimony, is that the industry has mistaken velocity for virtue. He’s right. The danger is not that these systems evolve too quickly to regulate; it’s that they’re designed that way, designed to fail just like that brake sensor. And until lawmakers recognize that speed itself is a form of governance, we’ll keep mistaking momentum for inevitability.

AI Frontier Labs and the Singularity as a Modern Prophetic Cult

It gets rid of your gambling debts 
It quits smoking 
It’s a friend, it’s a companion 
It’s the only product you will ever need
From Step Right Up, written by Tom Waits

The AI “frontier labs” — OpenAI, Anthropic, DeepMind, xAI, and their constellation of evangelists — often present themselves as the high priests of a coming digital transcendence. This is sometimes called “the singularity” which refers to a hypothetical future point when artificial intelligence surpasses human intelligence, triggering rapid, unpredictable technological growth. Often associated with self-improving AI, it implies a transformation of society, consciousness, and control, where human decision-making may be outpaced or rendered obsolete by machines operating beyond our comprehension. 

But viewed through the lens of social psychology, the behavior of the AI evangelists increasingly resembles that of cognitive dissonance cults, as famously documented in Dr. Leon Festinger and team’s important study of a UFO cult (à la Heaven’s Gate), When Prophecy Fails. (See also The Great Disappointment.)

In that foundational social psychology study, a group of believers centered around a woman named “Marian Keech” predicted the world would end in a cataclysmic flood, from which they alone would be rescued by alien beings. But when the prophecy failed, they doubled down. Rather than abandoning their beliefs, the group rationalized the outcome (“We were spared because of our faith”) and became even more committed. They get this self-hypnotized look, kind of like this guy (and remember, this is what the Meta marketing people thought was the flagship spot for Meta’s entire superintelligence hustle):


This same psychosis permeates Singularity narratives and the AI doom/alignment discourse:
– The world is about to end — not by water, but by unaligned superintelligence.
– A chosen few (frontier labs) hold the secret knowledge to prevent this.
– The public must trust them to build, contain, and govern the very thing they fear.
– And if the predicted catastrophe doesn’t come, they’ll say it was their vigilance that saved us.

Like cultic prophecy, the Singularity promises transformation:
– Total liberation or annihilation (including liberation from annihilation by the Red Menace, i.e., the Chinese Communist Party).
– A timeline (“AGI by 2027”, “everything will change in 18 months”).
– An elite in-group with special knowledge and “Don’t be evil” moral responsibility.
– A strict hierarchy of belief and loyalty — criticism is heresy, delay is betrayal.

This serves multiple purposes:
1. Maintains funding and prestige by positioning the labs as indispensable moral actors.
2. Deflects criticism of copyright infringement, resource consumption, or labor abuse with existential urgency (because China, don’t you know).
3. Converts external threats (like regulation) into internal persecution, reinforcing group solidarity.

The rhetoric of “you don’t understand how serious this is” mirrors cult defenses exactly.

Here’s the rub: the timeline keeps slipping. Every six months, we’re told the leap to “godlike AI” is imminent. GPT‑4 was supposed to upend everything. That didn’t happen, so GPT‑5 will do it for real. Gemini flopped, but Claude 3 might still be the one.

When prophecy fails, they don’t admit error — they revise the story:
– “AI keeps accelerating.”
– “It’s a slow takeoff, not a fast one.”
– “We stopped the bad outcomes by acting early.”
– “The doom is still coming — just not yet.”

Leon Festinger’s theories, seen in When Prophecy Fails, especially cognitive dissonance and social comparison, influence AI by shaping how systems model human behavior, resolve conflicting inputs, and simulate decision-making. His work guides developers of interactive agents, recommender systems, and behavioral algorithms that aim to mimic or respond to human inconsistencies, biases, and belief formation. So this isn’t a casual connection.

As with Festinger’s study, the failure of predictions intensifies belief rather than weakening it. And the deeper the believer’s personal investment, the harder it is to turn back. For many AI cultists, this includes financial incentives, status, and identity.

Unlike spiritual cults, AI frontier labs have material outcomes tied to their prophecy:
– Federal land allocations, as we’ve seen with DOE site handovers.
– Regulatory exemptions, by presenting themselves as saviors.
– Massive capital investment, driven by the promise of world-changing returns.

In the case of AI, this is not just belief — it’s belief weaponized to secure public assets, shape global policy, and monopolize technological futures. And when the same people build the bomb, sell the bunker, and write the evacuation plan, it’s not spiritual salvation — it’s capture.

The pressure to sustain the AI prophecy—that artificial intelligence will revolutionize everything—is unprecedented because the financial stakes are enormous. Trillions of dollars in market valuation, venture capital, and government subsidies now hinge on belief in AI’s inevitable dominance. Unlike past tech booms, today’s AI narrative is not just speculative; it is embedded in infrastructure planning, defense strategy, and global trade. This creates systemic incentives to ignore risks, downplay limitations, and dismiss ethical concerns. To question the prophecy is to threaten entire business models and geopolitical agendas. As with any ideology backed by capital, maintaining belief becomes more important than truth.

The Singularity, as sold by the frontier labs, is not just a future hypothesis — it’s a living ideology. And like the apocalyptic cults before them, these institutions demand public faith, offer no accountability, and position themselves as both priesthood and god.

If we want a secular, democratic future for AI, we must stop treating these frontier labs as prophets — and start treating them as power centers subject to scrutiny, not salvation.

AI Needs Ever More Electricity—And Google Wants Us to Pay for It

Uncle Sugar’s “National Emergency” Pitch to Congress

At a recent Congressional hearing, former Google CEO Eric “Uncle Sugar” Schmidt delivered a message that was as jingoistic as it was revealing: if America wants to win the AI arms race, it better start building power plants. Fast. But the subtext was even clearer—he expects the taxpayer to foot the bill because, you know, the Chinese Communist Party. Yes, when it comes to fighting the Red Menace, the all-American boys in Silicon Valley will stand ready to fight to the last Ukrainian, or Taiwanese, or even Texan.

Testifying before the House Energy & Commerce Committee on April 9, Schmidt warned that AI’s natural limit isn’t chips—it’s electricity. He projected that the U.S. would need 92 gigawatts of new generation capacity—the equivalent of nearly 100 nuclear reactors—to keep up with AI demand.

Schmidt didn’t propose that Google, OpenAI, Meta, or Microsoft pay for this themselves, just like they didn’t pay for broadband penetration. No, Uncle Sugar pushed for permitting reform, federal subsidies, and government-driven buildouts of new energy infrastructure. In plain English? He wants the public sector to do the hard and expensive work of generating the electricity that Big Tech will profit from.

Will This Improve the Grid?

And let’s not forget: the U.S. electric grid is already dangerously fragile. It’s aging, fragmented, and increasingly vulnerable to cyberattacks, electromagnetic pulse (EMP) weapons, and even extreme weather events. Pouring public money into ultra-centralized AI data infrastructure—without first securing the grid itself—is like building a mansion on a cracked foundation.

If we are going to incur public debt, we should prioritize resilience, distributed energy, grid security, and community-level reliability—not a gold-plated private infrastructure buildout for companies that already have trillion-dollar valuations.

Big Tech’s Growing Appetite—and Private Hoarding

This isn’t just a future problem. The data center buildout is already in full swing and your Uncle Sugar must be getting nervous about where he’s going to get the money from to run his AI and his autonomous drone weapons. In Oregon, where electricity is famously cheap thanks to the Bonneville Power Administration’s hydroelectric dams on the Columbia River, tech companies have quietly snapped up huge portions of the grid’s output. What was once a shared public benefit—affordable, renewable power—is now being monopolized by AI compute farms whose profits leave the region to the bank accounts in Silicon Valley.

Meanwhile, Microsoft is investing in a nuclear-powered data center next to the defunct Three Mile Island reactor—but again, it’s not about public benefit. It’s about keeping Azure’s training workloads running 24/7. And don’t expect them to share any of that power capacity with the public—or even with neighboring hospitals, schools, or communities.

Letting the Public Build Private Fortresses

The real play here isn’t just to use public power—it’s to get the public to build the power infrastructure, and then seal it off for proprietary use. Moats work both ways.

That includes:
– Publicly funded transmission lines across hundreds of miles to deliver power to remote server farms;
– Publicly subsidized generation capacity (nuclear, gas, solar, hydro—you name it);
– And potentially, prioritized access to the grid that lets AI workloads run while the rest of us face rolling blackouts during heatwaves.

All while tech giants don’t share their models, don’t open their training data, and don’t make their outputs public goods. It’s a privatized extractive model, powered by your tax dollars.

Been Burning for Decades

Don’t forget: Google and YouTube have already been burning massive amounts of electricity for 20 years. It didn’t start with ChatGPT or Gemini. Serving billions of search queries, video streams, and cloud storage events every day requires a permanent baseload—yet somehow this sudden “AI emergency” is being treated like a surprise, as if nobody saw it coming.

If they knew this was coming (and they did), why didn’t they build the power? Why didn’t they plan for sustainability? Why is the public now being told it’s our job to fix their bottleneck?

The Cold War Analogy—Flipped on Its Head

Some industry advocates argue that breaking up Big Tech or slowing AI infrastructure would be like disarming during a new Cold War with China. But Gail Slater, the Assistant Attorney General leading the DOJ’s Antitrust Division, pushed back forcefully—not at a hearing, but on the War Room podcast.

In that interview, Slater recalled how AT&T tried to frame its 1980s breakup as a national security threat, arguing it would hurt America’s Cold War posture. But the DOJ did it anyway—and it led to an explosion of innovation in wireless technology.

“AT&T said, ‘You can’t do this. We are a national champion. We are critical to this country’s success. We will lose the Cold War if you break up AT&T,’ in so many words. … Even so, [the DOJ] moved forward … America didn’t lose the Cold War, and … from that breakup came a lot of competition and innovation.”

“I learned that in order to compete against China, we need to be in all these global races the American way. And what I mean by that is we’ll never beat China by becoming more like China. China has national champions, they have a controlled economy, et cetera, et cetera.

We win all these races and history has taught by our free market system, by letting the ball rip, by letting companies compete, by innovating one another. And the reason why antitrust matters to that picture, to the free market system is because we’re the cop on the beat at the end of the day. We step in when competition is not working and we ensure that markets remain competitive.”

Slater’s message was clear: regulation and competition enforcement are not threats to national strength—they’re prerequisites to it. So there’s no way that the richest corporations in commercial history should be subsidized by the American taxpayer.

Bottom Line: It’s Public Risk, Private Reward

Let’s be clear:

– They want the public to bear the cost of new electricity generation.
– They want the public to underwrite transmission lines.
– They want the public to streamline regulatory hurdles.
– And they plan to privatize the upside, lock down the infrastructure, keep their models secret and socialize the investment risk.

This isn’t a public-private partnership. It’s a one-way extraction scheme. America needs a serious conversation about energy—but it shouldn’t begin with asking taxpayers to bail out the richest companies in commercial history.

Deduplication and Discovery: The Smoking Gun in the Machine

WINSTON

“Wipe up all those little pieces of brains and skull”

From Pulp Fiction, screenplay by Quentin Tarantino and Roger Avary

Deduplication, the process of removing identical or near-identical content from AI training data, is a critical yet often overlooked indicator that AI platforms actively monitor and curate their training sets. It is exactly the kind of process one would expect from the “scrape, ready, aim” business practices of AI platforms that have ready access to large amounts of fairly high-quality data from users of other products placed into commerce by their business affiliates or confederates.

For example, Google Gemini could have access to Gmail, YouTube, at least “publicly available” Google Docs, Google Translate, or Google for Education, and then of course one of the great scams of all time, Google Books. Microsoft uses Bing searches, MSN browsing, the consumer Copilot experience, and ad interactions. Amazon uses Alexa prompts, Facebook uses “public” posts, and so on.

This kind of hoovering up of indiscriminate amounts of “data” in the form of your baby pictures posted on Facebook and your user-generated content on YouTube is bound to produce duplicates. After all, how many users have posted their favorite Billie Eilish or Taylor Swift music video? An AI platform doesn’t need 10,000 versions of “Shake It Off”; it probably just needs the official video. Enter deduplication, which by definition means the platform knows what it has scraped and also knows what it wants to get rid of.

“Get rid of” is a relative concept. In many systems, particularly in storage environments like backup servers or object stores, deduplication means keeping only one physical copy of a file. Any other instances of that data don’t get stored again; instead, they’re represented by pointers to the original copy. When this happens in real time, as the data is being written, it is known as inline deduplication, and it minimizes storage waste without actually deleting anything of functional value. It requires knowing what you have, knowing you have more than one version of the same thing, and being able to tell the system where to find the “original” copy without disturbing the process or burning compute inefficiently.
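To make the pointer idea concrete, here is a minimal sketch of inline deduplication. It is an illustration only: the class name DedupStore, the file names, and the choice of SHA-256 are assumptions made for this example, not a description of any platform's actual storage system.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: one physical copy per unique payload."""

    def __init__(self):
        self.blobs = {}     # digest -> bytes (the single stored copy)
        self.pointers = {}  # logical name -> digest (the "pointer")

    def put(self, name, data):
        """Store data under name; return True only if new bytes were written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Every name gets a pointer, even when no new bytes are stored.
        self.pointers[name] = digest
        return is_new

    def get(self, name):
        """Follow the pointer back to the one stored copy."""
        return self.blobs[self.pointers[name]]


store = DedupStore()
first = store.put("upload_a.txt", b"the same song lyrics")
second = store.put("upload_b.txt", b"the same song lyrics")
print(first, second, len(store.blobs), len(store.pointers))
```

Note what the store has to keep in order to work at all: a record of every name it has seen and which stored copy each one points to.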

In other cases, such as post-process deduplication, the system stores data initially, then later scans for and eliminates redundancies. Again, the AI platform knows there are two or more versions of the same thing, say the book Being and Nothingness, knows where to find the copies and has been trained to keep only one version. Even here, the duplicates may not be permanently erased—they might be archived, versioned, or logged for auditing, compliance, or reconstruction purposes.
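A post-process pass can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function name post_process_dedup, the crawl IDs, and the placeholder texts are all hypothetical, and a real system would work over far larger stores.

```python
import hashlib
from collections import defaultdict


def post_process_dedup(stored):
    """stored maps item_id -> text. Returns (kept, archived)."""
    # First pass: group everything already in storage by content hash.
    groups = defaultdict(list)
    for item_id, text in stored.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups[digest].append(item_id)

    kept, archived = {}, {}
    for ids in groups.values():
        ids.sort()                     # deterministic choice of survivor
        kept[ids[0]] = stored[ids[0]]
        for dup_id in ids[1:]:
            archived[dup_id] = ids[0]  # duplicate id -> id of the copy kept
    return kept, archived


stored = {
    "crawl_001": "Being and Nothingness, sample text",
    "crawl_002": "An unrelated forum post",
    "crawl_003": "Being and Nothingness, sample text",
}
kept, archived = post_process_dedup(stored)
print(sorted(kept))
print(archived)
```

Even in this toy version, the duplicate is not forgotten: the archived mapping records which item was dropped and which copy survived in its place.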

In AI training contexts, deduplication usually means removing redundant examples from the training set to improve training efficiency and reduce memorization, which also lowers the risk that a model regurgitates copyrighted text. The duplicate content may be discarded from the training pipeline but often isn’t destroyed. Instead, AI companies may retain it in a separate filtered corpus or keep hashed fingerprints to ensure future models don’t retrain on the same material unknowingly.
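The retained-fingerprint idea might look something like the sketch below. The helper names fingerprint and filter_unseen and the sample documents are hypothetical, chosen only to show how a persistent set of hashes carries knowledge from one training run to the next.

```python
import hashlib


def fingerprint(text):
    """Exact-match fingerprint of a document."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_unseen(docs, seen_fingerprints):
    """Return docs not already fingerprinted, adding new fingerprints to the set."""
    fresh = []
    for doc in docs:
        fp = fingerprint(doc)
        if fp in seen_fingerprints:
            continue  # already seen in this run or an earlier one
        seen_fingerprints.add(fp)
        fresh.append(doc)
    return fresh


seen = set()  # in practice this set would persist between runs
run_one = filter_unseen(["essay A", "essay B", "essay A"], seen)
run_two = filter_unseen(["essay B", "essay C"], seen)
print(run_one, run_two, len(seen))
```

The documents themselves can be thrown away, but the set of fingerprints is a durable record of what was ingested.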

So they know what they have, and likely know where it came from. They just don’t want to tell any plaintiffs.

Ultimately, deduplication is less about destruction and more about optimization. It’s a way to reduce noise, save resources, and improve performance—while still allowing systems to track, reference, or even rehydrate the original data if needed.

Its existence directly undermines claims that companies are unaware of which copyrighted works were ingested. Indeed, it only makes sense that one of the hidden consequences of the indiscriminate scraping that underpins large-scale AI training is the proliferation of duplicated data. Web crawlers ingest everything they can access—news articles republished across syndicates, forum posts echoed in aggregation sites, Wikipedia mirrors, boilerplate license terms, spammy SEO farms repeating the same language over and over. Without any filtering, this avalanche of redundant content floods the training pipeline.

This is where deduplication becomes not just useful, but essential. It’s the cleanup crew after a massive data land grab. The more messy and indiscriminate the scraping, the more aggressively the training pipeline must filter for quality, relevance, and uniqueness to avoid training inefficiencies or, worse, model behaviors that are skewed by repetition. If a model sees the same phrase or opinion thousands of times, it might treat it as authoritative or universally accepted, even if it’s just a meme bouncing around low-quality content farms.

Deduplication is sort of the Winston Wolfe of AI. And if the cleaner shows up, somebody had to order the cleanup. It is a direct response to the excesses of indiscriminate scraping. It’s both a technical fix and a quiet admission that the underlying data collection strategy is, by design, uncontrolled. But while the platforms may scrape without control to get copies of as much of your data as they can lay hands on, even cleverly changing their terms-of-use boilerplate so they can do all this under the effluvia of legality, they still send in the cleaner to take care of the crime scene.

So to summarize: To deduplicate, platforms must identify content-level matches (e.g., multiple copies of Being and Nothingness by Jean-Paul Sartre). This process requires tools that compare, fingerprint, or embed full documents, meaning the content is readable and classifiable and, oh yes, discoverable.
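A rough sketch shows why matching at the content level means reading the content. The function normalized_fingerprint and the sample strings are assumptions for illustration; production pipelines use more elaborate normalization and near-duplicate methods.

```python
import hashlib
import re


def normalized_fingerprint(text):
    """Hash the text after lowercasing and stripping punctuation and extra whitespace."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()


copy_a = "Being and Nothingness.\n\nBy Jean-Paul Sartre"
copy_b = "BEING AND NOTHINGNESS   by Jean Paul Sartre"
other = "Nausea, by Jean-Paul Sartre"

same = normalized_fingerprint(copy_a) == normalized_fingerprint(copy_b)
different = normalized_fingerprint(copy_a) != normalized_fingerprint(other)
print(same, different)
```

The two differently formatted copies match only because the tool opened each document, parsed its words, and compared them.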

Platforms may choose the ‘cleanest’ copy to keep, showing knowledge and active decision-making about which version of a copyrighted work is retained. And, big finish, removing duplicates only makes sense if operators know which datasets they scraped and what those datasets contain.

Drilling down on a platform’s deduplication tools and practices may prove up knowledge and intent to a precise degree, contradicting arguments of plausible deniability in litigation. “Johnny ate the cookies” isn’t going to fly. There’s a market-clearing level of record keeping necessary for deduping to work at all, so it’s likely that there are internal deduplication logs or tooling pipelines that are discoverable.
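As a hypothetical illustration of what such a log could contain, the sketch below picks a “cleanest” copy and records the decision. The quality rule (fewest stray markup characters, then longest text), the field names, and the example sources are all invented for this example and are not drawn from any real platform.

```python
import json


def markup_noise(text):
    """Crude quality signal: count leftover angle brackets from HTML."""
    return text.count("<") + text.count(">")


def pick_cleanest(candidates):
    """Prefer the copy with the least markup residue, then the longest text."""
    return min(candidates, key=lambda c: (markup_noise(c["text"]), -len(c["text"])))


def dedup_log_entry(work_title, candidates):
    """Build one log record describing which copy was kept and which were dropped."""
    kept = pick_cleanest(candidates)
    return {
        "work": work_title,
        "kept_source": kept["source"],
        "dropped_sources": [c["source"] for c in candidates if c is not kept],
    }


candidates = [
    {"source": "mirror-site.example", "text": "<p>Sample chapter text.</p>"},
    {"source": "ebook-dump.example", "text": "Sample chapter text."},
]
entry = dedup_log_entry("Being and Nothingness", candidates)
print(json.dumps(entry))
```

A record like this names the work, where each copy came from, and which one was chosen, which is precisely the kind of artifact a discovery request would target.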

When AI platforms object to discovery about deduplication, plaintiffs can often overcome those objections by narrowing their focus. Rather than requesting broad details about how a model deduplicates its entire training set, plaintiffs should ask a simple, specific question: Were any of these known works—identified by title or author—deduplicated or excluded from training?

This approach avoids objections about overbreadth or burden. It reframes discovery as a factual inquiry, not a technical deep dive. If the platform claims the data was not retained, plaintiffs can ask for existing artifacts—like hash filters, logs, or manifests—or seek a sworn statement explaining the loss and when it occurred. That, in turn, opens the door to potential spoliation arguments.
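To show how small the ask really is, here is a sketch of answering that question from an existing manifest. The manifest layout, the status labels, and the function lookup_works are assumptions made for illustration; an actual production would depend on whatever artifacts the platform keeps.

```python
def lookup_works(manifest, titles):
    """Return the recorded statuses for each requested title, matched case-insensitively."""
    index = {}
    for row in manifest:
        index.setdefault(row["title"].casefold(), []).append(row["status"])
    return {t: index.get(t.casefold(), ["no record"]) for t in titles}


# Hypothetical manifest rows a deduplication pipeline might have written.
manifest = [
    {"title": "Being and Nothingness", "status": "kept"},
    {"title": "Being and Nothingness", "status": "deduplicated"},
    {"title": "Nausea", "status": "excluded"},
]
answer = lookup_works(manifest, ["being and nothingness", "No Exit"])
print(answer)
```

If the records exist, responding for a short list of named works is a lookup, not a technical deep dive.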

If trade secrets are cited, plaintiffs can propose a protective order, limiting access to outside counsel or experts like we’ve done 100,000 times before in other cases. And if the defendant claims “duplicate” is too vague, plaintiffs can define it functionally—as content that’s identical or substantially similar, by hash, tokens, or vectors.
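A functional definition along those lines can be written down precisely. The sketch below is one possible reading, not a legal or industry standard: the thresholds are arbitrary assumptions, and word-count vectors stand in for the learned embeddings a real system would likely use.

```python
import hashlib
import math
from collections import Counter


def tokens(text):
    return text.lower().split()


def same_hash(a, b):
    """Identical content: byte-for-byte equal after UTF-8 encoding."""
    return hashlib.sha256(a.encode()).digest() == hashlib.sha256(b.encode()).digest()


def token_overlap(a, b):
    """Jaccard similarity of the two token sets, from 0.0 to 1.0."""
    sa, sb = set(tokens(a)), set(tokens(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def cosine(a, b):
    """Cosine similarity of word-count vectors (a crude stand-in for embeddings)."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def is_duplicate(a, b, token_threshold=0.8, vector_threshold=0.9):
    """Duplicate if identical by hash, or substantially similar by tokens or vectors."""
    return (same_hash(a, b)
            or token_overlap(a, b) >= token_threshold
            or cosine(a, b) >= vector_threshold)


original = "the quick brown fox jumps over the lazy dog"
near_copy = "the quick brown fox jumps over the lazy dog today"
unrelated = "an entirely different sentence about grid reliability"
print(is_duplicate(original, near_copy), is_duplicate(original, unrelated))
```

Defining the term this way lets the parties argue over the thresholds rather than over whether the word “duplicate” means anything.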

Most importantly, deduplication is relevant. If a platform identified a plaintiff’s work and trained on it anyway, that speaks to volitional use, copying, and lack of care, all key issues in copyright and fair use analysis. And if they lied about it, particularly to the court: Helloooooo Harper & Row. Discovery requests that are focused, tailored, and anchored in specific works stand a far better chance of surviving objections and yielding meaningful evidence.