What the UK’s Data Bill *Really* Means for AI and Copyright
Introduction
You might have seen some noisy headlines this week shouting that the UK has “legalised AI stealing content” or “dealt a blow to creators.” While the frustration behind those reactions is understandable, the reality is more complex — and less immediate.
So let’s clear it up. The Data (Use and Access) Bill, which has now passed both Houses of Parliament and awaits royal assent, does not change UK copyright law. It does not introduce new rules around AI model training, transparency, or opt-out systems.
But because of some proposed amendments that didn’t make it in — and the media’s tendency to conflate parallel policy debates — many are misunderstanding what this bill actually does.
Let’s separate the facts from the fear.
What the Data (Use and Access) Bill Actually Does
The bill is about data use, data sharing, and public sector reform. Its primary aims are to:
- Modernise the UK’s data framework post-Brexit,
- Enable better use of data in healthcare, research, and public services,
- Create a more innovation-friendly environment for the UK’s digital economy.
It touches on AI indirectly — not because it was designed to regulate AI training, but because members of the House of Lords tried to insert provisions that would have done just that.
Those Lords' amendments included:
- A requirement for AI developers to disclose which copyrighted works were used for training,
- An obligation to seek licensing or explicit permission from rights holders.
These were repeatedly rejected by the Commons. Why? The government argued that this kind of change was too complex to rush into a general data bill, and should instead be tackled in future, dedicated AI legislation.
So the final version of the bill includes none of those copyright-related amendments. That’s key.
The Separate Track: Copyright and AI
Here’s where most of the media confusion is coming from.
The UK government has already run a separate consultation on how AI and copyright should interact. That process:
- Closed in February 2025,
- Received input from developers, creators, lawyers, and publishers,
- Is expected to lead to a separate AI bill, likely not arriving until 2026.
This is where opt-out systems, transparency rules, and training restrictions might eventually come into play — but none of that is law yet.
In other words, the AI copyright story is still being written.
What’s the Legal Position Right Now?
As of today:
- UK copyright law still applies to AI training. Using copyrighted material without permission may still be an infringement, depending on how the content is used.
- There is no formal opt-out mechanism for creators who don’t want their work used in training data — though the government has said it supports exploring one.
- There is no new exception introduced by the Data Bill.
- The government has committed to publishing reports on the impact of AI on copyright, which may inform future legislation. Those reports are not themselves binding and do not change the law.
This means that training an AI model on copyrighted material today still sits in a legal grey area, shaped more by interpretation, precedent, and pending litigation (like Getty v. Stability AI) than by any AI-specific statute.
Why This Misinformation Matters
There’s a real danger here: if creators, artists, or even small businesses take headlines at face value, they may believe their rights have already been removed or diminished. But they haven’t.
And if developers or tech companies think this bill gives them a green light to scrape and train freely, they could find themselves in legal hot water later.
This bill doesn’t change copyright law, but public misunderstanding of it could change how people behave.
That’s why we need to be careful with our words. And it’s why I’m writing this: to help clear up what this bill is, and just as importantly, what it isn’t.
Education Matters — Because AI Doesn’t “Unlearn”
Here’s the other side of this discussion that doesn’t get enough attention: once an AI model is trained on your work, you can’t really get it back out.
LLMs (such as GPT, Claude, or Gemini) don’t work like a filing cabinet where you can just delete a document. Training is a mathematical optimisation process: the model learns patterns and statistical relationships rather than keeping a retrievable copy of each document. But if your work influenced those patterns, it’s in there.
And once it’s in there? The options are limited:
- Retrain the model from scratch without that data, which is expensive and almost never done, or
- Filter the model’s outputs so it doesn’t talk about the data, which hides the influence but doesn’t mean the model hasn’t learned from it.
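To make the "no filing cabinet" point concrete, here is a deliberately tiny sketch in Python. It is an illustration under heavy assumptions, not how any real LLM is built: the "model" is a single number fitted by gradient descent, and the `works` list and `creator_work` entry are made-up stand-ins for training documents. What it shows is that the trained weight carries no per-document record to delete, and that the only way to get the weight you would have had without one work is to train again without it.

```python
# Toy illustration only: a one-parameter "model" fitted by gradient descent.
# Real LLMs have billions of parameters, but the idea carries over:
# training adjusts shared weights; it does not file documents away one by one.

def train(examples, steps=2000, lr=0.01):
    """Fit y ~ w * x by minimising mean squared error with gradient descent."""
    w = 0.0
    n = len(examples)
    for _ in range(steps):
        # Average gradient of (w * x - y)^2 over every example in the set.
        grad = sum(2 * (w * x - y) * x for x, y in examples) / n
        w -= lr * grad
    return w

# Hypothetical training set: each (x, y) pair stands in for one "work".
works = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 11.0)]
creator_work = works[3]  # the work its creator would like removed

# The trained model is just this one number. Nothing in it says
# "this part came from creator_work", so there is nothing to delete.
w_with = train(works)

# Removing the work's influence exactly means training again without it.
w_without = train([item for item in works if item != creator_work])

print(f"weight trained with the work:  {w_with:.3f}")
print(f"weight retrained without it:   {w_without:.3f}")
```

The two printed weights differ, which is the whole point: the work shifted the model, and that shift is blended into the same number that every other example also shaped. Filtering what the model says afterwards would leave `w_with` exactly as it is.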
This is why education is so important — for developers, users, regulators, and especially creators. We need to understand what these models can and can’t do, how they learn, and what responsible use looks like.
If creators aren’t aware of what it means to “train on your content,” they can’t protect their rights effectively. And if developers don’t consider ethical implications early, they risk building products on shaky legal and moral ground.
Summary Table
| Topic | Current Status |
|---|---|
| AI training on copyrighted works | Still subject to UK copyright law; no opt-out system in place |
| Data (Use and Access) Bill | Passed without any changes to copyright or AI model training |
| Lords' amendments | Rejected by Commons; seen by the government as premature and burdensome |
| Separate consultation on AI/copyright | Closed Feb 2025; government report pending |
| Future AI Bill | Expected (possibly 2026); expected to address these issues directly |
Final Thoughts: What To Watch For
This bill has cleared Parliament and will become law on royal assent, but it’s not the decisive moment for AI copyright reform. That’s still to come.
We’re in a holding pattern, waiting for:
- The outcome of the copyright and AI consultation,
- The formal reports the government has committed to publishing (within 6–9 months),
- Future legislation specifically designed to handle AI’s impact on intellectual property.
Until then, the current copyright framework remains in place — and both developers and creators should tread carefully, informed by facts rather than headlines.
TL;DR?
The Data (Use and Access) Bill is not a copyright reform. It’s a data access bill that deliberately avoided jumping the gun on AI copyright issues. Let’s not conflate a real legislative process with an imagined one just because the headlines are louder than the footnotes.