Mythos, Daybreak, and the Cyber AI Arms Race

In the space of about six weeks, the two biggest names in frontier AI have both shipped a cybersecurity model. Anthropic’s Mythos is the one that has been making headlines since April, with claims that it has found vulnerabilities across every major operating system and web browser. OpenAI’s Daybreak, which arrived on 11 May 2026, is the response, built on three flavours of GPT-5.5 and a long list of integration partners (Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet, Oracle, Palo Alto Networks and Zscaler).

If you only read the launch blogs, the implication is clear. AI is now finding software flaws that humans missed, and the next few years are going to be a race between offensive and defensive AI to get to those flaws first.

That makes for a great press release.

It is also not quite what the working cybersecurity community is saying.

What both models actually do

Both Mythos and Daybreak are doing roughly the same kind of work, just from opposite ends of the stack.

Mythos was launched as part of Anthropic’s Project Glasswing initiative, in which a small group of security teams and government partners were given access to a frontier model tuned for vulnerability discovery. Anthropic’s own research blog frames it as a model that can read large codebases, model attacker behaviour, and identify exploitable conditions in ways that previous models could not.

Daybreak is structured as three models: GPT-5.5 with general-purpose safeguards; GPT-5.5 with Trusted Access for Cyber, for verified defensive work in authorised environments; and GPT-5.5-Cyber, a more permissive variant for red teaming, penetration testing and controlled validation. OpenAI is selling Daybreak less as “a model” and more as a programme, with Codex Security sitting underneath to do the heavy lifting on repository-level threat modelling.

Both are real. Both are useful. Both, on the available evidence, are an improvement on what came before.

What the experts are actually saying

This is the bit that has been getting lost.

Several of the most respected voices in the cybersecurity research community have spent April and May raising the same point. The headline claim, that these models are finding vulnerabilities humans and older tooling could not have found, is on the whole not holding up under independent scrutiny.

Scientific American ran a careful piece summing it up neatly: “Every cybersecurity defender should take Mythos seriously, but the expected harm to defense is likely to be far lower than the worst-case scenarios would suggest.” Their interviewees are clear that the capability matters. They are equally clear that the most alarming claims need verification.

Rest of World reported that researchers in the United States and United Kingdom characterise Mythos as a meaningful improvement on previous frontier models for identifying vulnerabilities. They also note that the practical impact on the wider information security picture is contested.

The Centre for Emerging Technology and Security at the Alan Turing Institute went further in their briefing on Claude Mythos, pointing out that the real shift is in economics rather than discovery. AI is changing the cost of running vulnerability research at scale. Low-signal findings are becoming easier to produce, cheaper to generate, and harder for software maintainers to triage. High-value vulnerabilities are moving in the opposite direction.

And we have already covered, in an earlier newsletter, the finding from security firm AISLE that much cheaper general-purpose models were able to find roughly the same class of bugs Mythos was being credited with. The capability gap, in other words, may be narrower than the marketing suggests.

You can see why this is the awkward part of the story. It is much easier to write a headline about “AI finding zero-days no human ever spotted” than it is to write one that says “AI is doing things humans were already doing, but faster and at scale, and that matters in different ways than you might think”.

Why it still matters, even if the worst case is not the right case

Here is where I want to be careful. The reason I think this is worth a longer piece is not that the cyber AI story is overblown. The reason is the opposite. The story is more important than the headline version, and pretending otherwise does the industry a disservice.

A few things are true at once.

These models are good. They are getting better. The economics of vulnerability research are moving in a direction that favours organisations that can deploy AI at scale, on both sides of the offence-defence line.

The most realistic threat is not a model that finds something nobody else could find. It is a model that finds things older tools and skilled humans could find, but does so cheaply enough to apply at internet scale, in places where defenders are not looking, against software that is rarely audited.

The most realistic upside is the mirror of that. Defenders who have access to these models and the discipline to use them get a meaningful uplift on the work they were already doing. Daybreak’s partner list is not random. Those are the firms whose customer-facing security depends on doing the work in production.

The story is not “AI changes everything overnight”. The story is “AI changes the economics of attack and defence at the same time, and the side that is better organised wins more often”.

What this means for the rest of us

Most readers of this newsletter are not running a CISO function or commissioning a penetration test. So why does any of this matter for the rest of us?

A few reasons.

If you are buying software, the security posture of the vendors you depend on is going to start mattering more, not less. The pace at which they can find and fix issues in their own code, and the use they make of these models internally, are part of the security you are paying for whether you read it in the contract or not. Ask the question.

If you are using AI in your own business, the same logic applies in miniature. The agents you wire up touch real systems. The code your AI tools help you write touches real users. The fact that there is a new class of model dedicated to finding holes in software ought to be a quiet prompt to look at your own AI hygiene. The 1Password partnership with OpenAI Codex, in which secrets are issued just in time and never sit inside the model’s context window, is a sensible response to exactly this shift.

And if you are a leader in a regulated business, this is the moment to ask your security team how they are tracking the cyber-AI capability curve. Not because the sky is falling. Because the people who are going to deal with this well are the ones who started doing the boring foundational work before the headlines forced it.

The honest summary

Mythos and Daybreak are not the AI cyber apocalypse. They are also not nothing. They are powerful new tools that change the economics of a field that already runs at the edge of what humans can keep up with.

The cybersecurity community is, broadly, sceptical about the most dramatic claims and serious about the underlying trend. That is the right combination. We should hold the same position.

If you remember one thing from this piece, make it the AISLE point. The most useful AI for cybersecurity may turn out to be the cheapest general-purpose models, deployed widely and patiently by defenders who know what they are doing, rather than the most expensive frontier models doing the same job in a press release.

The next few years of this story are going to be quiet and structural rather than loud and dramatic. Watch your vendors. Watch your AI hygiene. And do not be surprised when the bit that ends up mattering is not the bit that makes the headlines.

I discussed Mythos and Daybreak on Chapter 14 of Prompt Fiction, our podcast on what is actually happening in AI.