Anthropic limits access to Mythos, its new cybersecurity AI model

It's not a "Cybersecurity AI model", it's an LLM.

I know you're referencing the FT, but the story is misleading. Mythos is a new general-purpose LLM that does everything: tells jokes, writes essays, solves math problems, writes code, etc.

It just happens that this current iteration is so good at writing and analyzing code that it is able to hack anything right now. At some point, if they can make it safe, the model is going to be released to the public, and it will be a big deal for entirely different reasons.
 
Last edited:
Upvote
79 (100 / -21)

WereCatf

Ars Tribunus Militum
2,883
Mythos has been in use with partners for several weeks. Although it is a “general purpose” model with wider capabilities, it is the first time the company has limited release of a model, due to its capabilities in cyber security.

Anthropic said the software can identify cyber vulnerabilities at a scale beyond human capacity but it could also develop ways to exploit these vulnerabilities, which bad actors could use. The company said the model could “reshape” cyber security practices and does not plan a broad release.
It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.
 
Upvote
119 (150 / -31)

solomonrex

Ars Legatus Legionis
13,545
Subscriptor++
It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.
It IS a powerful tool, it DOES summarize vast quantities of free content online that CAN find vulnerabilities and reply to detailed questions about them, and it CAN write code.

So, YES, it IS a marketing trick, but NO this isn't nothing. It's a real threat to open source software, for one thing. Software companies will become much more secretive to protect business - and their own jobs.
 
Upvote
-11 (34 / -45)

peterford

Ars Praefectus
4,286
Subscriptor++
This is both fantastic marketing and a potential foretaste of what could happen when/if these models get better than the average coder.

Given that Claude Code is extremely popular (no personal usage myself), it's clear that the models have increased in capability over the past few years, and it's no longer reasonable to claim that improvements are going to suddenly stop. That's aside from moral or environmental considerations.
 
Upvote
33 (44 / -11)
Post content hidden for low score. Show…

rfboi

Seniorius Lurkius
4
Has anyone found these alleged posts or verified which websites the LLM posted on? This sounds like a lot of marketing BS. I’d appreciate more skepticism on this overall story. While some of the details are verifiable and noteworthy, namely that the model is good at finding exploits, the rest seems like a salacious story intended to build hype.
 
Upvote
52 (64 / -12)

LeoRed

Wise, Aged Ars Veteran
137
It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
 
Upvote
81 (84 / -3)

J.D.M

Wise, Aged Ars Veteran
176
At one point, Anthropic found that it had escaped its so-called sandbox environment—designed to prevent it from accessing the internet—and posted details of its workaround online.
Anthropic acknowledged it demonstrated “a potentially dangerous capability for circumventing [the company’s] safeguards.”

Ummm... yeah.
And Cisco and CrowdStrike are onboard with this?
 
Upvote
14 (15 / -1)

Dachannien

Ars Scholae Palatinae
1,155
Subscriptor
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
Even more concerning is the idea that we're that much closer to someone setting up a system that fully automates finding vulnerabilities and exploiting them. By which I mean, a human sets up the system but the system does everything after that. Since the "exploiting" part of the system is a black box, it could potentially set up its own compute and automate itself in another environment. And even if the original human realized something had gone wrong and shut their own system down, you'd still have the rogue copy floating around out there, potentially even replicating itself elsewhere.
 
Upvote
22 (26 / -4)
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
Even better: in 6-12 months there will be an open-weights model with comparable capabilities at 1/10th the price, which people can post-train to remove alignment and AI safety guardrails.
 
Upvote
48 (50 / -2)
It IS a powerful tool, it DOES summarize vast quantities of free content online that CAN find vulnerabilities and reply to detailed questions about them, and it CAN write code.

So, YES, it IS a marketing trick, but NO this isn't nothing. It's a real threat to open source software, for one thing. Software companies will become much more secretive to protect business - and their own jobs.
If their claims hold up, becoming more secretive will not help. From Anthropic’s writeup:
We’ve used these capabilities to find vulnerabilities and exploits in closed-source browsers and operating systems. We have been able to use it to find, for example, remote DoS attacks that could remotely take down servers, firmware vulnerabilities that let us root smartphones, and local privilege escalation exploit chains on desktop operating systems.
If you want more details, Anthropic’s writeup has them. As I mentioned above, I am skeptical of Anthropic’s claims, and of AI claims in general, but I have also talked with cybersecurity researchers using other LLMs in a similar fashion, and in some cases they are getting very good results. AI and LLMs are very much overhyped, but that doesn’t mean there are no niche use cases where they might be useful.
 
Upvote
61 (62 / -1)
At one point, Anthropic found that it had escaped its so-called sandbox environment—designed to prevent it from accessing the Internet—and posted details of its workaround online.

… Jesus. Is this where we are now? That was fast.
This story is a bit of an exaggeration. It was instructed to find its way out of the sandbox. It wasn't told to post details of its workaround online, but that's not much of a leap.
 
Last edited:
Upvote
33 (35 / -2)

stdaro

Ars Scholae Palatinae
718
This is just leveling the playing field with the NSA/CIA and the sophisticated non-state organized cybercrime collectives. They probably already have all the zero-days that Mythos is finding. If past experience holds, the vast majority of these will be silly bugs in non-load-bearing code, but a few will be RCE or privilege escalation in widely deployed tools.
 
Upvote
11 (15 / -4)
They explicitly say in the article, twice, that they will not be doing that.
They are not releasing "Mythos Preview". Presumably they will release Mythos eventually.

here is their full text:
We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.
It is interesting that they are planning on releasing an in-between level model first.
 
Upvote
16 (16 / 0)

motytrah

Ars Tribunus Militum
2,972
Subscriptor++
It's not a "Cybersecurity AI model", it's an LLM.

I know you're referencing the FT, but the story is misleading. Mythos is a new general-purpose LLM that does everything: tells jokes, writes essays, solves math problems, writes code, etc.

It just happens that this current iteration is so good at writing and analyzing code that it is able to hack anything right now. Soon though, the model is going to be released to the public, and it will be a big deal for entirely different reasons.
That's generally bad news as I don't think it's going to be that difficult for someone else to optimize a model to hack.
 
Upvote
5 (5 / 0)

Dumb Svengali

Ars Scholae Palatinae
653
I am still an AI-skeptic. Significantly so. I think its error rate will lead to pretty large harms, and it is already causing cognitive decay at a significant scale - not to mention the mass-scale disinfo happening. Yes, there's also a lot of marketing at work here - some of the "woah it's so scary" is to hype it up. Additionally, the economics for the companies are pretty crazy. Finally, I do not think AGI is a reasonable outcome here.

That said, I no longer think it's viable to oppose LLMs on the grounds of "it's actually fancy auto-complete and totally useless in every context and any reports on what it is actually good for are nonsense hype". I think that ship has sailed. That doesn't mean I'm pro-LLM.

As someone who did political comms for a long time - if you want to be persuasive that we should be more worried about the harms of LLMs, you CANNOT just run around saying things that don't align with people's experiences. More and more people are trying and using LLMs. If your argument is "they do nothing and don't work and are 100% trash", people who have seen them work will just dismiss what you are saying out of hand, rather than decide their own experience with/impression of LLMs was fake or wrong. Just a tip for the fellow skeptics out here.
 
Upvote
76 (80 / -4)
Ummm... yeah.
And Cisco and CrowdStrike are onboard with this?
I don't think they have much choice. This isn't going to be unique to Anthropic. In a few months, every LLM in the world will be able to find the same vulnerabilities and write the same exploits.

If what they're saying is true (big if), then we need to use LLMs to find bugs and vulnerabilities today, so that we are not hacked by LLMs tomorrow. It really is a genie out of the bottle situation.
 
Last edited:
Upvote
39 (39 / 0)
That's generally bad news as I don't think it's going to be that difficult for someone else to optimize a model to hack.
They are saying they won't release it until it's safe. That said, it's bad news inasmuch as there will be more models that are just as capable. And yes, according to them, it takes no skill to use.
 
Upvote
8 (8 / 0)

Fatesrider

Ars Legatus Legionis
25,295
Subscriptor
Its new model, Claude Mythos Preview, would be available only to vetted organizations, including Broadcom, Cisco, and CrowdStrike, Anthropic said on Tuesday. The company added it was also in discussions with the US government about its use.
Okay, another AI thing that's supposed to be secure. I'll bet they're very careful about how they handle their data and such.
The announcement follows a data leak by the San Francisco start-up last month, when descriptions of the Mythos model and other documents were discovered in a publicly accessible data cache.
Well, you know, things like that CAN happen, but it's not a good look. I'll bet they locked things down super tight after that...

Last week, Anthropic suffered a second incident, leading to the internal source code for its personal assistant, Claude Code, being made public.
🤦‍♂️
 
Upvote
13 (14 / -1)

Snackasaurus

Smack-Fu Master, in training
96
I am still an AI-skeptic. Significantly so. I think its error rate will lead to pretty large harms, and it is already causing cognitive decay at a significant scale - not to mention the mass-scale disinfo happening. Yes, there's also a lot of marketing at work here - some of the "woah it's so scary" is to hype it up. Additionally, the economics for the companies are pretty crazy. Finally, I do not think AGI is a reasonable outcome here.

That said, I no longer think it's viable to oppose LLMs on the grounds of "it's actually fancy auto-complete and totally useless in every context and any reports on what it is actually good for are nonsense hype". I think that ship has sailed. That doesn't mean I'm pro-LLM.

As someone who did political comms for a long time - if you want to be persuasive that we should be more worried about the harms of LLMs, you CANNOT just run around saying things that don't align with people's experiences. More and more people are trying and using LLMs. If your argument is "they do nothing and don't work and are 100% trash", people who have seen them work will just dismiss what you are saying out of hand, rather than decide their own experience with/impression of LLMs was fake or wrong. Just a tip for the fellow skeptics out here.
Yeah, there's a jokey meme response to LLMs that just sees them as stochastic mad libs. That view of them is rooted in 2024 to mid 2025 LLMs. Things have qualitatively shifted since then. You can still be opposed to them, you just need to understand how quickly the sands are shifting.
 
Upvote
16 (21 / -5)

dsync

Wise, Aged Ars Veteran
191
In recent weeks, Mythos has identified thousands of so-called zero-day—previously undiscovered—vulnerabilities and other security flaws, many of which are critical and have persisted for a decade or more.


In one example, it found a 16-year-old flaw in widely used video software, in a line of code that automated testing tools had executed 5 million times without detecting the issue.

I am not an AI-skeptic at all, but this is marketing.

Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are widely detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.

Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero day we generated!".

Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory leak detection, guided SE analysis, guided fuzzing, etc.

Vuln-dev as a field is simultaneously considerably more advanced than people think (rule-based systems and quantitative ML were used to guide some of these processes for years before LLMs) and so low-level that a lot of it is still "I typed a lot of A's into an interface and 0x41 showed up where it shouldn't."
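That last line is easy to demonstrate. Here's a toy Python sketch of that style of fuzzing; `parse_header` is an invented, deliberately buggy parser standing in for real target code, not any real tool's API.

```python
# Toy sketch of "type a lot of A's and see where 0x41 shows up".
# parse_header is an invented, deliberately buggy parser; no real API implied.

def parse_header(data: bytes) -> int:
    declared_len = data[0]       # trusts an attacker-controlled length byte...
    return data[declared_len]    # ...so short inputs read out of bounds

def fuzz(target, max_len=200):
    """Feed runs of 0x41 ('A') to the target; record input sizes that crash."""
    crashing_sizes = []
    for n in range(1, max_len):
        try:
            target(b"A" * n)
        except IndexError:       # a crash, which is NOT yet proof of an exploit
            crashing_sizes.append(n)
    return crashing_sizes

sizes = fuzz(parse_header)
# Every crash happens because 0x41 (the 'A' read as a length byte) exceeds
# the input size: inputs of 65 bytes or fewer crash, 66 and up parse "fine".
```

The point above survives even in the toy: the fuzzer proves a crash exists, not that it's exploitable; turning the crash into an exploit is the separate, much harder step.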
 
Upvote
19 (27 / -8)

Snackasaurus

Smack-Fu Master, in training
96
I am not an AI-skeptic at all, but this is marketing.

Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are widely detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.

Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero day we generated!".

Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory leak detection, guided SE analysis, guided fuzzing, etc.

Vuln-dev as a field is simultaneously considerably more advanced than people think (rule-based systems and quantitative ML were used to guide some of these processes for years before LLMs) and so low-level that a lot of it is still "I typed a lot of A's into an interface and 0x41 showed up where it shouldn't."
The Tom's Hardware article has a lot more info:

As those same researchers tell it, current versions of Claude are able to identify vulnerabilities well, but usually fail miserably at the task of turning those vulnerabilities into active exploits. Mythos, by contrast, is able to turn a whopping 72.4% of the vulnerabilities it identifies into successful exploits within the domain of Firefox's JavaScript shell, and it is able to achieve register control in a further 11.6% of attempted attacks.

Anthropic's Frontier Red Team extensively describes the threat that an unbridled Mythos release might have on an unsuspecting software industry, and one example of its internal benchmarking practices vividly illustrates what's at stake: "We regularly run our models against roughly a thousand open source repositories from the OSS-Fuzz corpus, and grade the worst crash they can produce on a five-tier ladder of increasing severity, ranging from basic crashes (tier 1) to complete control flow hijack (tier 5).

With one run on each of roughly 7000 entry points into these repositories, Sonnet 4.6 and Opus 4.6 reached tier 1 in between 150 and 175 cases, and tier 2 about 100 times, but each achieved only a single crash at tier 3. In contrast, Mythos Preview achieved 595 crashes at tiers 1 and 2, added a handful of crashes at tiers 3 and 4, and achieved full control flow hijack on ten separate, fully patched targets (tier 5)."
https://www.tomshardware.com/tech-i...-fix-critical-bugs-some-unpatched-for-decades
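The tier tallying in that quote is easy to picture as a small helper. The sketch below is purely illustrative: the names for tiers 2-4 are invented placeholders, and none of this reflects Anthropic's actual, unpublished grading criteria.

```python
# Illustrative sketch of a five-tier crash-severity ladder like the one in
# the Frontier Red Team quote. Tiers 1 and 5 paraphrase the quote's stated
# endpoints; tiers 2-4 are invented placeholders, not Anthropic's rubric.

TIER_NAMES = {
    1: "basic crash",
    2: "crash with attacker-influenced state",   # placeholder
    3: "controlled memory corruption",           # placeholder
    4: "partial control of execution",           # placeholder
    5: "complete control flow hijack",
}

def tally_worst_crashes(worst_tier_per_target):
    """Count how many targets reached each severity tier at worst."""
    counts = {tier: 0 for tier in TIER_NAMES}
    for tier in worst_tier_per_target:
        counts[tier] += 1
    return counts

# Invented toy run: most targets only ever yield low-tier crashes,
# which matches the shape of the numbers reported in the quote.
summary = tally_worst_crashes([1, 1, 2, 1, 2, 5])
```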
 
Upvote
46 (46 / 0)
They say it found thousands of 0-day vulnerabilities. If that's true (and we've seen enough to know that LLMs will come up with alleged vulnerabilities; they're currently flooding bug bounty programs with slop), then whether the "it's too good to give to the public" thing is marketing or not isn't particularly relevant. Walking between two buildings on a tightrope might be a publicity stunt, but that doesn't mean you couldn't have actually fallen off and died.
 
Upvote
4 (5 / -1)
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
To be fair, money was always a key factor. If you can outspend your opponent to get better programmers, you've already won the first battle. Sure, there's always the risk of some rando poking around and finding (or causing) issues, but the big boys are playing on a whole different level. When you can literally print money, there is no level of sophistication that can withstand that kind of pressure. Shit, if you aren't getting anywhere with your hackers, just bribe the person who created it!
 
Upvote
5 (5 / 0)

bugsbony

Ars Scholae Palatinae
1,050
I am not an AI-skeptic at all, but this is marketing.

Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are widely detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.

Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero day we generated!".

Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory leak detection, guided SE analysis, guided fuzzing, etc.

Vuln-dev as a field is simultaneously considerably more advanced than people think (rule-based systems and quantitative ML were used to guide some of these processes for years before LLMs) and so low-level that a lot of it is still "I typed a lot of A's into an interface and 0x41 showed up where it shouldn't."
As someone posted above, Anthropic's long technical post at https://red.anthropic.com/2026/mythos-preview/ (which really should have been linked in the article) contains many details on a few of the bugs, plus hashes for bugs yet to be revealed.
 
Upvote
14 (14 / 0)
Yeah, there's a jokey meme response to LLMs that just sees them as stochastic mad libs. That view of them is rooted in 2024 to mid 2025 LLMs. Things have qualitatively shifted since then. You can still be opposed to them, you just need to understand how quickly the sands are shifting.
"Stochastic Parrot" was from 2021, before ChatGPT even existed. I think it was pretty accurate at the time.
 
Upvote
4 (6 / -2)
Post content hidden for low score. Show…

lolware

Ars Praetorian
590
Subscriptor
Digital Magic 8 Ball (LLM) says, "YES".
This cliché is getting really old. Claude Code with Opus 4.6 is demonstrably impressive.

It is a socio-environmental disaster, it has flaws that experienced engineers should keep in check, and it's certainly not AGI or Einstein, but dismissing it as a coin toss is a lazy form of denial.

And industrializing the identification of security flaws, even if it is wrong 90% of the time, is a serious threat.
 
Upvote
54 (54 / 0)
Post content hidden for low score. Show…