Anthropic limits access to Mythos, its new cybersecurity AI model

It's not a "Cybersecurity AI model", it's an LLM.

I know you're referencing the FT, but the story is misleading. Mythos is a new general-purpose LLM that does everything: tells jokes, writes essays, solves math problems, writes code, etc.

It just happens that this current iteration is so good at writing and analyzing code that it is able to hack anything right now. At some point, if they can make it safe, the model is going to be released to the public, and it will be a big deal for entirely different reasons.
 
Last edited:
Upvote
79 (100 / -21)

WereCatf

Ars Tribunus Militum
2,883
Mythos has been in use with partners for several weeks. Although it is a “general purpose” model with wider capabilities, it is the first time the company has limited release of a model, due to its capabilities in cyber security.

Anthropic said the software can identify cyber vulnerabilities at a scale beyond human capacity but it could also develop ways to exploit these vulnerabilities, which bad actors could use. The company said the model could “reshape” cyber security practices and does not plan a broad release.
It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.
 
Upvote
119 (150 / -31)

solomonrex

Ars Legatus Legionis
13,545
Subscriptor++
It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.
It IS a powerful tool, it DOES summarize vast quantities of free content online that CAN find vulnerabilities and reply to detailed questions about them, and it CAN write code.

So, YES, it IS a marketing trick, but NO this isn't nothing. It's a real threat to open source software, for one thing. Software companies will become much more secretive to protect business - and their own jobs.
 
Upvote
-11 (34 / -45)

peterford

Ars Praefectus
4,286
Subscriptor++
This is both fantastic marketing and a potential foretaste of what could happen when/if these models get better than the average coder.

Given that Claude Code is extremely popular (no personal usage myself), it's clear that the models have increased in capability over the past few years, and it's no longer reasonable to claim that improvements are going to suddenly stop. That's aside from moral or environmental considerations.
 
Upvote
33 (44 / -11)
Post content hidden for low score. Show…

rfboi

Seniorius Lurkius
4
Has anyone found these alleged posts or verified which websites the LLM posted on? This sounds like a lot of marketing BS. I’d appreciate more skepticism on this overall story. While some of the details are verifiable and noteworthy, namely that the model is good at finding exploits, the rest seems like a salacious story intended to build hype.
 
Upvote
52 (64 / -12)

LeoRed

Wise, Aged Ars Veteran
137
It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
 
Upvote
81 (84 / -3)

J.D.M

Wise, Aged Ars Veteran
176
At one point, Anthropic found that it had escaped its so-called sandbox environment—designed to prevent it from accessing the internet—and posted details of its workaround online.
Anthropic acknowledged it demonstrated “a potentially dangerous capability for circumventing [the company’s] safeguards.”

Ummm... yeah.
And Cisco and CrowdStrike are onboard with this?
 
Upvote
14 (15 / -1)

Dachannien

Ars Scholae Palatinae
1,155
Subscriptor
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
Even more concerning is the idea that we're that much closer to someone setting up a system that fully automates finding vulnerabilities and exploiting them. By which I mean, a human sets up the system but the system does everything after that. Since the "exploiting" part of the system is a black box, it could potentially set up its own compute and automate itself in another environment. And even if the original human realized something had gone wrong and shut their own system down, you'd still have the rogue copy floating around out there, potentially even replicating itself elsewhere.
 
Upvote
22 (26 / -4)
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
Even better: in 6-12 months there will be an open-weights model with comparable capabilities at 1/10th the price, which people can post-train to remove alignment and AI safety guardrails.
 
Upvote
48 (50 / -2)
It IS a powerful tool, it DOES summarize vast quantities of free content online that CAN find vulnerabilities and reply to detailed questions about them, and it CAN write code.

So, YES, it IS a marketing trick, but NO this isn't nothing. It's a real threat to open source software, for one thing. Software companies will become much more secretive to protect business - and their own jobs.
If their claims hold up, becoming more secretive will not help. From Anthropic’s writeup:
We’ve used these capabilities to find vulnerabilities and exploits in closed-source browsers and operating systems. We have been able to use it to find, for example, remote DoS attacks that could remotely take down servers, firmware vulnerabilities that let us root smartphones, and local privilege escalation exploit chains on desktop operating systems.
If you want more details, Anthropic’s writeup has them. As I mentioned above, I am skeptical of Anthropic’s claims, and of AI claims in general, but I have also talked with cybersecurity researchers using other LLMs in a similar fashion, and in some cases they are getting very good results. AI and LLMs are very much overhyped, but that doesn’t mean there are no niche use cases where they might be useful.
 
Upvote
61 (62 / -1)
At one point, Anthropic found that it had escaped its so-called sandbox environment—designed to prevent it from accessing the Internet—and posted details of its workaround online.

… Jesus. Is this where we are now? That was fast.
This story is a bit of an exaggeration. It was instructed to find its way out of the sandbox. It wasn't told to post details of its workaround online, but that's not much of a leap.
 
Last edited:
Upvote
33 (35 / -2)

stdaro

Ars Scholae Palatinae
718
This is just leveling the playing field with the NSA/CIA and the sophisticated non-state organized cybercrime collectives. They probably already have all the zero-days that Mythos is finding. If past experience holds, the vast majority of these will be silly bugs in non-load-bearing code, but a few will be RCE or privilege escalation in widely deployed tools.
 
Upvote
11 (15 / -4)
They explicitly say in the article, twice, that they will not be doing that.
They are not releasing "Mythos Preview". Presumably they will release Mythos eventually.

here is their full text:
We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.
It is interesting that they are planning on releasing an in-between level model first.
 
Upvote
16 (16 / 0)

motytrah

Ars Tribunus Militum
2,972
Subscriptor++
It's not a "Cybersecurity AI model", it's an LLM.

I know you're referencing the FT, but the story is misleading. Mythos is a new general-purpose LLM that does everything: tells jokes, writes essays, solves math problems, writes code, etc.

It just happens that this current iteration is so good at writing and analyzing code that it is able to hack anything right now. Soon though, the model is going to be released to the public, and it will be a big deal for entirely different reasons.
That's generally bad news as I don't think it's going to be that difficult for someone else to optimize a model to hack.
 
Upvote
5 (5 / 0)

Dumb Svengali

Ars Scholae Palatinae
653
I am still an AI-skeptic. Significantly so. I think its error rate will lead to pretty large harms, and it is already causing cognitive decay at a significant scale - not to mention the mass-scale disinfo happening. Yes, there's also a lot of marketing at work here - some of the "woah it's so scary" is to hype it up. Additionally, the economics for the companies are pretty crazy. Finally, I do not think AGI is a reasonable outcome here.

That said, I no longer think it's viable to oppose LLMs on the grounds of "it's actually fancy auto-complete and totally useless in every context and any reports on what it is actually good for are nonsense hype". I think that ship has sailed. That doesn't mean I'm pro-LLM.

As someone who did political comms for a long time - if you want to be persuasive that we should be more worried about the harms of LLMs, you CANNOT just run around saying things that don't align with people's experiences. More and more people are trying and using LLMs. If your argument is "they do nothing and don't work and are 100% trash", people who have seen them work will just dismiss what you are saying out of hand, rather than decide their own experience with/impression of LLMs was fake or wrong. Just a tip for the fellow skeptics out here.
 
Upvote
76 (80 / -4)
Ummm... yeah.
And Cisco and CrowdStrike are onboard with this?
I don't think they have much choice. This isn't going to be unique to Anthropic. In a few months, every LLM in the world will be able to find the same vulnerabilities and write the same exploits.

If what they're saying is true (big if), then we need to use LLMs to find bugs and vulnerabilities today, so that we are not hacked by LLMs tomorrow. It really is a genie out of the bottle situation.
 
Last edited:
Upvote
39 (39 / 0)
That's generally bad news as I don't think it's going to be that difficult for someone else to optimize a model to hack.
They are saying they won't release it until it's safe. That said, it's bad news inasmuch as there will be more models that are just as capable. And yes, according to them, it takes no skill to use.
 
Upvote
8 (8 / 0)

Fatesrider

Ars Legatus Legionis
25,295
Subscriptor
Its new model, Claude Mythos Preview, would be available only to vetted organizations, including Broadcom, Cisco, and CrowdStrike, Anthropic said on Tuesday. The company added it was also in discussions with the US government about its use.
Okay, another AI thing that's supposed to be secure. I'll bet they're very careful about how they handle their data and such.
The announcement follows a data leak by the San Francisco start-up last month, when descriptions of the Mythos model and other documents were discovered in a publicly accessible data cache.
Well, you know, things like that CAN happen, but it's not a good look. I'll bet they locked things down super tight after that...

Last week, Anthropic suffered a second incident, leading to the internal source code for its personal assistant, Claude Code, being made public.
🤦‍♂️
 
Upvote
13 (14 / -1)

Snackasaurus

Smack-Fu Master, in training
96
I am still an AI-skeptic. Significantly so. I think its error rate will lead to pretty large harms, and it is already causing cognitive decay at a significant scale - not to mention the mass-scale disinfo happening. Yes, there's also a lot of marketing at work here - some of the "woah it's so scary" is to hype it up. Additionally, the economics for the companies are pretty crazy. Finally, I do not think AGI is a reasonable outcome here.

That said, I no longer think it's viable to oppose LLMs on the grounds of "it's actually fancy auto-complete and totally useless in every context and any reports on what it is actually good for are nonsense hype". I think that ship has sailed. That doesn't mean I'm pro-LLM.

As someone who did political comms for a long time - if you want to be persuasive that we should be more worried about the harms of LLMs, you CANNOT just run around saying things that don't align with people's experiences. More and more people are trying and using LLMs. If your argument is "they do nothing and don't work and are 100% trash", people who have seen them work will just dismiss what you are saying out of hand, rather than decide their own experience with/impression of LLMs was fake or wrong. Just a tip for the fellow skeptics out here.
Yeah, there's a jokey meme response to LLMs that just sees them as stochastic mad libs. That view of them is rooted in 2024 to mid 2025 LLMs. Things have qualitatively shifted since then. You can still be opposed to them, you just need to understand how quickly the sands are shifting.
 
Upvote
16 (21 / -5)

dsync

Wise, Aged Ars Veteran
191
In recent weeks, Mythos has identified thousands of so-called zero-day—previously undiscovered—vulnerabilities and other security flaws, many of which are critical and have persisted for a decade or more.


In one example, it found a 16-year-old flaw in widely used video software, in a line of code that automated testing tools had executed 5 million times without detecting the issue.

I am not an AI-skeptic at all, but this is marketing.

Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are widely detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.

Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero day we generated!".

Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory leak detection, guided SE analysis, guided fuzzing, etc.

Vuln-dev as a field is simultaneously considerably more advanced than people think (rule-based systems and quantitative ML were used to guide some of these processes for years before LLMs) and so low-level that a lot of it is still "I typed a lot of A's into an interface and 0x41 showed up where it shouldn't."
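That last line is easy to demonstrate. Here's a toy Python sketch of that style of fuzzing; `parse_header` is an invented, deliberately buggy parser standing in for real target code, not any real tool's API.

```python
# Toy sketch of "type a lot of A's and see where 0x41 shows up".
# parse_header is an invented, deliberately buggy parser; no real API implied.

def parse_header(data: bytes) -> int:
    declared_len = data[0]       # trusts an attacker-controlled length byte...
    return data[declared_len]    # ...so short inputs read out of bounds

def fuzz(target, max_len=200):
    """Feed runs of 0x41 ('A') to the target; record input sizes that crash."""
    crashing_sizes = []
    for n in range(1, max_len):
        try:
            target(b"A" * n)
        except IndexError:       # a crash, which is NOT yet proof of an exploit
            crashing_sizes.append(n)
    return crashing_sizes

sizes = fuzz(parse_header)
# Every crash happens because 0x41 (the 'A' read as a length byte) exceeds
# the input size: inputs of 65 bytes or fewer crash, 66 and up parse "fine".
```

The point above survives even in the toy: the fuzzer proves a crash exists, not that it's exploitable; turning the crash into an exploit is the separate, much harder step.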
 
Upvote
19 (27 / -8)

Snackasaurus

Smack-Fu Master, in training
96
I am not an AI-skeptic at all, but this is marketing.

Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are widely detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.

Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero day we generated!".

Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory leak detection, guided SE analysis, guided fuzzing, etc.

Vuln-dev as a field is simultaneously considerably more advanced than people think (rule-based systems and quantitative ML were used to guide some of these processes for years before LLMs) and so low-level that a lot of it is still "I typed a lot of A's into an interface and 0x41 showed up where it shouldn't."
The Tom's Hardware article has a lot more info:

As those same researchers tell it, current versions of Claude are able to identify vulnerabilities well, but usually fail miserably at the task of turning those vulnerabilities into active exploits. Mythos, by contrast, is able to turn a whopping 72.4% of the vulnerabilities it identifies into successful exploits within the domain of Firefox's JavaScript shell, and it is able to achieve register control in a further 11.6% of attempted attacks.

Anthropic's Frontier Red Team extensively describes the threat that an unbridled Mythos release might have on an unsuspecting software industry, and one example of its internal benchmarking practices vividly illustrates what's at stake: "We regularly run our models against roughly a thousand open source repositories from the OSS-Fuzz corpus, and grade the worst crash they can produce on a five-tier ladder of increasing severity, ranging from basic crashes (tier 1) to complete control flow hijack (tier 5).

With one run on each of roughly 7000 entry points into these repositories, Sonnet 4.6 and Opus 4.6 reached tier 1 in between 150 and 175 cases, and tier 2 about 100 times, but each achieved only a single crash at tier 3. In contrast, Mythos Preview achieved 595 crashes at tiers 1 and 2, added a handful of crashes at tiers 3 and 4, and achieved full control flow hijack on ten separate, fully patched targets (tier 5)."
https://www.tomshardware.com/tech-i...-fix-critical-bugs-some-unpatched-for-decades
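The tier tallying in that quote is easy to picture as a small helper. The sketch below is purely illustrative: the names for tiers 2-4 are invented placeholders, and none of this reflects Anthropic's actual, unpublished grading criteria.

```python
# Illustrative sketch of a five-tier crash-severity ladder like the one in
# the Frontier Red Team quote. Tiers 1 and 5 paraphrase the quote's stated
# endpoints; tiers 2-4 are invented placeholders, not Anthropic's rubric.

TIER_NAMES = {
    1: "basic crash",
    2: "crash with attacker-influenced state",   # placeholder
    3: "controlled memory corruption",           # placeholder
    4: "partial control of execution",           # placeholder
    5: "complete control flow hijack",
}

def tally_worst_crashes(worst_tier_per_target):
    """Count how many targets reached each severity tier at worst."""
    counts = {tier: 0 for tier in TIER_NAMES}
    for tier in worst_tier_per_target:
        counts[tier] += 1
    return counts

# Invented toy run: most targets only ever yield low-tier crashes,
# which matches the shape of the numbers reported in the quote.
summary = tally_worst_crashes([1, 1, 2, 1, 2, 5])
```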
 
Upvote
46 (46 / 0)
They say it found thousands of 0-day vulnerabilities. If that's true (and we've seen enough to know that LLMs will come up with alleged vulnerabilities; they're currently flooding bug bounty programs with slop), then whether the "it's too good to give to the public" thing is marketing or not isn't particularly relevant. Walking between two buildings on a tightrope might be a publicity stunt, but that doesn't mean you couldn't have actually fallen off and died.
 
Upvote
4 (5 / -1)
Maybe so, but do have a read at the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand $ of API spend people might be able to find bugs and potentially get control of my service, while before today you needed to have substantial computer science and cybersecurity knowledge to do so.

At least script kiddies had to have some kind of technical knowledge. Now the blocker to be a H4ck3r is money, and not even that much, apparently.
To be fair, money was always a key factor. If you can outspend your opponent to get better programmers, you've already won the first battle. Sure, there's always the risk of some rando poking around and finding (or causing) issues, but the big boys are playing on a whole different level. When you can literally print money, there is no level of sophistication that can withstand that kind of pressure. Shit, if you aren't getting anywhere with your hackers, just bribe the person who created it!
 
Upvote
5 (5 / 0)

bugsbony

Ars Scholae Palatinae
1,050
I am not an AI-skeptic at all, but this is marketing.

Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are widely detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.

Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero day we generated!".

Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory leak detection, guided SE analysis, guided fuzzing, etc.

Vuln-dev as a field is simultaneously considerably more advanced than people think (rule-based systems and quantitative ML were used to guide some of these processes for years before LLMs) and so low-level that a lot of it is still "I typed a lot of A's into an interface and 0x41 showed up where it shouldn't."
As someone posted above, Anthropic's long technical post at https://red.anthropic.com/2026/mythos-preview/ (which really should have been linked in the article) contains many details on a few of the bugs, plus hashes for bugs yet to be revealed.
 
Upvote
14 (14 / 0)
Yeah, there's a jokey meme response to LLMs that just sees them as stochastic mad libs. That view of them is rooted in 2024 to mid 2025 LLMs. Things have qualitatively shifted since then. You can still be opposed to them, you just need to understand how quickly the sands are shifting.
"Stochastic Parrot" was from 2021, before ChatGPT even existed. I think it was pretty accurate at the time.
 
Upvote
4 (6 / -2)
Post content hidden for low score. Show…

lolware

Ars Praetorian
590
Subscriptor
Digital Magic 8 Ball (LLM) says, "YES".
This cliché is getting really old. Claude Code with Opus 4.6 is demonstrably impressive.

It is a socio-environmental disaster, it has flaws that experienced engineers should keep in check, and it's certainly not AGI or Einstein, but dismissing it as a coin toss is a lazy form of denial.

And industrializing the identification of security flaws, even if it is wrong 90% of the time, is a serious threat.
 
Upvote
54 (54 / 0)
Post content hidden for low score. Show…