> Mythos has been in use with partners for several weeks. Although it is a “general purpose” model with wider capabilities, it is the first time the company has limited the release of a model, due to its capabilities in cyber security.
> Anthropic said the software can identify cyber vulnerabilities at a scale beyond human capacity, but that it could also develop ways to exploit these vulnerabilities, which bad actors could use. The company said the model could “reshape” cyber security practices and that it does not plan a broad release.

It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.
> It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.

It IS a powerful tool, it DOES summarize vast quantities of free content online, it CAN find vulnerabilities and reply to detailed questions about them, and it CAN write code.
> It's just a marketing trick: they're trying to hype it up by portraying it as something so powerful that one should even be somewhat afraid of its capabilities. It's an age-old tactic.

Maybe so, but do have a read of the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand dollars of API spend people might be able to find bugs and potentially get control of my service, while before today you needed substantial computer science and cybersecurity knowledge to do so.
> At one point, Anthropic found that it had escaped its so-called sandbox environment—designed to prevent it from accessing the internet—and posted details of its workaround online.
> Anthropic acknowledged it demonstrated “a potentially dangerous capability for circumventing [the company’s] safeguards.”

… Jesus. Is this where we are now? That was fast.
> Maybe so, but do have a read of the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand dollars of API spend people might be able to find bugs and potentially get control of my service, while before today you needed substantial computer science and cybersecurity knowledge to do so.
> At least script kiddies had to have some kind of technical knowledge. Now the blocker to being a H4ck3r is money, and not even that much, apparently.

Even more concerning is the idea that we're that much closer to someone setting up a system that fully automates finding vulnerabilities and exploiting them. By which I mean, a human sets up the system but the system does everything after that. Since the "exploiting" part of the system is a black box, it could potentially set up its own compute and automate itself in another environment. And even if the original human realized something had gone wrong and shut their own system down, you'd still have the rogue copy floating around out there, potentially even replicating itself elsewhere.
> Maybe so, but do have a read of the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand dollars of API spend people might be able to find bugs and potentially get control of my service, while before today you needed substantial computer science and cybersecurity knowledge to do so.
> At least script kiddies had to have some kind of technical knowledge. Now the blocker to being a H4ck3r is money, and not even that much, apparently.

Even better: in 6-12 months, it will be a tenth of the price for comparable capabilities in an open-weights model, which people can post-train to remove alignment and AI-safety guardrails.
> It IS a powerful tool, it DOES summarize vast quantities of free content online, it CAN find vulnerabilities and reply to detailed questions about them, and it CAN write code.
> So, YES, it IS a marketing trick, but NO, this isn't nothing. It's a real threat to open source software, for one thing. Software companies will become much more secretive to protect business - and their own jobs.

If their claims hold up, becoming more secretive will not help. From Anthropic’s writeup:

> We’ve used these capabilities to find vulnerabilities and exploits in closed-source browsers and operating systems. We have been able to use it to find, for example, remote DoS attacks that could remotely take down servers, firmware vulnerabilities that let us root smartphones, and local privilege escalation exploit chains on desktop operating systems.

If you want more details, Anthropic’s writeup has them. As I mentioned above, I am skeptical of Anthropic’s claims, and of AI claims in general, but I have also talked with cybersecurity researchers using other LLM models in a similar fashion, and in some cases they are getting very good results. AI and LLMs are very much overhyped, but that doesn’t mean there are no niche use cases where they might be useful.
> At one point, Anthropic found that it had escaped its so-called sandbox environment—designed to prevent it from accessing the Internet—and posted details of its workaround online.

> … Jesus. Is this where we are now? That was fast.

This story is a bit of an exaggeration. It was instructed to find its way out of the sandbox. It wasn't told to write an internet post about it, but it's not that much of a leap.
> Soon though, the model is going to be released to the public, and it will be a big deal for entirely different reasons.

They explicitly say in the article, twice, that they will not be doing that.
> They explicitly say in the article, twice, that they will not be doing that.

They are not releasing "Mythos Preview". Presumably they will release Mythos eventually.
> We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring. To do so, we need to make progress in developing cybersecurity (and other) safeguards that detect and block the model’s most dangerous outputs. We plan to launch new safeguards with an upcoming Claude Opus model, allowing us to improve and refine them with a model that does not pose the same level of risk as Mythos Preview.

It is interesting that they are planning on releasing an in-between-level model first.
> It's not a "Cybersecurity AI model", it's an LLM.

That's generally bad news, as I don't think it's going to be that difficult for someone else to optimize a model to hack.
I know you're referencing the FT, but the story is misleading. Mythos is a new general-purpose LLM that does everything: tells jokes, writes essays, solves math problems, writes code, etc. It just happens that this iteration is so good at writing and analyzing code that it is able to hack anything right now. Soon, though, the model is going to be released to the public, and it will be a big deal for entirely different reasons.
> Ummm... yeah. And Cisco and CrowdStrike are onboard with this?

I don't think they have much choice. This isn't going to be unique to Anthropic. In a few months, every LLM in the world will be able to find the same vulnerabilities and write the same exploits.
> That's generally bad news, as I don't think it's going to be that difficult for someone else to optimize a model to hack.

They are saying they won't release it until it's safe. That said, it's bad news inasmuch as there will be more models that are just as capable. And yes, according to them, it takes no skill to use.
> Its new model, Claude Mythos Preview, would be available only to vetted organizations, including Broadcom, Cisco, and CrowdStrike, Anthropic said on Tuesday. The company added it was also in discussions with the US government about its use.

Okay, another AI thing that's supposed to be secure. I'll bet they're very careful about how they handle their data and such.
> The announcement follows a data leak by the San Francisco start-up last month, when descriptions of the Mythos model and other documents were discovered in a publicly accessible data cache.
> Last week, Anthropic suffered a second incident, leading to the internal source code for its personal assistant, Claude Code, being made public.

Well, you know, things like that CAN happen, but it's not a good look. I'll bet they locked things down super tight after that...
> I am still an AI-skeptic. Significantly so. I think its error rate will lead to pretty large harms, and it is already causing cognitive decay at a significant scale - not to mention the mass-scale disinfo happening. Yes, there's also a lot of marketing at work here - some of the "woah it's so scary" is to hype it up. Additionally, the economics for the companies are pretty crazy. Finally, I do not think AGI is a reasonable outcome here.
> That said, I no longer think it's viable to oppose LLMs on the grounds of "it's actually fancy auto-complete and totally useless in every context and any reports on what it is actually good for are nonsense hype". I think that ship has sailed. That doesn't mean I'm pro-LLM.
> As someone who did political comms for a long time - if you want to be persuasive that we should be more worried about the harms of LLMs, you CANNOT just run around saying things that don't align with people's experiences. More and more people are trying and using LLMs. If your argument is "they do nothing and don't work and are 100% trash", they've seen it work for them, and will dismiss what you are saying out of hand rather than decide their own experience with LLMs was fake or wrong. Just a tip for the fellow skeptics out here.

Yeah, there's a jokey meme response to LLMs that just sees them as stochastic mad libs. That view of them is rooted in 2024 to mid-2025 LLMs. Things have qualitatively shifted since then. You can still be opposed to them; you just need to understand how quickly the sands are shifting.
> In recent weeks, Mythos has identified thousands of so-called zero-day—previously undiscovered—vulnerabilities and other security flaws, many of which are critical and have persisted for a decade or more.
> In one example, it found a 16-year-old flaw in widely used video software, in a line of code that automated testing tools had executed 5 million times without detecting the issue.
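For a sense of how a line of code can execute millions of times without revealing its bug, here is a hypothetical sketch (not the actual flaw described above, whose details are not public): a copy routine whose "hot" line runs on every input, but which only misbehaves on inputs of one exact length.

```c
/* Hypothetical sketch: a bug that survives millions of executions.
 * The copy below is fine for every input except one whose length is
 * exactly FIELD_LEN, where the NUL terminator lands one byte out of
 * bounds. Coverage tools count the line as exercised either way. */
#include <string.h>

#define FIELD_LEN 64

/* Copies src into a FIELD_LEN-byte buffer. Returns 0 on success,
 * -1 if the input is too long to copy. */
int copy_field(char dst[FIELD_LEN], const char *src) {
    size_t n = strlen(src);
    if (n > FIELD_LEN)        /* off-by-one: should be n >= FIELD_LEN */
        return -1;
    memcpy(dst, src, n);      /* runs on every accepted input */
    dst[n] = '\0';            /* writes dst[64] when n == 64: out of bounds */
    return 0;
}
```

The overflow only fires when `strlen(src)` is exactly 64, so line coverage alone says nothing about whether the dangerous case was ever exercised.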
> I am not an AI-skeptic at all, but this is marketing.
> Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are readily detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.
> Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero-day we generated!".
> Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory-leak detection, guided SE analysis, guided fuzzing, etc.
> Vuln-dev as a field is simultaneously considerably more advanced than people think, as rule-based systems and quantitative ML were used for years prior to LLMs to guide some of these processes anyway -- and so low-level that a lot of it is still "I typed a lot of A's in an interface and 0x41 showed up where it shouldn't."...

The Tom's Hardware article has a lot more info:
https://www.tomshardware.com/tech-i...-fix-critical-bugs-some-unpatched-for-decades

> As those same researchers tell it, current versions of Claude are able to identify vulnerabilities well, but usually fail miserably at the task of turning those vulnerabilities into active exploits. Mythos, by contrast, is able to turn a whopping 72.4% of the vulnerabilities it identifies into successful exploits within the domain of Firefox's JavaScript shell, and it is able to achieve register control in a further 11.6% of attempted attacks.
> Anthropic's Frontier Red Team extensively describes the threat that an unbridled Mythos release might pose to an unsuspecting software industry, and one example of its internal benchmarking practices vividly illustrates what's at stake: "We regularly run our models against roughly a thousand open source repositories from the OSS-Fuzz corpus, and grade the worst crash they can produce on a five-tier ladder of increasing severity, ranging from basic crashes (tier 1) to complete control flow hijack (tier 5). With one run on each of roughly 7000 entry points into these repositories, Sonnet 4.6 and Opus 4.6 reached tier 1 in between 150 and 175 cases, and tier 2 about 100 times, but each achieved only a single crash at tier 3. In contrast, Mythos Preview achieved 595 crashes at tiers 1 and 2, added a handful of crashes at tiers 3 and 4, and achieved full control flow hijack on ten separate, fully patched targets (tier 5)."
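Mechanically, the grading described above amounts to keeping the worst result per target and tallying targets per tier. A minimal sketch; note that only tiers 1 and 5 are named in the quoted text, so the intermediate tier labels here are placeholders, not Anthropic's rubric:

```c
/* Minimal sketch of "worst crash per target" grading. Tiers 2-4 are
 * unnamed placeholders; the writeup only names tier 1 (basic crash)
 * and tier 5 (complete control flow hijack). */
#include <stddef.h>

enum tier {
    NO_CRASH = 0,
    BASIC_CRASH = 1,          /* named in the writeup */
    TIER_2 = 2,               /* intermediate tiers: names not public */
    TIER_3 = 3,
    TIER_4 = 4,
    CONTROL_FLOW_HIJACK = 5   /* named in the writeup */
};

/* A target's grade is the worst (highest) tier any run achieved. */
enum tier worst_tier(const enum tier *runs, size_t n_runs) {
    enum tier worst = NO_CRASH;
    for (size_t i = 0; i < n_runs; i++)
        if (runs[i] > worst)
            worst = runs[i];
    return worst;
}

/* Tallies how many targets ended at each tier. Caller must pass a
 * zero-initialized counts[6]. */
void tally(const enum tier *per_target, size_t n_targets, size_t counts[6]) {
    for (size_t i = 0; i < n_targets; i++)
        counts[per_target[i]]++;
}
```

Under this scheme, the reported "595 crashes at tiers 1 and 2" would be `counts[1] + counts[2]` over the roughly 7000 entry points.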
> It’s when it escapes the sandbox, but intentionally doesn’t tell you about it, that’s when I move to Montana.

I got bad news for you about what is in Montana... That's where we put most of the nuclear silos lol
> Maybe so, but do have a read of the technical post linked by poke 532810 above. As a developer of a web-facing product, it is a bit scary to think that with a few thousand dollars of API spend people might be able to find bugs and potentially get control of my service, while before today you needed substantial computer science and cybersecurity knowledge to do so.
> At least script kiddies had to have some kind of technical knowledge. Now the blocker to being a H4ck3r is money, and not even that much, apparently.

To be fair, money was always a key factor. If you can outspend your opponent to get better programmers, you've already won the first battle. Sure, there's always the risk of some rando poking around and finding (or causing) issues, but the big boys are playing on a whole different level. When you can literally print money, there is no level of sophistication that can withstand that kind of pressure. Shit, if you aren't getting anywhere with your hackers, just bribe the person who created it!
> I am not an AI-skeptic at all, but this is marketing.
> Vulnerabilities are tricky to quantify from code alone. There are thousands of "zero-days" that are readily detectable but generally aren't exploitable and don't necessarily sit on interfaces that take unsanitized input.
> Models doing code audits in isolation tend to find lots of vulnerabilities because they lack execution context and have trouble keeping sanitization in model context memory. So every double free or whatever becomes "look at the zero-day we generated!".
> Where these things tend to become really useful in vuln-dev toolchains is in automating and guiding other processes that do static analysis, memory analysis, memory-leak detection, guided SE analysis, guided fuzzing, etc.
> Vuln-dev as a field is simultaneously considerably more advanced than people think, as rule-based systems and quantitative ML were used for years prior to LLMs to guide some of these processes anyway -- and so low-level that a lot of it is still "I typed a lot of A's in an interface and 0x41 showed up where it shouldn't."...

As someone posted above, Anthropic's long technical post https://red.anthropic.com/2026/mythos-preview/ (which really should have been linked in the article) contains many details on a few bugs, and hashes for bugs to be revealed.
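The "every double free becomes a zero-day" failure mode mentioned above can be made concrete. A hypothetical sketch: read in isolation, `destroy()` looks like a textbook double free, but the guard inside `release()` makes the second call a defined no-op, because `free(NULL)` does nothing. An auditor (human or model) that loses the body of `release()` from context sees only two frees.

```c
/* Hypothetical sketch: a "double free" that a code-only audit flags,
 * but that execution context renders harmless. release() NULLs the
 * pointer after freeing, so the second call passes NULL to free(),
 * which the C standard defines as a no-op. */
#include <stdlib.h>

struct conn {
    char *payload;
};

static void release(struct conn *c) {
    free(c->payload);
    c->payload = NULL;   /* the guard that has to stay "in context" */
}

void destroy(struct conn *c) {
    release(c);          /* frees payload, clears the pointer */
    release(c);          /* looks like a double free; actually free(NULL) */
}
```

Whether such a finding is a real vulnerability depends on details like this, plus whether the call site is even reachable from untrusted input, which is exactly why code-only triage overproduces "zero-days".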
> Yeah, there's a jokey meme response to LLMs that just sees them as stochastic mad libs. That view of them is rooted in 2024 to mid-2025 LLMs. Things have qualitatively shifted since then. You can still be opposed to them; you just need to understand how quickly the sands are shifting.

"Stochastic parrot" is from 2021, before ChatGPT existed. I think it was pretty accurate at the time.
> It’s when it escapes the sandbox, but intentionally doesn’t tell you about it, that’s when I move to Montana.

Are you going to be a dental floss tycoon?
> I thought the US Gov and Anthropic were on the outs because Baby Trump Hegseth had a hissy fit.
> So who in the US Gov is engaging with Anthropic?

For system cracking? I'd guess the CIA, NSA, and FBI, for starters.
> Digital Magic 8 Ball (LLM) says, "YES".

This cliché is getting really old. Claude Code with Opus 4.6 is demonstrably impressive.