MDASH & Mythos: Two Different Architectures. One Clear Direction.

Why the AI vulnerability discovery debate is really an architecture debate, and what it means for how organisations should think about application security.

TL;DR: MDASH and Mythos are not the same thing and are not direct competitors. One is a single frontier model; the other is a multi-agent orchestration system. The CyberGym benchmark results (MDASH at 96.55%, Mythos Preview at 83.1%) reflect an architectural choice, not a model race. Understanding the difference matters for how you plan your application security strategy.

As the AI security market moves quickly, one question comes up repeatedly: how does Microsoft's MDASH relate to Anthropic's Mythos? The answer is worth addressing directly, because the distinction shapes how organisations should think about AI-powered vulnerability discovery.

Two Different Approaches

Anthropic’s Claude Mythos

A single, highly capable foundation AI model. Anthropic built it to reason autonomously at an advanced level: discovering software vulnerabilities, chaining them into working exploits and operating with minimal scaffolding. Its strength is the depth of reasoning a frontier model brings to a single task.

MDASH (Microsoft's Multi-Model Agentic Scanning Harness)

Microsoft takes a fundamentally different approach. MDASH is not a model at all; it is an orchestration layer that coordinates more than 100 specialised AI agents across multiple models simultaneously. Agents scan source code, debate findings with one another, validate exploitability and filter noise before anything surfaces to a security team.

The Key Architectural Difference

Mythos concentrates capability in one model. MDASH distributes it across a configurable ensemble, using the right model for each task: heavyweight reasoning models where depth matters, lighter models where speed and volume matter.
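
To make the routing idea concrete, here is a minimal sketch in Python. It is illustrative only: the task kinds, the model tier names and the `route` function are hypothetical placeholders and do not describe MDASH's actual configuration or API.

```python
from dataclasses import dataclass

# Hypothetical model tiers; placeholder names, not real MDASH settings.
HEAVY = "heavyweight-reasoning-model"
LIGHT = "lightweight-fast-model"

# Assumed mapping from task kind to the tier that suits it.
ROUTING_TABLE = {
    "exploit_validation": HEAVY,    # depth of reasoning matters
    "root_cause_analysis": HEAVY,
    "file_triage": LIGHT,           # speed and volume matter
    "duplicate_filtering": LIGHT,
}


@dataclass(frozen=True)
class Task:
    kind: str
    target: str


def route(task: Task, default: str = LIGHT) -> str:
    """Return the model tier for a task, falling back to the default tier."""
    return ROUTING_TABLE.get(task.kind, default)


if __name__ == "__main__":
    for task in (
        Task("file_triage", "src/parser.c"),
        Task("exploit_validation", "src/parser.c"),
    ):
        print(task.kind, "->", route(task))
```

Keeping the mapping in one table is what makes the ensemble configurable in this sketch: changing which model handles a task kind is a one-line edit, not a change to the agents themselves.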

What the Benchmark Results Actually Tell Us

On the CyberGym benchmark, developed by UC Berkeley researchers and covering 1,507 real-world vulnerability reproduction tasks across 188 open-source projects, the scores were:

MDASH: 96.55%
Mythos Preview: 83.1%

These numbers are self-reported by the respective companies; no independent third party has verified them. That caveat noted, the gap is significant and the direction it points is clear.
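
For a sense of scale, the self-reported scores quoted in this post (96.55% and 83.1%) can be converted into approximate task counts. The short Python sketch below assumes each score is a simple success rate over all 1,507 tasks, which this post does not state, so treat the counts as rough estimates only.

```python
TOTAL_TASKS = 1507  # CyberGym task count cited in this post


def approx_solved(score_percent: float, total: int = TOTAL_TASKS) -> int:
    """Estimate tasks solved, assuming the score is a plain success rate."""
    return round(score_percent / 100 * total)


mdash_solved = approx_solved(96.55)    # self-reported MDASH score
mythos_solved = approx_solved(83.1)    # self-reported Mythos Preview score
gap_points = round(96.55 - 83.1, 2)    # gap in percentage points

if __name__ == "__main__":
    print("MDASH:", mdash_solved, "of", TOTAL_TASKS)
    print("Mythos Preview:", mythos_solved, "of", TOTAL_TASKS)
    print("Gap:", gap_points, "points, about",
          mdash_solved - mythos_solved, "tasks")
```

Under that assumption, the gap of 13.45 percentage points corresponds to roughly 200 tasks.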

MDASH did not win because Microsoft has a more powerful model. Microsoft does not. MDASH uses publicly available models. It won because the orchestration system around those models (how agents are directed, how findings are debated, how exploitability is validated) outperformed any single model working alone.

“The benchmark results confirm what we are seeing in practice. The scaffolding you build around a model (how you direct it, validate its findings and integrate it into real security workflows) matters as much as the model itself. MDASH demonstrates that multi-agent orchestration is a serious operational capability, not a research experiment.”
— Anthony Byrne, Head of Offensive Security, CyberOne

Why the Architecture Question Matters for Enterprise Security

For security leaders evaluating AI-powered vulnerability discovery, the architectural difference has practical consequences.

A single frontier model approach concentrates risk in one dependency. If that model changes, is restricted or is unavailable, the capability changes with it. A multi-model orchestration approach is model-agnostic by design: individual models within the ensemble can be swapped or updated without rebuilding the system.

There is also the question of how findings are validated. In a single-model approach, the model both identifies and assesses the finding. In MDASH's multi-agent approach, agents actively debate and challenge each other's findings before surfacing them, which is why MDASH achieved zero false positives in private testing against 21 deliberately planted vulnerabilities.
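
The idea of findings being challenged before they surface can be sketched as a simple acceptance filter. This Python example is a loose illustration under stated assumptions, not MDASH's mechanism: the `Finding` type, the `surface` function, the `quorum` parameter and the two toy reviewers are all hypothetical, and real reviewer agents would be model-backed rather than rule-based.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Finding:
    location: str
    description: str


# A reviewer is any callable that returns True if it accepts the finding.
Reviewer = Callable[[Finding], bool]


def surface(findings: List[Finding], reviewers: List[Reviewer],
            quorum: float = 1.0) -> List[Finding]:
    """Keep findings accepted by at least `quorum` fraction of reviewers."""
    if not reviewers:
        return []  # nothing surfaces without independent review
    kept = []
    for finding in findings:
        accepted = sum(1 for review in reviewers if review(finding))
        if accepted / len(reviewers) >= quorum:
            kept.append(finding)
    return kept


# Toy stand-ins for reviewer agents (hypothetical rules).
def has_location(finding: Finding) -> bool:
    return bool(finding.location)


def not_speculative(finding: Finding) -> bool:
    return "possible" not in finding.description.lower()


if __name__ == "__main__":
    candidates = [
        Finding("auth.py:42", "SQL injection via unsanitised user_id"),
        Finding("", "Possible overflow somewhere"),
    ]
    for finding in surface(candidates, [has_location, not_speculative]):
        print(finding.location, "-", finding.description)
```

With the default quorum of 1.0, a single dissenting reviewer is enough to hold a finding back, trading some recall for fewer false positives. Lowering the quorum would surface more findings at the cost of more noise.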

For DevSecOps teams operating at scale, the signal-to-noise ratio matters as much as raw detection capability. Confirmed, exploitable findings that developers can act on immediately are more valuable than high-volume alerts requiring manual triage.

The Direction This Points To

The MDASH vs Mythos comparison is not really about which vendor is ahead on a benchmark today. It is about which architectural model scales for enterprise application security over the next three to five years.

CyberOne's view: multi-agent orchestration is where enterprise application security is heading. Not because single frontier models lack capability, but because enterprise environments require flexibility, auditability and the ability to integrate security into existing development workflows. These are requirements that a composable, multi-model system is better positioned to meet.

As an MDASH Engaged Partner, CyberOne is working directly with Microsoft to help enterprise customers navigate this transition: understanding the architecture, assessing their readiness and building the operational foundations to benefit from continuous, AI-powered vulnerability discovery.

Find out what an AI-assisted attacker would discover before they do. Contact CyberOne to arrange your AI Attack Readiness Assessment.
