The Day After the Zero-Days

Wed, 20 May 2026 00:00:00 +0000

Around August 1996 I traded private email with Wietse Venema about a TCP source-routing spoofing attack I had demonstrated against the BSD r-services trust model. Wietse later left a note in fix_options.c recording the exchange: “I discussed this attack with Niels Provos, half a year before the attack was described in open mailing lists.” The Secure Networks Inc. advisory followed in February 1997, credited to me through Oliver Friedrichs. That advisory is the first hard timestamp on my work in vulnerability research, now thirty years out from May 2026.

In November 1998 I committed the OpenBSD TCP SACK implementation. It contained a signed-integer comparison bug that, given the right sequence numbers, drove a NULL dereference and panicked the kernel. The bug remained in production for twenty-seven years. The OpenBSD codebase is unusually scrutinized, both by the project and by every system that ships from it. I had reviewed the code, committed it, and forgotten about it.
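To make the bug class concrete, here is a minimal sketch of the conventional way TCP stacks compare 32-bit sequence numbers: take the difference and read it as a signed integer. This is not the OpenBSD SACK code; the function names are hypothetical and chosen only for illustration. It shows one well-known edge of the technique: two values exactly 2^31 apart each compare as “before” the other, so a range check built on such comparisons can pass when it should not.

```python
MASK32 = 0xFFFFFFFF


def to_int32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def seq_lt(a: int, b: int) -> bool:
    """Wraparound-aware 'a is before b', using the signed 32-bit difference."""
    return to_int32(a - b) < 0


if __name__ == "__main__":
    # Ordinary wraparound: 0xFFFFFFF0 sits just before 0x10.
    print(seq_lt(0xFFFFFFF0, 0x10), seq_lt(0x10, 0xFFFFFFF0))
    # Edge case: values exactly 2**31 apart are each "before" the other.
    print(seq_lt(0, 0x80000000), seq_lt(0x80000000, 0))
```

The comparison is only meaningful while the two values stay within half the sequence space of each other. Code that accepts attacker-supplied sequence numbers has to enforce that precondition itself.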

On April 7, 2026, twelve years to the day after the public disclosure of OpenSSL Heartbleed, Anthropic’s Mythos agent surfaced the bug. As Activ8te, we marked the same anniversary with a Dance Pop Love song called Heartbleed.

Thirty years finding security vulnerabilities. Twenty-seven years quietly creating one. The second number is the more useful data point about what careful human review will and will not catch.

In 1996, reading the source of Wietse Venema’s TCP Wrappers, I noticed it stripped loose source-routing options before the r-services daemons ever saw them, leaving those daemons unable to refuse a spoofed connection. I built the attack, reached out to Wietse, and later demonstrated it live at HIP'97. Finding it depended on one person reading the right code. Thirty years ago that was the shape of vulnerability discovery; today an AI harness does the same work autonomously. Mythos ran on the OpenBSD kernel sources and surfaced the SACK bug. I reproduced the result independently with my open-source IronCurtain framework against Opus 4.7, Sonnet 4.6, and Z.AI’s open-weight GLM 5.1. My previous post walks through the orchestration mechanics in detail. Vulnerability discovery is an orchestration problem, not a frontier-model problem.

IronCurtain is open source. The capability is neither a defender’s secret nor an attacker’s. Any motivated adversary already has the equivalent or will assemble one in a quarter. The right baseline assumption is parity on capability. Plan from there.

A workflow that does not require a frontier model

For readers who skipped the previous post, the discipline that closes the gap with frontier capability is a finite-state machine. An orchestrator that is not itself an LLM routes specialized agents through prescribed states: analyze, hypothesize, build harness, validate, triage. Each state begins with a fresh context window rehydrated from an append-only journal on disk. States emit verdicts from a fixed set, so a model cannot end an investigation by declaring “looks fine” through a subtly wrong path; the machine refuses the verdict.
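The journal discipline described above can be sketched in a few lines. This is an illustrative assumption about the shape of such a journal, not IronCurtain's actual format: the JSON-lines layout and the names `append_entry` and `rehydrate` are hypothetical. The point is that a state's context is rebuilt only from what is on disk, and that earlier entries are never rewritten.

```python
import json
import tempfile
from pathlib import Path


def append_entry(journal: Path, state: str, note: str) -> None:
    """Append one JSON line; existing lines are never modified."""
    with journal.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"state": state, "note": note}) + "\n")


def rehydrate(journal: Path) -> list[dict]:
    """Rebuild the context for a fresh state from the journal alone."""
    if not journal.exists():
        return []
    with journal.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        journal = Path(tmp) / "journal.jsonl"
        append_entry(journal, "analyze", "candidate: length check in parser")
        append_entry(journal, "hypothesize", "length field may go negative")
        # A new state starts with nothing in memory and reads the journal.
        for entry in rehydrate(journal):
            print(entry["state"], "->", entry["note"])
```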

flowchart TB
    analyzeNode[analyze]
    orchestratorNode[orchestrator]
    harnessNode["harness pipeline"]
    discoverNode[discover]
    triageNode[triage]
    escalationNode["human escalation"]
    concludeNode[conclude]
    analyzeNode --> orchestratorNode
    orchestratorNode -- reanalyze --> analyzeNode
    orchestratorNode -- harness_design --> harnessNode
    orchestratorNode -- discover --> discoverNode
    orchestratorNode -- triage --> triageNode
    orchestratorNode -- escalate --> escalationNode
    orchestratorNode -- complete --> concludeNode

Every transition out of the orchestrator is a verdict from a closed set. Bounded loops carry visit caps; a stall forces human escalation rather than a model-led “looks fine” finish.
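A minimal sketch of that routing logic follows. The verdict names and states mirror the diagram, but the structure is my own illustration, not the framework's code: `VISIT_CAP`, `ROUTES`, and `run` are hypothetical names, the cap of three is arbitrary, and treating an out-of-set verdict as an escalation is one possible policy among several.

```python
from enum import Enum
from typing import Callable


class Verdict(Enum):
    REANALYZE = "reanalyze"
    HARNESS_DESIGN = "harness_design"
    DISCOVER = "discover"
    TRIAGE = "triage"
    ESCALATE = "escalate"
    COMPLETE = "complete"


# Closed routing table: each verdict leads to exactly one state.
ROUTES = {
    Verdict.REANALYZE: "analyze",
    Verdict.HARNESS_DESIGN: "harness pipeline",
    Verdict.DISCOVER: "discover",
    Verdict.TRIAGE: "triage",
    Verdict.ESCALATE: "human escalation",
    Verdict.COMPLETE: "conclude",
}
TERMINAL = {"human escalation", "conclude"}
VISIT_CAP = 3  # hypothetical bound on visits per state


def run(agent: Callable[[str], str], start: str = "analyze") -> list[str]:
    """Route an agent through states and return the path taken."""
    visits: dict[str, int] = {}
    state, path = start, [start]
    while state not in TERMINAL:
        visits[state] = visits.get(state, 0) + 1
        if visits[state] > VISIT_CAP:
            # A stalled loop goes to a human, never to a silent finish.
            state = "human escalation"
        else:
            try:
                verdict = Verdict(agent(state))
            except ValueError:
                # Free text such as "looks fine" is not in the closed set.
                verdict = Verdict.ESCALATE
            state = ROUTES[verdict]
        path.append(state)
    return path


if __name__ == "__main__":
    print(run(lambda state: "looks fine"))
    print(run(lambda state: "reanalyze"))
```

The orchestrator itself is plain deterministic code. The model only supplies a string, and the only strings that move the machine are the six verdicts.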

The workflow does not depend on access to frontier weights. Open-weight GLM 5.1, served from Z.AI, drove autonomous discovery on a foundational library with no manual steering. The model is large and does not run on a laptop, but anyone who can pay for an API endpoint or rent a multi-GPU instance can drive it. Per-audit cost runs in the tens to low hundreds of dollars, low enough that every dependency in a production stack is in scope for routine review.

Why patch-cadence defense is failing

The standard CISO posture for the last decade has been to stand up a vulnerability-management program, layer in third-party SaaS for posture management and threat detection, and treat that combination as the working defense. Each CVE that lands triggers a triage cycle and a remediation ticket. The cycle never empties.

This model held together for as long as the rate at which bugs surfaced was bounded by the human labor available to find them. AI has removed that bound. Discovery now scales with API credit. Patch consumption does not. The window between disclosure and weaponization keeps shrinking; the window between patch availability and patch propagation through downstream consumers does not.

A second failure mode is structural rather than temporal. Vulnerability management is reactive by design. It chases bugs the world has already discovered, without eliminating the attack surface that produced them or stacking the defense in depth that would make the next one less consequential. Most CISOs I have spoken with know this. The constraint is rarely the analysis. The constraints are budget, executive priority, and access to engineers who can build proactive controls. Those engineers were, for most of the last decade, in tight supply and routed by economic gravity into a handful of large technology companies. A CISO without that talent and without budget to attract it can fund a vendor and a triage queue. The proactive work stalls, and when an incident lands, the CISO is the named person on the postmortem.

That is the working environment patch-cadence defense actually operates in. It collapses under a discovery-side capability shift this large.

Security invariants

The response is not to find bugs faster. It is to build infrastructure that takes attack classes off the critical path of ongoing human security decisions.

I have been developing this framing on securityblueprints.io and it has held up across enough engagements that I am willing to put weight on it. A security invariant is a machine-enforced constraint applied consistently across an infrastructure. It impedes one or more steps of an attack kill chain without requiring a per-incident human security judgment. The bug may still exist. What changes is whether an attacker can complete the chain. Invert the question: instead of asking which bugs an adversary will find, ask what an adversary can do after finding any bug.

In a companion analysis of public breach disclosures, I found that three machine-enforced invariants applied consistently would have impeded a majority of incidents in the dataset, on the order of sixty-five percent. The details and methodology live on securityblueprints.io. Most environments lack these controls because the talent and capital to build them were previously scarce.

Hardware 2FA replaces shared-secret authentication with cryptographic proof of possession of a registered device. It removes credential phishing as a viable initial-access vector. Google deployed this internally beginning around 2010 after Operation Aurora and has reported zero successful credential-phishing compromises of an employee since. The example is fifteen years old. Most enterprises still run on shared-secret authentication, because the rollout disrupts every legacy integration that was built on the assumption that passwords are sufficient.

Egress control means default-deny outbound network traffic from production. An attacker who compromises an externally exposed service and gains code execution on a host is usually not done; the meaningful damage follows the arrival of a second-stage payload and the establishment of command-and-control. Egress control removes that step. The Log4j campaigns of December 2021 are the canonical illustration. Every initial-access exploitation that mattered relied on the compromised host reaching outbound to fetch a next stage. A production environment that cannot make arbitrary outbound calls renders the entire vulnerability class less consequential, irrespective of whether the underlying Log4j bug ever gets patched on that particular host.
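The decision logic behind default-deny egress is small enough to sketch. Everything named here is an assumption for illustration: the hostnames, the `Destination` type, and `egress_allowed` are hypothetical, and a real deployment enforces the rule in a proxy, firewall, or service mesh rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Destination:
    host: str
    port: int


# Hypothetical allowlist: the only outbound destinations production needs.
ALLOWED = frozenset({
    Destination("payments.internal.example", 443),
    Destination("telemetry.internal.example", 443),
})


def egress_allowed(dest: Destination, allowlist: frozenset = ALLOWED) -> bool:
    """Default deny: a destination passes only if explicitly listed."""
    return dest in allowlist


if __name__ == "__main__":
    # A second-stage fetch to an attacker-controlled host is simply refused,
    # whether or not the exploited bug has been patched.
    print(egress_allowed(Destination("attacker.example", 1389)))
    print(egress_allowed(Destination("payments.internal.example", 443)))
```

There is no rule about what malicious traffic looks like. The policy only describes what legitimate traffic looks like, which is why it holds against exploits nobody has seen yet.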

Positive execution control means only allowlisted binaries are permitted to run. Social-engineering attacks where an attacker poses as IT support and asks a target to download and execute a “diagnostic tool” become structurally impossible. The target’s machine refuses the binary regardless of how convincing the pretext was. Microsoft has made the same observation around Smart App Control; several large enterprises have built equivalent allowlists internally for production workloads.
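As a rough sketch of the check, assuming the allowlist is keyed by SHA-256 digest of the binary (one common choice; signer-based allowlists are another), the decision reduces to a set lookup. The names `sha256_hex` and `may_execute` are hypothetical, and in practice the operating system enforces this at process launch, not a script.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest used as the identity of a binary."""
    return hashlib.sha256(data).hexdigest()


def may_execute(binary: bytes, allowlist: frozenset[str]) -> bool:
    """Allow execution only when the binary's digest is on the allowlist."""
    return sha256_hex(binary) in allowlist


if __name__ == "__main__":
    approved_build = b"contents of an approved build"
    allowlist = frozenset({sha256_hex(approved_build)})

    print(may_execute(approved_build, allowlist))
    # The pretext does not matter: an unlisted "diagnostic tool" is refused.
    print(may_execute(b"contents of a so-called diagnostic tool", allowlist))
```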

None of these are recent or novel. The argument is that they work, they have worked for over a decade in the places that adopted them, and they remove a substantial share of the breaches in the public record.

The three controls above are network and identity invariants. Stronger ones exist both below and above this layer.

At the hardware layer, memory tagging at allocation time gives the OS and runtime a structural defense against heap-class memory-safety bugs. ARM’s Memory Tagging Extension ships on Pixel devices today. Apple announced Memory Integrity Enforcement alongside the iPhone 17 lineup in September 2025. Given an MTE-capable platform and a tag-aware allocator, heap overflows that previously produced silent corruption now fault at the moment of misuse. CHERI is the research-grade extension of the same idea applied at the capability level. The framing of memory tagging as an invariant, and the observation that the hardware refresh cycle may turn over faster than the software rewrite cycle, I owe to Dino Dai Zovi. A device fleet can absorb MTE on the timescale of phone refresh cycles; rewriting the world’s C and C++ in safer languages will take considerably longer.
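The mechanism is easier to see in a toy model. The simulation below is a sketch under stated assumptions, not how any real allocator is written: it keeps MTE's 4-bit tags and 16-byte granules, but the `TaggedHeap` class is hypothetical, it hands out tags sequentially where real allocators typically randomize them, and the check that hardware performs on every load and store is modeled as an explicit comparison.

```python
GRANULE = 16      # MTE tags memory in 16-byte granules
TAG_VALUES = 16   # tags are 4 bits wide


class TagFault(Exception):
    """Raised when a pointer's tag does not match the memory it touches."""


class TaggedHeap:
    def __init__(self, size: int) -> None:
        self.mem = bytearray(size)
        self.tags = [0] * (size // GRANULE)  # tag 0 marks unallocated memory
        self.next_free = 0
        self.next_tag = 1

    def malloc(self, nbytes: int) -> tuple[int, int]:
        """Return a (address, tag) pair standing in for a tagged pointer."""
        granules = -(-nbytes // GRANULE)  # round up to whole granules
        addr = self.next_free
        if addr + granules * GRANULE > len(self.mem):
            raise MemoryError("toy heap exhausted")
        tag = self.next_tag
        # Cycle through 1..15 so neighbouring allocations always differ.
        self.next_tag = tag % (TAG_VALUES - 1) + 1
        first = addr // GRANULE
        for g in range(first, first + granules):
            self.tags[g] = tag
        self.next_free = addr + granules * GRANULE
        return addr, tag

    def store(self, ptr: tuple[int, int], offset: int, value: int) -> None:
        """Write one byte, faulting if the pointer tag and memory tag differ."""
        addr, tag = ptr
        target = addr + offset
        if not 0 <= target < len(self.mem) or self.tags[target // GRANULE] != tag:
            raise TagFault(f"tag mismatch at offset {offset}")
        self.mem[target] = value


if __name__ == "__main__":
    heap = TaggedHeap(64)
    a = heap.malloc(16)
    heap.malloc(16)          # neighbouring allocation with a different tag
    heap.store(a, 15, 0x41)  # last byte of a: permitted
    try:
        heap.store(a, 16, 0x41)  # one byte past a: lands in the neighbour
    except TagFault as exc:
        print("fault:", exc)
```

Note the granularity limit the model inherits from the real design: an overflow that stays inside the final 16-byte granule of its own allocation carries a matching tag and is not caught.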

At the application and data layer, Context-Aware Data Access (CADA) is the strongest invariant we have developed against insider risk. The model is simple to state. Ambient access to sensitive data and systems is removed. Every read or operation requires a contemporaneous, verifiable business justification, such as an assigned support ticket. A separate system validates that the justification exists and is current. A compromised operator account, or a malicious operator, cannot pull data or access systems without a real-world reason another system can independently check.

The concrete shape of this is third-party customer-support vendors. Historically those vendors received powerful tools and broad read access to a customer database, on the theory that they would only use the access when they needed to. Context-Aware Data Access scopes them to the customer record attached to the ticket they were assigned. They cannot browse or pivot. The practical effect is that breach exposure from any single compromised vendor account collapses from the entire customer base to a single record. The cost of compromise stops scaling with the size of the company.
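A minimal sketch of the ticket-scoped check, under assumptions I am making for illustration: the `Ticket` fields, the `TicketSystem` stand-in for the independent validating system, and the `may_read` function are all hypothetical names, and a real deployment would put this check in the data-access path rather than in the support tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    assignee: str
    customer_id: str
    expires_at: datetime


class TicketSystem:
    """Stand-in for the separate system that vouches for a justification."""

    def __init__(self, tickets: list[Ticket]) -> None:
        self._tickets = {t.ticket_id: t for t in tickets}

    def lookup(self, ticket_id: str) -> Optional[Ticket]:
        return self._tickets.get(ticket_id)


def may_read(operator: str, customer_id: str, ticket_id: str,
             tickets: TicketSystem, now: datetime) -> bool:
    """Grant access only to the record on a current ticket assigned to the operator."""
    ticket = tickets.lookup(ticket_id)
    return (
        ticket is not None
        and ticket.assignee == operator
        and ticket.customer_id == customer_id
        and now < ticket.expires_at
    )


if __name__ == "__main__":
    now = datetime(2026, 5, 20, 12, 0, tzinfo=timezone.utc)
    system = TicketSystem([
        Ticket("T-1", "agent-7", "cust-42", now + timedelta(hours=4)),
    ])
    print(may_read("agent-7", "cust-42", "T-1", system, now))  # assigned record
    print(may_read("agent-7", "cust-99", "T-1", system, now))  # attempted pivot
```

Because the operator's role never appears in the check, stealing a privileged account yields only the records on that account's open tickets.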

The July 2020 Twitter administrative-tool compromise is the canonical illustration of the failure mode this invariant prevents. Attackers social-engineered access to internal administrative tooling and used it to hijack over 130 high-profile accounts for a Bitcoin scam. The blast radius was set by the scope of the tool itself, which could touch any user record. Context-Aware Data Access breaks that coupling. An account with administrative privilege can only act on the records currently attached to a valid ticket, and the set of such records is small at any moment.

What changed

A major hurdle has always been cost. Designing, deploying, and maintaining invariants required specialized engineers who were in scarce supply and were being recruited into the same handful of large companies that already had the controls in place. The vendor market did not fill the gap. It sells detection, not invariants.

AI shifts the equation. Deploying egress control used to require a few senior platform-security engineers to survey the environment, design the control, roll it out incrementally, and lock it down. With the right spec, a single AI-assisted engineer can do most of that work. I could write such a spec, walking the engineer from a production-traffic survey through enforcement-substrate selection, monitor-mode deployment, exception-list generation, and progressive denial, phase by phase. Maintenance remains, but the budget and talent barrier that kept invariants out of most organizations has fallen.
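One phase of that rollout, exception-list generation from monitor-mode traffic, can be sketched as follows. The flow format, the `min_count` threshold, and both function names are assumptions made for illustration; a real survey would also account for periodic jobs that appear rarely, which is why the sketch routes one-off destinations to human review instead of allowing or dropping them silently.

```python
from collections import Counter

Flow = tuple[str, int]  # (destination host, destination port)


def propose_allowlist(observed: list[Flow], min_count: int = 2) -> list[Flow]:
    """Propose destinations seen at least min_count times in monitor mode."""
    counts = Counter(observed)
    return sorted(flow for flow, n in counts.items() if n >= min_count)


def needs_review(observed: list[Flow], min_count: int = 2) -> list[Flow]:
    """Destinations seen too rarely to allow automatically."""
    counts = Counter(observed)
    return sorted(flow for flow, n in counts.items() if n < min_count)


if __name__ == "__main__":
    # Hypothetical flows logged while the control only observes.
    flows = [
        ("payments.internal.example", 443),
        ("payments.internal.example", 443),
        ("telemetry.internal.example", 443),
        ("telemetry.internal.example", 443),
        ("unknown-host.example", 8443),
    ]
    print("propose:", propose_allowlist(flows))
    print("review: ", needs_review(flows))
```

Progressive denial then means enforcing the proposed list on a small slice of hosts, watching for breakage, and widening the slice once the review queue is empty.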

The day after

Discovery is no longer constrained the way it was thirty years ago. The same workflow that rediscovered my own twenty-seven-year-old SACK bug is open source, runs against arbitrary code, and operates without acceptable-use friction in the hands of an adversary.

There is no winning this race on patch cadence. The path forward is to make the bug class irrelevant. Vulnerability management continues, at a fraction of the operational weight it carries today. The infrastructure disrupts the kill chain regardless of patch timing.

Build the invariants. Hardware 2FA. Default-deny egress. Allowlisted execution. Memory tagging at the hardware layer. Context-aware data access at the data layer. None is new. All are within reach of a single engineer with AI in the loop. The day after the next zero-day, the only thing that will matter is whether you built them in time.