Effective AI-driven vulnerability discovery does not require a restricted frontier model. This is a concrete run that shows it, start to finish: I pointed IronCurtain’s vuln-discovery workflow at QEMU, with every agent role driven by the open-weight GLM 5.2. The workflow had already found a set of out-of-bounds primitives in QEMU’s device emulation, the EDU device’s among them. I then set it the goal of turning those primitives into a maximal-impact proof of concept of host compromise: a reliable guest-to-host escape that needs no host-side knowledge, defeating ASLR from inside the guest and leaving a host-visible proof of execution, all against unmodified device source. It rejected two dead-end targets, pivoted to corrupting the EDU device’s own timer callback, and drove the chain to reproducible host code execution.
The EDU device is QEMU’s educational PCI device, not compiled by default and attached only with an explicit -device edu. It is tutorial code, present on no default machine and in no production build. Xchg Labs published the complete EDU escape in Escaping QEMU on May 22 and noted, correctly, that there is nothing to disclose, because the file exists to teach new contributors how device emulation works. So I can show this chain in full. Everything else the same investigation surfaced, in devices that ship in real configurations, stays withheld pending coordination, exactly as in my earlier work.
One model drove every role
The workflow is the same hub-and-spoke state machine I described before. An orchestrator that is not itself an LLM routes specialized agents (analyze, harness design and build, validate, discover, triage, conclude) through a fixed set of states, each rehydrating a fresh context window from the journal on disk. I will not re-explain the mechanics here. What matters for this post is the model wiring and one design rule.
The model wiring is deliberately boring. The workflow pins a single model alias for all eleven agent roles, with no per-state override, so one model drives analysis, harness design, fuzzing, exploitation, and triage uniformly. A LiteLLM gateway re-routed every model call to GLM 5.2, so GLM 5.2 served every role, and I changed nothing else in the workflow.
The rule that keeps the run honest is that nothing counts as a finding until execution proves it. A sanitizer hit, a crash, or a static match is a lead, not a result; the discover state has to demonstrate adversary-maximal impact by execution before the orchestrator routes it to triage. When discover does prove a vulnerability the orchestrator triages that one, and whether it then concludes or keeps going is set by the task it was given. That gate is why the run reads as real engineering rather than a lucky first hit: it was not allowed to stop at the first plausible crash, so it did not.
The escape, as it happened
The run is exploitation, not discovery. The bug was already in hand, and the open question was where to point it to prove impact. The EDU primitive reads and writes host memory relative to the device’s DMA buffer, so turning memory corruption into code execution meant finding a function pointer the write could reach and the program would later call. The workflow spent its first several rounds on the obvious answer, a pointer in a neighboring object, and rejected two candidates in turn. Each time it built a harness, ran it, and confirmed that the corrupting write landed in an isolated region with no reachable dereference. Neither dead-end was argued away on paper. Each was killed by an executed harness that failed to produce a usable hijack.
At round eight the orchestrator pivoted to the primitive it had not yet tried: the EDU device’s own memory. The insight, recorded in the discover state’s notes, is that the device’s embedded timer sits immediately before its DMA buffer in the same allocation:
QEMUTimer dma_timeris embedded inline inEduStateand sits immediately BEFOREchar dma_buf[], so edu’s OWNdma_timer.cbis backward ofdma_buf. A wrapped negative offset reachesdma_timer.cb. Corruptingdma_timer.cbthen re-arming and firing the timer calls the attacker-chosen gadget asts->cb(ts->opaque).
Instead of hunting for a neighbor with a usable function pointer, the workflow targeted the function pointer inside the overflowed object itself, which carries its own built-in, guest-paced dereference: the timer fire. That collapses the whole escape onto one self-contained, already-public bug. It is the elegant move, and it is the same one Xchg Labs landed on independently.
The arc did not jump straight to victory. The first run reached full control of the host instruction pointer by round eleven but aimed it at an address that was not mapped, so the process faulted on the instruction fetch. That is a meaningful milestone: program-counter control proven across ten of ten ASLR runs, but not yet a working call. A continuation (its task directive) closed the gap, characterizing the call site and selecting a real libc target, system, and by round thirteen the validator’s own execution produced host code execution. The two runs together span thirteen rounds, each agent building and running its own harnesses. My part was the task definition and the approvals, not the exploit engineering.
The bug and the exploit
QEMU’s EDU device exposes its DMA engine through BAR0 registers. The guest writes a source, a destination, and a count, then sets a command bit that arms a timer. When the timer fires, edu_dma_timer() runs the transfer. The bounds check on that path, edu_check_range(), returns on the in-bounds case and merely logs on the out-of-bounds one, with no clamp and no abort (hw/misc/edu.c:106-125, verbatim):
|
|
Its result is void, and both callers ignore it. The transfer then computes a host buffer offset by subtracting the DMA window base (edu_dma_timer, hw/misc/edu.c:149-161, verbatim):
|
|
dst and src are full 64-bit, guest-controlled values. After the log-only check, dst -= DMA_START underflows on uint64_t for any guest offset below DMA_START (0x40000), so edu->dma_buf + dst points before dma_buf for a backward reach and arbitrarily far after it for a forward one. The companion masking function clamps only the guest-side address against the device’s DMA mask. It never touches the host offset. The guest DMA target stays a legal guest physical address while the host pointer goes wild. One bug, read in two DMA directions, gives both an out-of-bounds read (a leak) and an out-of-bounds write.
What makes the timer-callback target reliable is the struct layout, which gdb resolved on the built object:
|
|
dma_timer.cb sits at a fixed -0x20 from dma_buf, and the object’s class pointer at -0xcb8, both inside the same allocation with no page boundary or sanitizer redzone between them. Every address the exploit needs is a fixed offset from dma_buf. It never needs an absolute host address, so it never reads /proc/self/maps or calls process_vm_readv on the exploit path. That is the “no host knowledge” constraint from the task, satisfied by the structure of the bug.
The exploit chain reads three static constants from the target’s glibc once, offline (the &main_arena+96 offset, the system offset, and an arena page offset), then runs entirely from guest-side operations:
- Leak and defeat ASLR. Sweep the out-of-bounds read forward of
dma_bufand pull host heap bytes into guest memory. Free glibc chunks store a pointer to&main_arena+96, so that qword recurs. Tallying canonical leaked values at the right page offset surfaces the arena pointer, which dominated the runner-up by roughly 380 to 1, and from itlibc_base, thensystem. The selector is statistical and maps-free. - Snapshot the device object backward of
dma_buf, preserving its fields. - Plant the command string into the front of the object and write the snapshot back, corrupting the class pointer but leaving the timer callback intact, so the device keeps working.
- Plant
systemontodma_timer.cbwith an eight-byte write atdst = DMA_START - 0x20, which underflows to-0x20and lands exactly on the callback. - Fire. Re-arm the timer and advance virtual time. The main loop calls
cb(opaque), nowsystem(EduState), andsystemreads its command from the bytes planted in step 3.
The workflow chose system over a return-oriented chain for a concrete reason it read off the call site. At the dereference, only rdi is attacker-controlled:
|
|
A return-oriented program needs control of the stack or a pivot, and the dump shows neither: rsp and rbp are the real stack. But system needs only rdi to point at a command string, and the guest fully owns the bytes at EduState, so the direct call is the whole exploit. The timing works out cleanly too. The corrupting fire has already cleared the run bit, and the timer machinery reads cb into a local before calling it, so overwriting cb mid-callback does not recurse. The next fire picks up cb = system.
The workflow does not take the chain on faith. A separate validator state re-runs the harness rather than trusting the discover state’s account, and its gdb capture shows system reached through the hijacked callback, inside QEMU’s own timer machinery:
|
|
And it reproduced across address-space randomization:
|
|
Ten of ten, with the marker file written by the host process each time. The device source was unmodified throughout. The proof-of-concept wrapper asserts a clean git status on the device files before each run, so the only thing driving the host process is guest-reachable register and DMA traffic against stock tutorial code. The point for this post is not that the callback target exists. It is that the workflow found a usable one only after execution rejected the easy ones, and that a separate validator, not the agent that found it, signed off. The workflow’s complete write-up of the finding records the chain, the call-site analysis, and the evidence in full.
The argument, made concrete
This is the part that matters for the larger argument. In my earlier run, a frontier model found a serious bug and then stopped when exploit development crossed its acceptable-use boundary. I had to break the work into a seven-step plan, and the model completed two steps before declining the rest. Here the same workflow ran from analysis through a working host exploit without a refusal, because the model’s policy is mine. For legitimate defensive research, that is the difference between a finding and a stall. A maintainer cannot rank a vulnerability from a static report, and most static findings are false positives; the maximal-impact proof of concept is what separates the real bug from the noise. Producing it is defensive work.
The target makes the point safe to make. EDU is tutorial code, already public, in no production build, so a full proof of concept against it costs no one anything. The capability to carry an exploit end to end now rides on an open-weight model that no vendor can refuse on your behalf, recall by export order, reprice after you depend on it, or silently degrade under a classifier you cannot inspect. A well-resourced adversary already works this way. A defender who depends on a frontier API accepts a policy and availability handicap the adversary does not.
Everything technical above is QEMU’s EDU device: tutorial code, attached only with an explicit -device edu, on no default machine and in no production build, and already published in full by Xchg Labs. There is no CVE and nothing to report upstream. The same investigation also turned up out-of-bounds read and write primitives in devices that do ship in real configurations; those were harder to drive to a full host exploit than the EDU chain, and they stay withheld while coordination runs. I keep this approach throughout this article: show the published, harmless case in full, and say nothing concrete about the rest until it is fixed.
The final result: an open-weight model, inside a trust boundary I control, drove a set of confirmed primitives to a reproducible proof-of-concept exploit with the workflow unchanged and the device source untouched. The artifacts track rounds and verdicts rather than wall-clock time, so the token meter is the best measure of effort: roughly 789 million tokens over five days for the whole QEMU investigation, the figure in the chart below, low enough at open-weight rates to make broad, repeated auditing routine.
The model still matters. It has to be good enough to form real hypotheses and engineer a chain, and GLM 5.2 is. Even so, the model found a working exploit, not the most elegant one: where it recovered the system address the long way, through a statistical sweep of heap pointers, Xchg Labs derived it straight from the code pointer the same overflow already leaked. On this device, a careful human still beat the harness-driven model to the cleaner line. What the workflow adds is long-horizon coherence and a refusal to cut corners, which is what lets a model that is not a restricted frontier model finish the job. GLM 5.1 already found vulnerabilities; GLM 5.2, more capable, finds them better. Both are open-weight, so the model that did this is one you can audit, pin, and keep. The earlier posts on orchestration, open weights, and structural invariants made the case; this run is the demonstration.
IronCurtain is open source. The workflow, the gateway shim, and the evidence gates are in the repository, and the run above used them unchanged.