AHXproject


Title: Finding the blind spot: How Canonical hunts logic flaws with AI
Post by: tim on May 21, 2026, 12:53 AM

The recent unveiling of Anthropic's Claude Mythos preview (https://red.anthropic.com/2026/mythos-preview/) has radically shifted the cybersecurity landscape. We are now in an era where AI can autonomously discover and exploit zero-day vulnerabilities in mature codebases at machine speed. Perhaps the most exciting revelation from the Mythos preview was the demonstration that frontier models can now successfully reason about complex, domain-specific business logic bugs – a class of vulnerabilities historically reserved for human security researchers.

Earlier this year, I began developing an internal AI-powered auditing agent called Redhound to proactively hunt for these exact blind spots. Built on frontier models, Redhound puts that reasoning to work against our own codebases at Canonical.

Redhound has already proven its value, recently uncovering three critical logic vulnerabilities in LXD, our container and virtual machine manager. These bugs had survived years of manual review and static analysis. Redhound found them in under a day of unsupervised analysis.

Below, I break down the mechanics of this adversarial pipeline, the technical details of the three zero-days (now patched and disclosed), and how agentic auditing changes the way we secure infrastructure.

The bugs that fall through every other tool

Static analysis handles pattern-matching problems well: injection sinks, unsafe API calls, and dangerous concatenations. Modern scanners were built to find these problems, and they do that work reliably.

What these scanners cannot do is reason about what is missing: for example, a checklist that names three fields when the data structure has four; or a validation that reads one file while the operation it gates uses a different one. These are not sloppy code errors; they are exploitable gaps in code that reads correctly to a careful reviewer. Because the line that would close the gap does not exist in the source, a tool looking for patterns has nothing to match against.

Dynamic analysis and fuzzing fail for a related reason: they need a runtime signal – a crash, a panic, a sanitizer trip. A request that should have been denied but succeeds looks identical to a legitimate one. There is nothing for the fuzzer to trip on.
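The missing-signal problem can be shown with a toy example. The sketch below is not LXD code: the service, the roles, and the "admin." key prefix are all hypothetical. It contrasts a crash-based oracle (what a fuzzer watches for) with an explicit policy oracle that encodes the intended rule.

```python
# Hypothetical toy service; every name here is illustrative, not taken from LXD.
def update_setting(user_role: str, key: str, value: str, store: dict) -> int:
    """Return an HTTP-style status code.

    The logic bug: nothing denies a non-admin writing an "admin." key.
    """
    store[key] = value
    return 200


def crash_oracle(inputs):
    """What a fuzzer observes: only inputs that raise are interesting."""
    findings = []
    for role, key, value in inputs:
        try:
            update_setting(role, key, value, {})
        except Exception:
            findings.append((role, key, value))
    return findings


def policy_oracle(inputs):
    """Encode the intent: non-admins must never succeed on "admin." keys."""
    findings = []
    for role, key, value in inputs:
        status = update_setting(role, key, value, {})
        if role != "admin" and key.startswith("admin.") and status == 200:
            findings.append((role, key, value))
    return findings


INPUTS = [
    ("admin", "admin.mode", "on"),   # legitimate
    ("guest", "ui.theme", "dark"),   # legitimate
    ("guest", "admin.mode", "on"),   # should be denied, but succeeds
]
```

The crash oracle reports nothing for these inputs, while the policy oracle flags the third request. The catch is that someone has to know the policy well enough to write it down, which is the reasoning step the article is concerned with.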

Manual review and penetration testing catch these bugs, but the work is time-consuming and demands substantial domain expertise. Finding the vulnerabilities by hand means combing through hundreds of thousands of lines that are correct, waiting to notice the one that isn't. Mature codebases survive years of this and still ship logic bugs.

These are the bugs Redhound goes after: the code does exactly what it was written to do, but that does not map to the intent of the security model.

How Redhound works

Redhound audits our codebases the way a determined human attacker would: reading a project end-to-end, generating adversarial hypotheses, dispatching agents to investigate each one, and running a separate round of agents to refute them.

The pipeline runs in five conceptual phases:


Only findings that survive the debunker reach a human reviewer. Redhound then generates a draft report and a runnable proof-of-concept (PoC) exploit to streamline the validation process.
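As a rough illustration of the filtering described above, here is a minimal sketch of a hypothesize, investigate, and debunk loop. It models only the steps named in the text; Redhound's actual phases, interfaces, and prompts are not described here, so every name below is an assumption, and the two stub functions stand in for what would be model-backed agents.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Finding:
    claim: str          # the adversarial hypothesis
    trace: List[str]    # exploitation steps produced by the investigator


# Hypothetical agent interfaces: an investigator returns a trace or None,
# a debunker returns a refutation string or None.
Investigator = Callable[[str], Optional[List[str]]]
Debunker = Callable[[str, List[str]], Optional[str]]


def run_audit(claims: Iterable[str], investigate: Investigator,
              debunk: Debunker) -> List[Finding]:
    """Keep only hypotheses that are substantiated and not refuted."""
    survivors = []
    for claim in claims:
        trace = investigate(claim)
        if not trace:
            continue                      # no exploitation path found
        if debunk(claim, trace) is not None:
            continue                      # a defense was found; drop it
        survivors.append(Finding(claim, trace))
    return survivors


# Stubs standing in for model-backed agents.
def stub_investigate(claim: str) -> Optional[List[str]]:
    if "unchecked" in claim:
        return ["send crafted update", "observe elevated access"]
    return None


def stub_debunk(claim: str, trace: List[str]) -> Optional[str]:
    if "guarded elsewhere" in claim:
        return "a guard exists at a lower layer"
    return None


CLAIMS = [
    "field A is unchecked on update",
    "field B is unchecked on update but guarded elsewhere",
    "parser rejects oversized input",
]
REPORTED = run_audit(CLAIMS, stub_investigate, stub_debunk)
```

Only the first claim reaches REPORTED: the second is refuted by the debunker and the third never produces a trace. Running refutation as a separate pass is what keeps unsupported hypotheses away from the human reviewer.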

Three classes of bug

The three findings are a useful sample because they represent three different classes of logic flaws. All three were assigned a final CVSS 3.1 score of 9.1 during coordinated disclosure.

Vulnerability | CWE / Class | Attacker gains | Why hard to find
Certificate type escalation (CVE-2026-34179) | CWE-915 (mass assignment) | Restricted certificate user to host root | A missing authorization check – no pattern marks what is not there
VM low-level option bypass (CVE-2026-34177) | CWE-184 (incomplete denylist) | Restricted project user to host root | An unlisted key is indistinguishable from an intentionally permitted one
Backup restore desync (CVE-2026-34178) | CWE-20 (improper input validation) | Restricted project user to host root | Two data flows from one input diverge across four files

Each finding below shows what Redhound actually produced: the structured metadata, the title verbatim, and the concrete trace generated by the investigator agent.

Certificate type escalation (CVE-2026-34179)

This flaw resides in the certificate update logic where the system fails to validate the certificate "type". A restricted certificate user can effectively grant themselves Cluster Admin privileges by bypassing type checks during a certificate update.
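To make the bug class concrete, here is a minimal sketch of a mass-assignment gap (CWE-915). It is not LXD's code: the record layout, the field names, and the "client" and "privileged" type values are assumptions chosen for illustration. The flawed guard names three protected fields and omits a fourth; the stricter variant denies any change outside an explicit set of editable fields.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Certificate:
    # Hypothetical record layout, not LXD's actual structure.
    description: str
    name: str
    type: str
    restricted: bool
    projects: tuple


# Flawed guard: lists three protected fields; "type" is simply absent.
GUARDED_FIELDS = ("name", "restricted", "projects")
# Stricter alternative: list what a restricted user MAY change.
EDITABLE_FIELDS = {"description"}


def changed_fields(current: Certificate, requested: Certificate) -> set:
    return {f.name for f in fields(current)
            if getattr(current, f.name) != getattr(requested, f.name)}


def flawed_update(current: Certificate, requested: Certificate) -> Certificate:
    for name in GUARDED_FIELDS:
        if getattr(current, name) != getattr(requested, name):
            raise PermissionError(f"restricted users may not change {name!r}")
    return requested  # every unlisted field is accepted as sent


def strict_update(current: Certificate, requested: Certificate) -> Certificate:
    illegal = changed_fields(current, requested) - EDITABLE_FIELDS
    if illegal:
        raise PermissionError(f"restricted users may not change {sorted(illegal)}")
    return requested


CURRENT = Certificate("ci key", "ci", "client", True, ("dev",))
ESCALATED = replace(CURRENT, type="privileged")
```

flawed_update accepts ESCALATED because nothing in the code mentions the type field, while strict_update rejects it. The allowlist form fails closed when a new field is added to the record later, which is the property the three-field checklist lacks.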

Finding details:


Exploitation trace on LXD 6.7 (https://github.com/canonical/lxd/tree/lxd-6.7) (eight steps, generated by the investigator):


Also produced for this finding: full code-location evidence and a debunker review that found no defense.

VM low-level option bypass (CVE-2026-34177)

This bypass allows for arbitrary QEMU configuration injection by exploiting an incomplete blocklist in restricted projects. In combination with another finding, which identified that raw.apparmor is also not restricted, this allows a restricted user to escape to host root.
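The general shape of an incomplete denylist (CWE-184) can be sketched in a few lines. The "lowlevel.option_*" keys below are placeholders; this post does not say which key was left off the list, and the code makes no claim about it. The "limits.*" keys appear only as examples of harmless settings.

```python
# Placeholder key names, hypothetical by design.
RESTRICTED_DENYLIST = {"lowlevel.option_a", "lowlevel.option_b"}
# Assume the product also understands "lowlevel.option_c", which nobody
# added to the denylist.

SAFE_ALLOWLIST = {"limits.cpu", "limits.memory"}


def denylist_allows(config: dict) -> bool:
    """Permit anything that is not explicitly forbidden (fails open)."""
    return not (config.keys() & RESTRICTED_DENYLIST)


def allowlist_allows(config: dict) -> bool:
    """Permit only what is explicitly known to be safe (fails closed)."""
    return config.keys() <= SAFE_ALLOWLIST


UNLISTED = {"lowlevel.option_c": "arbitrary low-level configuration"}
```

denylist_allows(UNLISTED) returns True and allowlist_allows(UNLISTED) returns False. From the code alone, the unlisted key looks the same as one that was deliberately permitted, which is why this class resists pattern matching.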

Finding details:


Exploitation trace on LXD 6.7 (https://github.com/canonical/lxd/tree/lxd-6.7) (four steps, generated by the investigator):


Also produced for this finding: full code-location evidence and a debunker review that searched for a runtime guard on the unlisted key and found none.

Backup restore desynchronization (CVE-2026-34178)

This vulnerability exploits the discrepancy between how LXD validates a backup index and how it actually imports the internal backup configuration. This desynchronization allows an attacker to sneak forbidden security configurations past the project's restriction checks.
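A validate-one-thing, use-another desynchronization can be sketched as follows. This is a simplified model under stated assumptions, not LXD's restore path: the archive is a plain dictionary holding two documents, and the "index" and "backup" names and the "security.forbidden" key are hypothetical. The reconciled variant shows one possible way to close the gap; the actual fix is not described in this post.

```python
FORBIDDEN_KEYS = {"security.forbidden"}  # hypothetical restricted key


def check_restrictions(config: dict) -> None:
    bad = config.keys() & FORBIDDEN_KEYS
    if bad:
        raise PermissionError(f"project restrictions forbid {sorted(bad)}")


def restore_flawed(archive: dict) -> dict:
    # Validation reads one document...
    check_restrictions(archive["index"]["config"])
    # ...but the import reads another that the attacker also controls.
    return dict(archive["backup"]["config"])


def restore_reconciled(archive: dict) -> dict:
    # Validate exactly the data that will be imported.
    imported = archive["backup"]["config"]
    check_restrictions(imported)
    return dict(imported)


CRAFTED = {
    "index": {"config": {}},                               # looks clean
    "backup": {"config": {"security.forbidden": "true"}},  # what is applied
}
```

restore_flawed(CRAFTED) returns the forbidden setting without complaint, while restore_reconciled(CRAFTED) raises PermissionError. Each function in the flawed path is correct in isolation; the bug exists only in the relationship between the two reads.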

Finding details:


Exploitation trace on LXD 6.7 (https://github.com/canonical/lxd/tree/lxd-6.7) (seven steps, generated by the investigator):


Also produced for this finding: full code-location evidence across four source files and a debunker review that searched for a missing reconciliation step and found none.

None of these findings is exotic. Missing fields in allowlists, short denylists, divergent validation paths – these exist in every mature codebase. The difficulty has always been identifying where to focus across a few hundred thousand lines of code.

What this changes in practice

Redhound does not replace the tools we already run. SAST, fuzzing, dependency scanning, and human review keep doing what they do well, and Redhound feeds into the same review pipeline.

What changes is what each review can reach. Audits begin from an attack-surface map, candidate findings with full exploitation traces, and a record of hypotheses already debunked. Logic bugs that have historically survived years of expert scrutiny become tractable, and reviewer judgment is spent where it matters most: assessing real-world impact, and engineering architectural fixes.

What's next

Internally at Canonical, tools like Redhound are becoming part of how we work every day: not as a one-off audit, but as a recurring practice. Our goal is to incorporate agentic security auditing into our existing processes to elevate the security posture of Canonical's products across the board.

Disclosure

All three findings were disclosed to the LXD team, fixed in coordinated releases, and assigned CVE-2026-34177 (https://ubuntu.com/security/CVE-2026-34177), CVE-2026-34178 (https://ubuntu.com/security/CVE-2026-34178), and CVE-2026-34179 (https://ubuntu.com/security/CVE-2026-34179). Thanks to the LXD team for triaging and patching all three.

AI is accelerating and improving how security engineers find and fix vulnerabilities. A new tool developed and used at Canonical, called Redhound, has already uncovered three critical logic vulnerabilities, paving the way for a more secure software landscape.


Categories: AI, Security, Vulnerabilities
Source: https://ubuntu.com//blog/finding-the-blind-spot-how-canonical-hunts-logic-flaws-with-ai May 15, 2026, 11:53 AM