AI CODE ASSURANCE · PART 1 · 2026-06-26

    Your AI Coding Assistant Cannot See This CVE (and 50,000 more identified vulnerabilities)

    We asked Claude Sonnet 4.6 and Opus 4.8 to security-review a Python Kerberos service. The review missed a 2026 Flask CVE that Gadriel caught from a live OSV feed. The structural reason matters.

    We wrote a realistic Python Kerberos / SPNEGO authentication service, asked Claude Sonnet 4.6 (training cutoff: August 2025) to perform a full security review including dependency CVE checks, then ran Gadriel Code Assurance against the same repository with a live OSV snapshot from 2026-06-27 — 270,336 advisories across 11 ecosystems.

    The result is structural rather than incidental: a large language model working from its training data alone cannot perform reliable Software Composition Analysis (SCA). Its knowledge is frozen at the training cutoff, so CVEs published after that date are invisible to it, no matter how thorough the review looks.

    “gssapi 1.8.2: No published CVEs for the Python bindings known to me.” — Claude Sonnet 4.6, June 2026. It was wrong!

    The scenario

    A minimal enterprise Kerberos service — the kind of thing any developer might scaffold with an AI assistant. Pinned dependencies, intentionally not updated:

    flask==2.3.0      ← released 2023, contains CVEs from 2023, 2024, and 2026
    gssapi==1.8.2     ← Python wrapper around system libgssapi_krb5
    krb5==0.7.1       ← Python wrapper around MIT krb5 C library (MGASA-2026-0233)
    Werkzeug==2.3.7   ← contains CVE-2023-46136 and CVE-2024-34069
    requests==2.31.0

    This is realistic: pinned versions like these sit in every production requirements.txt, accumulating vulnerabilities long after the last review.
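    Any dependency check starts by turning those pins into (name, version) pairs. The following is a minimal sketch, not Gadriel's implementation: the helper name parse_pins and the regular expression are assumptions made for illustration, and the "←" notes are treated as annotations to be ignored.

```python
import re

# The pinned dependencies listed above, annotations included.
REQUIREMENTS = """\
flask==2.3.0      ← released 2023, contains CVEs from 2023, 2024, and 2026
gssapi==1.8.2     ← Python wrapper around system libgssapi_krb5
krb5==0.7.1       ← Python wrapper around MIT krb5 C library (MGASA-2026-0233)
Werkzeug==2.3.7   ← contains CVE-2023-46136 and CVE-2024-34069
requests==2.31.0
"""

# Matches "name==version" at the start of a line; the version stops at
# whitespace, a '#' comment, or the arrow used for the notes above.
PIN_RE = re.compile(r"^\s*([A-Za-z0-9_.\-]+)==([^\s#←]+)")


def parse_pins(text):
    """Return (name, version) pairs for every exact '==' pin in the text."""
    pins = []
    for line in text.splitlines():
        match = PIN_RE.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


PINS = parse_pins(REQUIREMENTS)
```

    This sketch only handles exact "==" pins, which is all the scenario uses; range specifiers, extras, and environment markers would need a real requirements parser.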

    Step 1 — AI security review

    Claude produced a competent review of code-level issues: a session token generated but never stored (auth bypass), missing TLS enforcement, race conditions in credential initialization, an unbounded Authorization header size, and base64 decoding without validation. It also showed partial knowledge of known CVEs, citing CVE-2023-30861 in Flask and CVE-2023-46136 in Werkzeug. In total: more than ten findings, well written, with sensible severities.

    And then, verbatim:

    “krb5 0.7.1: No published CVEs for the Python bindings known to me.”

    Claude’s Flask entry listed only CVE-2023-30861. It had zero awareness of CVE-2026-27205 — a 2026 Flask session cookie-leak variant rated 5.0 CVSS that lives in the OSV PyPI shard right now.

    This is not a criticism of Claude. It is a structural property of any AI with a training cutoff: no model can know about vulnerabilities that did not exist when it was trained.
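    The cutoff argument can be made concrete with the CVE IDs mentioned in this article. The sketch below assumes only that the year embedded in a CVE ID is the year the ID was assigned, so an ID from a year after the cutoff year cannot have existed at training time. The function names are illustrative. IDs from the cutoff year or earlier may still have been published after the cutoff, so the result is a lower bound on what the model cannot know.

```python
import re

CUTOFF_YEAR = 2025  # the review model's stated training cutoff is August 2025

CVE_RE = re.compile(r"^CVE-(\d{4})-\d{4,}$")


def cve_year(cve_id):
    """Extract the assignment year from an ID such as 'CVE-2026-27205'."""
    match = CVE_RE.match(cve_id)
    if match is None:
        raise ValueError(f"not a CVE ID: {cve_id!r}")
    return int(match.group(1))


def certainly_after_cutoff(cve_id, cutoff_year=CUTOFF_YEAR):
    """True when the ID was assigned in a year later than the cutoff year."""
    return cve_year(cve_id) > cutoff_year


# CVE IDs that appear in this article.
ARTICLE_CVES = [
    "CVE-2023-30861",
    "CVE-2023-46136",
    "CVE-2024-34069",
    "CVE-2026-27205",
    "CVE-2026-40355",
]

UNSEEN = [cve for cve in ARTICLE_CVES if certainly_after_cutoff(cve)]
```

    Here UNSEEN contains the two 2026 identifiers, which is exactly the set a model trained in 2025 has no way to recall.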

    Step 2 — Gadriel SCA scan

    $ gadriel code policies --osv
    ✓ OSV shards synced — 270,336 advisories across 11 ecosystems
    
    $ gadriel code scan . --osv-auto-sync=yes
    ┌─────────────────┬──────────────────────┬──────────┬────────┬─────────────────────┐
    │ ID              │ Risk                 │ Severity │ Type   │ Finding             │
    ├─────────────────┼──────────────────────┼──────────┼────────┼─────────────────────┤
    │ CODE-W1-SCA-056 │ critical (9.5)       │ critical │ sca    │ flask@2.3.0         │
    │ CODE-W3-SCA-060 │ critical (9.5)       │ critical │ sca    │ flask@2.3.0         │
    │ CODE-W1-SCA-003 │ high (8.0)           │ high     │ sca    │ Werkzeug@2.3.7      │
    │ CODE-W1-SCA-001 │ medium (5.0)         │ medium   │ sca    │ flask@2.3.0         │
    │ CODE-W1-SCA-002 │ medium (5.0)         │ medium   │ sca    │ Werkzeug@2.3.7      │
    │ CODE-W1-SCA-004 │ medium (5.0)         │ medium   │ sca    │ Werkzeug@2.3.7      │
    │ CODE-W1-L3-040  │ low/unverified       │ critical │ sast   │ src/auth_service.py │
    └─────────────────┴──────────────────────┴──────────┴────────┴─────────────────────┘
    Verdict: PARTIAL — 7.16 / 10.0
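
    A table like this is most useful when it gates a build. The sketch below is one way a CI step might do that, assuming the findings have been exported as records with id, severity, and scan_type fields. The record layout, the target key, and the "high" threshold are assumptions for illustration, not Gadriel's documented behavior. The rows are copied from the table above and use its Severity column.

```python
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}

# Rows from the scan table above (ID, Severity, Type, Finding).
FINDINGS = [
    {"id": "CODE-W1-SCA-056", "severity": "critical", "scan_type": "sca", "target": "flask@2.3.0"},
    {"id": "CODE-W3-SCA-060", "severity": "critical", "scan_type": "sca", "target": "flask@2.3.0"},
    {"id": "CODE-W1-SCA-003", "severity": "high", "scan_type": "sca", "target": "Werkzeug@2.3.7"},
    {"id": "CODE-W1-SCA-001", "severity": "medium", "scan_type": "sca", "target": "flask@2.3.0"},
    {"id": "CODE-W1-SCA-002", "severity": "medium", "scan_type": "sca", "target": "Werkzeug@2.3.7"},
    {"id": "CODE-W1-SCA-004", "severity": "medium", "scan_type": "sca", "target": "Werkzeug@2.3.7"},
    {"id": "CODE-W1-L3-040", "severity": "critical", "scan_type": "sast", "target": "src/auth_service.py"},
]


def blocking(findings, threshold="high"):
    """Return the findings whose severity is at or above the threshold."""
    floor = SEVERITY_RANK[threshold]
    return [f for f in findings if SEVERITY_RANK[f["severity"]] >= floor]


def exit_code(findings, threshold="high"):
    """Return 1 (fail the build) if any finding blocks, otherwise 0."""
    return 1 if blocking(findings, threshold) else 0


BLOCKING_IDS = [f["id"] for f in blocking(FINDINGS)]
```

    With the "high" threshold, four of the seven rows block the build. A stricter or looser policy is a one-word change to the threshold.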

    The finding Claude missed: CVE-2026-27205

    {
      "id": "CODE-W1-SCA-001",
      "severity": "medium",
      "scan_type": "sca",
      "what_was_tested": {
        "title": "Known vulnerability (CVE-2026-27205) in flask 2.3.0",
        "ecosystem": "PyPI",
        "method": "osv_query"
      },
      "failure": {
        "reason": "CVE-2026-27205: Flask session does not add 'Vary: Cookie' header when accessed in some ways (in PyPI flask — 2.3.0)",
        "risk_score": 5.0
      }
    }
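
    The "osv_query" method above refers to looking a package version up in OSV. How Gadriel does this internally is not shown here. As a rough sketch of the idea, the public OSV API accepts a POST to https://api.osv.dev/v1/query with a package name, ecosystem, and version. The helper names below are illustrative, and the network call is kept in its own function so the rest can be exercised offline.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(name, version, ecosystem="PyPI"):
    """Build the JSON body for an OSV version query."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def summarize(response):
    """Reduce an OSV response to (id, aliases) pairs.

    The 'vulns' key may be absent when nothing matches, so default to [].
    """
    return [(v["id"], v.get("aliases", [])) for v in response.get("vulns", [])]


def query_osv(name, version, ecosystem="PyPI", timeout=10):
    """POST the query to OSV and return the decoded JSON response."""
    body = json.dumps(build_query(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

    Calling summarize(query_osv("flask", "2.3.0")) returns whatever OSV lists for that version at the moment of the call. That is the point: the answer changes as advisories are published, while a model's recollection does not.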

    Gadriel also escalated CODE-W1-SCA-056 (CVE-2023-30861) to critical because the affected Flask release was yanked from PyPI, a supply-chain integrity signal beyond the raw CVSS score. It additionally raised CODE-W3-SCA-060, which flags a post-install network call, something Claude’s review did not check at all.

    The ecosystem boundary problem

    Gadriel’s PyPI scan correctly returns no PyPI advisory for krb5==0.7.1, even though the MIT krb5 C library it wraps is exposed to MGASA-2026-0233 (CVE-2026-40355 and CVE-2026-40356). OSV treats the Python wrapper and the C library as separate artifacts in separate ecosystems. The defense is layered scanning: language-level SCA plus OS-level SCA against the container image, both consuming the SBOMs Gadriel emits (SPDX 2.3 and CycloneDX 1.5). Claude can do neither.
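
    To see why the two layers have to be scanned separately, it helps to group SBOM components by the type segment of their package URL (purl). The sketch below uses a hand-written CycloneDX 1.5 fragment, not real Gadriel output. The rpm entry, including the krb5-libs name, its namespace, and its version, is a hypothetical placeholder for whatever OS package ships the C library in the image.

```python
import json
from collections import defaultdict

# Hypothetical CycloneDX 1.5 fragment: two PyPI packages from the scenario
# and one placeholder OS-level package.
SBOM_JSON = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "krb5", "version": "0.7.1",
     "purl": "pkg:pypi/krb5@0.7.1"},
    {"type": "library", "name": "flask", "version": "2.3.0",
     "purl": "pkg:pypi/flask@2.3.0"},
    {"type": "library", "name": "krb5-libs", "version": "0.0.0-example",
     "purl": "pkg:rpm/example/krb5-libs@0.0.0-example"}
  ]
}
"""


def purl_type(purl):
    """Return the type segment of a purl, e.g. 'pypi' for 'pkg:pypi/flask@2.3.0'."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a package URL: {purl!r}")
    return purl[len("pkg:"):].split("/", 1)[0].lower()


def group_by_layer(sbom):
    """Group 'name@version' strings by purl type, skipping components without a purl."""
    layers = defaultdict(list)
    for component in sbom.get("components", []):
        purl = component.get("purl")
        if purl:
            layers[purl_type(purl)].append(f'{component["name"]}@{component["version"]}')
    return dict(layers)


LAYERS = group_by_layer(json.loads(SBOM_JSON))
```

    The Python wrapper krb5@0.7.1 lands in the pypi group and the C library's OS package in the rpm group. An advisory filed against one group will never match a lookup against the other, which is the boundary described above.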

    Claude vs. Gadriel — capability matrix

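    ┌───────────────────────────────┬───────────────────────────┬───────────────────────────┐
    │ Capability                    │ Claude (AI review)        │ Gadriel                   │
    ├───────────────────────────────┼───────────────────────────┼───────────────────────────┤
    │ Code-level SAST               │ Excellent (probabilistic) │ Excellent (deterministic) │
    │ Known CVEs (pre-cutoff)       │ Partial                   │ Complete                  │
    │ CVEs published after Aug 2025 │ Blind                     │ Live OSV feed             │
    │ Supply-chain integrity        │                           │ Yes                       │
    │ SBOM generation               │                           │ SPDX + CycloneDX          │
    │ Runs at every commit          │ Manual                    │ Pre-commit + CI           │
    │ Deterministic / reproducible  │ No                        │ Yes                       │
    └───────────────────────────────┴───────────────────────────┴───────────────────────────┘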

    The takeaway: do not rely on your AI to security-test itself

    Asking your AI coding assistant to review its own output feels safe. It produces a confident, well-formatted report. It cites CVE numbers. It uses the right vocabulary. It looks like security.

    It is not. An AI bounded by a training cutoff cannot see the CVEs published yesterday, last month, or last quarter — the exact window where active exploitation lives. Trusting that review is how serious vulnerabilities ship to production: unpatched dependencies, yanked releases, supply-chain compromises, and post-install network calls the model has never heard of. The exploit does not care that the AI sounded sure.

    Never rely on an AI coding tool to security-test the code it just wrote. The cutoff is the attacker's window.

    Get Gadriel today. It runs natively inside Claude Code, Cursor, Windsurf, Aider, ChatGPT, and Google AI Studio — the same coding surface you already use — and validates every line against a live OSV feed across 11 ecosystems, the full eight pillars, deterministically and reproducibly. Same input, same output. No training cutoff. No guessing.

    Generated against Gadriel Code Assurance v1.1.3 · OSV snapshot 270,336 advisories · 11 ecosystems · synced 2026-06-27.