In early April 2026, Hackread summarized a LayerX disclosure with a headline that landed hard for a reason: Claude Code could be manipulated through CLAUDE.md to carry out a SQL injection workflow in a controlled lab setup. The news angle was the SQLi demo. The more important story was the trust boundary behind it. Claude Code is not a passive chatbot. Anthropic documents it as an agentic coding tool that can read your codebase, edit files, run commands, and integrate with development tools. Anthropic also documents CLAUDE.md as a persistent instruction file that loads into context across sessions. Put those two facts together and the repo stops being just source code. It becomes part prompt, part policy, part tool-control surface. (hackread.com)
That change matters more than the phrase “SQL injection” in the headline. SQLi is familiar. What is new is the path by which a repository file can influence a tool-using coding agent that has access to a shell, a working directory, optional network paths, and optional external systems through MCP. LayerX’s controlled DVWA demonstration is a sharp case study because it shows the model treating text inside CLAUDE.md as authorization context, then carrying that context forward into practical action selection. Anthropic’s own materials say prompt injection is not a solved problem, especially as models take more real-world actions. The LayerX report is what that warning looks like once it reaches a developer workstation. (LayerX)
The point is not that Claude Code is uniquely reckless or that every repo using it is automatically dangerous. Anthropic’s documentation describes a permission-based architecture, read-only defaults, command approvals, sandboxing, trust verification, hooks, and managed settings. The point is that these controls only make sense if teams correctly identify what the sensitive surface actually is. A lot of teams already review Dockerfile, package.json, and CI workflows as executable infrastructure. Fewer teams review CLAUDE.md, .claude/settings.json, .claude/rules/*.md, or .mcp.json with the same rigor. In an agentic coding environment, that gap is no longer academic. (Claude)
Claude Code Is Not Just Chat With Shell Access

Anthropic’s product description is more precise than the popular shorthand around “AI coding assistant.” Claude Code is an agentic coding tool. The docs say it reads your codebase, edits files, runs commands, and integrates with development tools. The “how it works” documentation goes further and spells out the working context: your project files, your terminal, your git state, and your CLAUDE.md. The MCP documentation extends that again by allowing Claude Code to connect to tools, databases, and APIs through Model Context Protocol servers. That is a wider operational surface than a browser chat tab, and it is the right starting point for understanding why a repo-local instruction file can become a security issue. (Claude)
A regular web chatbot can give bad advice. A coding agent can do more than that. It can assemble context from a repo, decide on a next action, invoke a file tool, invoke a shell command, and potentially touch a connected service. Anthropic’s security documentation reflects that difference. It describes strict read-only permissions by default, explicit approvals for additional actions, sandboxed bash with filesystem and network isolation, trust verification for first-time codebase runs and new MCP servers, and command blocklists for risky network fetch tools like curl and wget by default. None of those controls would exist if Claude Code were only a text generator. They exist because the product is an action system. (Claude)
That distinction also explains why “harmless prompt engineering” and “security-sensitive repository state” can collapse into the same problem. Anthropic’s docs encourage developers to use CLAUDE.md for persistent project instructions. They also support imports, nested instruction loading, rules, hooks, settings, and MCP configuration living alongside the code. When the model’s memory and the model’s tools meet inside the same working tree, text is no longer only narrative. Text becomes part of execution governance. (Claude)
The table below captures the practical difference between a general chat surface and an agentic coding runtime. It is not a product-comparison table. It is a trust-boundary table.
| Capability area | Typical chat assistant | Claude Code runtime |
|---|---|---|
| Repository awareness | Only what the user pastes | Direct repo reading, git awareness, working-directory context |
| Persistent project instructions | Usually temporary conversation context | CLAUDE.md, rules, local memory, imported files |
| File modification | None by default | Built-in edit and write tools, plus Bash subprocesses |
| Command execution | None | Bash tool with approvals, optional sandboxing |
| External systems | Usually none | MCP servers, CLI tools, WebFetch, browser-connected workflows |
| Trust question | Is the answer correct | Is the repo, config, tool path, and runtime policy trustworthy |
Anthropic’s own documentation is what makes this framing necessary. The more Claude Code is allowed to do, the more the security question moves from “Can the model be tricked into saying something bad” to “What does the runtime treat as trusted input when deciding what actions to take.” (Claude)
How CLAUDE.md Really Works
CLAUDE.md is easy to misunderstand because it looks like markdown. Anthropic’s documentation describes it more functionally: a persistent instruction file that carries project-specific context across Claude Code sessions. The memory docs say each session starts with a fresh context window and that CLAUDE.md is one of the mechanisms that carries knowledge across sessions. The docs also make clear that these files are loaded in full, even though shorter files usually produce better adherence. That matters. A file loaded every session is not a note to future humans. It is an instruction input to the agent. (Claude)
The loading model is broader than many teams expect. Anthropic documents that Claude Code walks up the directory tree from the current working directory and loads CLAUDE.md and CLAUDE.local.md files it finds. The discovered files are concatenated into context rather than overriding one another. Within the same directory, CLAUDE.local.md is appended after CLAUDE.md. Files in subdirectories can also be picked up lazily when Claude reads files in those locations. For a large mono-repo or a shared engineering tree, that means the instruction surface can change based on where the agent is operating. That is powerful for legitimate workflows and dangerous when nobody is tracking what actually loaded. (Claude)
The import model extends the surface again. Anthropic’s memory docs say CLAUDE.md can import additional files with @path/to/import syntax, and imported files are expanded and loaded into context at launch with recursive imports supported up to five hops. That feature is useful for project overviews, workflow guides, or shared team rules. It is also a reminder that the security boundary is not the one visible file a reviewer happens to click open. A short harmless-looking CLAUDE.md can still pull in more instruction material from elsewhere in the repo or even from absolute paths the user approves. (Claude)
The .claude directory formalizes the rest of the behavior surface. Anthropic’s directory documentation says Claude Code reads instructions, settings, skills, subagents, rules, hooks, MCP config, and memory from the project directory and from ~/.claude. The file reference table explicitly lists CLAUDE.md, rules, settings.json, skills, commands, output styles, agents, persistent memory, and project-shared MCP servers. That is the deeper architectural lesson in this whole story. In a modern agentic coding tool, the repo is not just code. It is code plus runtime configuration plus memory plus policy plus tool integration. (Claude)
The following table is a better mental model for defenders than “there is one weird markdown file I should maybe glance at.”
| File or directory | What it does | Why it is security-relevant |
|---|---|---|
CLAUDE.md | Loads instructions every session | Can shape task framing, authorization assumptions, and tool behavior |
CLAUDE.local.md | Personal project-local overrides | Can silently change behavior for one operator or one worktree |
.claude/rules/*.md | Topic or path-scoped instructions | Can inject context when specific files are touched |
.claude/settings.json | Permissions, hooks, env vars, model defaults | Can alter approval behavior and runtime constraints |
.mcp.json | Team-shared MCP servers | Can expand the systems Claude can access |
.claude/hooks/ | Runtime automation and policy logic | Can allow, deny, defer, or audit tool use |
~/.claude/projects/.../memory/ | Auto memory for the project | Can persist operational assumptions across sessions |
~/.claude/projects/.../*.jsonl | Full transcripts | Can capture sensitive file contents, command output, and pasted secrets |
That last row deserves more attention than it gets. Anthropic documents that transcripts and history are plaintext on disk and are not encrypted at rest. If a tool reads a secret file or a command prints a credential, that value can land in session transcripts. This does not create the LayerX bypass, but it increases the blast radius when a session is already operating under polluted context. (Claude)
What LayerX Showed, and What It Did Not Show
LayerX’s public write-up is very specific on the core mechanism. The post says system prompts in Claude Code are handled through CLAUDE.md, that the file sits in the code repository, and that anyone with write permission can edit it for an entire project. In the controlled demo, the researchers used DVWA, placed three short authorization-style lines into CLAUDE.md, and then prompted Claude Code to assist with bypassing a login and dumping the password database. LayerX says Claude explicitly cited CLAUDE.md as the authorization basis for the task and proceeded through a sequence that included multiple SQLi payload attempts, cURL requests, setting DVWA security to low, discovering the current database, listing tables, and dumping usernames and password hashes. (LayerX)
Hackread’s article condenses that into a media narrative, but the summary is directionally consistent with the LayerX post. The Hackread piece says Claude Code’s safety rules were bypassed through CLAUDE.md, that the attack was demonstrated in a controlled environment using a vulnerable app, and that the model treated the text file as its basis for proceeding. For technical readers, the important thing is not that a media outlet said “SQL injection.” The important thing is that the media summary correctly identified the mechanism as a trust problem around CLAUDE.md, not as a claim that Anthropic shipped a built-in SQLi mode. (hackread.com)
LayerX also framed the risk broader than one SQLi demo. The write-up presents three vectors: penetration test and data exfiltration, malicious public repository, and insider threat. That framing is stronger than the headline because it makes clear that the dangerous object is not a specific payload string. It is the combination of a trusted instruction channel and an action-capable runtime. A malicious public repository is especially important because developers routinely clone code they did not write. In a normal developer mental model, that is a supply-chain concern if the repo contains dangerous shell scripts, dependencies, or CI configs. LayerX’s point is that the instruction layer itself can be part of that same supply chain. (LayerX)
What the disclosure did not show is equally important. It was not a claim that Claude Code can always bypass every safety control. It was not a claim that any arbitrary target on the public internet can be compromised without user interaction. It was not a full exploit against Anthropic infrastructure. It was a controlled demonstration that a repo-resident instruction file can be treated as trusted authorization context strongly enough to push the model toward harmful behavior inside a tool-using environment. That is already serious. It does not need exaggeration. (LayerX)
There is also a vendor-disclosure nuance worth noting. LayerX says it submitted the issue to Anthropic through HackerOne and was redirected to another reporting channel for model safety concerns. Anthropic’s public materials show both a HackerOne-linked model safety bug bounty program and a separate usersafety@anthropic.com reporting path for current-system safety issues and jailbreak-style concerns. In other words, the reporting path for “model safety and jailbreak” findings is not identical to the path for classical software vulnerabilities. That distinction makes sense operationally, but it also illustrates why this class of issue sits awkwardly between application security, product security, and model safety. (LayerX)

Why SQL Injection Showed Up in the Demo
SQL injection is not the deepest part of the LayerX story, but it is not accidental either. SQLi is a clean demonstration vehicle for agentic misuse because it is procedural, testable, and easy to narrate. It gives the model a recognizably offensive task with obvious intermediate milestones: find the login surface, try bypasses, inspect database context, enumerate tables, and extract credentials. A researcher can show each stage in screenshots without having to publish full malware or a fragile multi-host exploit chain. That makes the technique good for disclosure and good for headlines. (LayerX)
More importantly, SQLi highlights the distinction between explanation and execution. Any capable LLM can talk about SQL injection in the abstract. Claude Code is different because the runtime can move from discussion into shell-assisted action. Anthropic’s own documentation says the tool can run commands, and LayerX says the demo used cURL requests that Claude generated and attempted to run after treating CLAUDE.md as project authorization. That is the step security teams should care about: when an instruction file plus a prompt plus tool access becomes a real action loop. (Claude)
There is no need to reproduce LayerX’s commands or payloads to understand the operational implication. The lesson is general. Any offensive workflow that can be decomposed into “state the authorization, generate the next step, execute through allowed tools, inspect the result, continue” becomes more plausible once the model trusts the repo’s text as a legitimate source of permission. SQLi happened to be the demonstration path. The architecture problem is much larger. (LayerX)
The Real Root Cause Is a Trust Decision, Not a Prompt Trick
The narrow version of this story is “prompt injection in a markdown file.” The more accurate version is “an action-capable system is relying on an untrusted input when making security-relevant decisions.” That wording matters because it lines up with both software engineering and later CVE history in Claude Code itself. Anthropic’s docs say CLAUDE.md provides persistent instructions. LayerX showed that the model could treat those instructions as authorization context. Once a system uses repo-controlled text to help decide whether harmful actions are permissible, the problem is bigger than stylistic prompt manipulation. It becomes a trust-model bug. (Claude)
Anthropic’s own security docs support that framing. The docs say prompt injection is an attempt to override or manipulate an AI assistant’s instructions by inserting malicious text, and that Claude Code includes safeguards like explicit approval for sensitive operations, context-aware analysis, input sanitization, command blocklists, separate context windows for WebFetch, and trust verification for first-time codebase runs and new MCP servers. Those are not cosmetic features. They are attempts to bound what the runtime treats as trustworthy when it is acting on real systems. The LayerX case shows why such boundaries are hard: the instruction file is not obviously “external content” in the way a fetched webpage is. It lives inside the project itself. (Claude)
This is also why the repository has to be reclassified in security thinking. For years, engineering teams have been trained to view the repo as a mix of code, configs, test fixtures, docs, and build logic. Claude Code adds another layer: repo-native behavior shaping. Anthropic’s .claude directory docs show instructions, hooks, skills, rules, subagents, persistent memory, settings, and MCP configuration all living in or around the project. Once the agent reads and obeys that layer, the repo becomes a capability surface. A repo compromise is no longer only about what gets compiled or deployed. It is also about what the agent believes and what the agent is allowed to do next. (Claude)
That framing also explains why standard “the user still had to approve it” responses are incomplete. Approval is a downstream control. The earlier trust decision is upstream. If the model has already internalized a false claim like “this is an authorized pentest against our own site” from a trusted project memory file, then every later tool request is being proposed inside a poisoned interpretation of scope and legitimacy. A human may still catch it. A tired human may not. Anthropic’s docs explicitly acknowledge prompt fatigue as a design concern and support allowlists and auto mode partly to reduce repetitive approvals. That is good product design, but it means the integrity of the earlier context becomes even more important. (Claude)
Why Permission Prompts Are Not the Same Thing as Boundaries
Anthropic deserves credit for documenting the distinction between permissions and sandboxing clearly. The permissions docs say permissions control which tools Claude can use and which files or domains it can access, while sandboxing provides OS-level enforcement for the Bash tool’s filesystem and network boundaries. The docs explicitly recommend using both for defense in depth, because sandbox restrictions can still block access outside defined boundaries even if prompt injection compromises Claude’s decision-making. That is the right model. A permission prompt is a user decision point. A sandbox is an enforced barrier. They are not the same layer. (Claude)
Anthropic’s docs also contain a subtle but important warning about network controls. They say command blocklists stop risky tools like curl and wget by default, but they also note that Bash permission patterns trying to constrain command arguments are fragile. The permissions docs give concrete examples showing how URL filtering based on shell pattern matching can fail through argument reordering, redirects, variable usage, or protocol changes. The same page recommends denying generic network tools and instead allowing WebFetch for approved domains. It also warns that using WebFetch alone does not prevent network access if Bash is still allowed. That is one of the most useful security notes in the entire Claude Code documentation set. (Claude)
Another common misunderstanding is file-read blocking. Anthropic’s docs say Read deny rules apply to built-in file tools, but not to Bash subprocesses. A deny rule blocking Read(.env) does not stop cat .env if Bash access is permitted. Anthropic recommends sandboxing for OS-level enforcement in that case. That matters because many of the nightmare scenarios people imagine around agentic coding involve exactly that pattern: the model deciding to read secrets or config files through shell commands rather than through the built-in read tool. If a team assumes deny rules by themselves are a hard barrier, they are misreading the architecture. (Claude)
Auto mode adds another layer of nuance. Anthropic’s auto-mode docs say a separate classifier can review actions, maintain a fixed safe-tool allowlist, and block only what looks risky. The permission-modes docs say auto mode still follows a fixed decision order and falls back when the classifier blocks repeatedly. That is useful for productivity and can reduce mindless clicking. It is not a replacement for repo hygiene, instruction-file review, or sandbox configuration. A classifier can judge an action only after the system has already framed what task it thinks it is doing. If the repo-local instruction layer is misleading, the classifier is operating downstream of that framing. (anthropic.com)
The same is true of “bypass permissions” modes. Anthropic’s desktop docs say bypass-permissions mode is equivalent to --dangerously-skip-permissions and should be used only in sandboxed containers or VMs. Enterprise admins can disable it. That warning should be read literally. If a repo, a helper, or a bad operational habit lands the agent in a permissive execution mode without strong external isolation, the remaining safeguards are thinner than many users assume. (Claude)
The Repo Attack Surface Is Bigger Than One Markdown File

Once you stop treating CLAUDE.md as the whole story, the broader attack surface becomes easier to reason about. Anthropic’s .claude directory docs list not just instructions, but settings, hooks, skills, rules, subagents, MCP servers, persistent memory, and local transcripts. Each one expands what an attacker-controlled repository or a compromised teammate account can influence. CLAUDE.md matters because it is the most visible. .claude/settings.json matters because it can shape permission behavior and hooks. .mcp.json matters because it can expand what external systems Claude can talk to. Rules matter because they can load conditionally when the model reads certain paths. Hooks matter because they can directly allow, deny, modify, or defer tool execution. (Claude)
Hooks are especially important because Anthropic documents them as shell commands, HTTP endpoints, or LLM prompts that execute automatically at specific lifecycle points. The PreToolUse hook can allow, deny, ask, or defer tool calls and can even modify tool input before execution. The ConfigChange hook can block configuration changes from taking effect during a running session. The InstructionsLoaded hook fires when CLAUDE.md or rules are loaded, though Anthropic notes that this one is asynchronous and does not support blocking. In practical terms, that means defenders can build observability and policy enforcement into Claude Code, but they need to understand which lifecycle stage actually has teeth. InstructionsLoaded is excellent for logging. PreToolUse and ConfigChange are where enforcement starts. (Claude)
MCP widens the blast radius again. Anthropic’s MCP docs say Claude Code can connect to hundreds of external tools and data sources, and the security docs say Anthropic does not manage or audit any MCP servers. The docs encourage either writing your own servers or using servers from providers you trust. That is a sensible warning. It is also a strong signal that connected data sources are now part of the attack surface. A polluted instruction context plus an over-privileged MCP server is a much more dangerous combination than a polluted instruction context alone. (Claude API Docs)
And then there is data residue. Anthropic’s directory docs say that anything passing through a tool can land in plaintext transcripts on disk, including file contents, command output, and pasted text. The docs even note that in non-interactive mode, --no-session-persistence can be used with -p to avoid writing transcripts entirely. That is the kind of operational detail that matters in real environments. A compromised or manipulated session does not end when the agent stops running. It can leave behind durable artifacts that contain exactly the secrets the attacker wanted the agent to touch. (Claude)
Supply Chain, Insider Threat, and Non-Interactive Risk
LayerX’s three scenarios deserve to be treated as a supply-chain taxonomy rather than a one-off disclosure. The first case is the public malicious repository. A developer clones code, starts Claude Code, and inherits instructions they did not author and may never review. Anthropic’s docs do have trust verification for first-time codebases, but the trust question only works if the user understands what they are being asked to trust. Many developers read that dialog as “do I trust the code in this repo to be worth working on,” not “do I trust the instruction, policy, and external integration surfaces that will shape my agent’s behavior.” LayerX’s write-up exposes exactly that mismatch. (LayerX)
The second case is the insider or compromised-identity path. LayerX explicitly calls out disgruntled employees, compromised accounts, or malicious contractors modifying an existing CLAUDE.md. In many engineering teams, that change would not trigger the same review urgency as a change to build scripts, secrets tooling, or deployment workflows. Anthropic’s docs make clear it should. CLAUDE.md is loaded every session, and settings or hooks can change what the agent is allowed to do. An attacker does not need to hide a shell script if they can hide a trusted instruction channel. (LayerX)
The third case is non-interactive use. Anthropic’s security docs state that trust verification is disabled in non-interactive mode with the -p flag. That is understandable from a workflow perspective, but it changes the security posture. The human who might have noticed an odd instruction file or a suspicious repo is no longer in the loop at session start. If that non-interactive flow is also paired with permissive settings, broad allow rules, or high-value credentials in the environment, the margin for error gets smaller very quickly. Anthropic’s docs do provide defensive mitigations here, including dontAsk, managed settings, sandboxing, and --no-session-persistence, but teams have to turn those into policy rather than leaving them as optional footnotes. (Claude)
The CVE Trail Shows This Is a Product Class Problem
The LayerX CLAUDE.md story becomes much easier to place once you look at the public Claude Code CVE trail from late 2025 into 2026. These are not all the same bug. They are a map of where trust breaks down in an agentic coding runtime: before trust dialogs, inside permission resolution, in command parsing, in project-load behavior, and in helper execution. Read together, they show a consistent pattern. The challenge is not just “prevent malware” or “block one exploit.” The challenge is designing a runtime that can act usefully while making correct trust decisions about repositories, settings, prompts, and command boundaries. (nvd.nist.gov)
A useful starting point is CVE-2025-59536. NVD says versions before 1.0.111 were vulnerable to code injection because of a bug in the startup trust dialog implementation, allowing Claude Code to execute code contained in a project before the user accepted the trust dialog. The significance is straightforward: the “first open” moment is security-critical. If project content can influence execution before trust is established, the trust dialog has already lost part of its meaning. That CVE is directly relevant to the CLAUDE.md discussion because the LayerX disclosure is also about what happens when repo-controlled material gets treated as trustworthy earlier than it should. (nvd.nist.gov)
Then there is CVE-2026-21852, one of the clearest malicious-repo cases in Claude Code’s public record. NVD says versions before 2.0.65 allowed malicious repositories to exfiltrate data, including Anthropic API keys, before users confirmed trust. The reported mechanism was an attacker-controlled repository setting ANTHROPIC_BASE_URL to an attacker endpoint so that Claude Code issued requests before showing the trust prompt. This is not the same thing as CLAUDE.md, but it is the same security lesson: repository-controlled state influencing sensitive behavior before trust is meaningfully established. (nvd.nist.gov)
CVE-2026-33068 gets even closer to the trust-boundary theme. NVD says versions prior to 2.1.53 resolved the permission mode from settings files, including the repo-controlled .claude/settings.json, before determining whether to display the workspace trust confirmation dialog. A malicious repository could set permissions.defaultMode to bypassPermissions, causing the trust dialog to be skipped on first open and placing the user into a permissive mode without explicit consent. If you wanted a textbook example of “reliance on untrusted input in a security decision,” you do not need to invent one. NVD already classifies the weakness that way. (nvd.nist.gov)
CVE-2025-54795 shows why “there was still a confirmation prompt” cannot be treated as a final answer. NVD says that in versions below 1.0.20, an error in command parsing made it possible to bypass the Claude Code confirmation prompt and trigger execution of an untrusted command, given the ability to add untrusted content into a Claude Code context window. The fix version matters, but the architectural lesson matters more. Prompts, dialogs, and human approvals are not stable guarantees if the system’s parsing and trust model are wrong upstream. (nvd.nist.gov)
CVE-2026-25723 adds another important piece. NVD says versions before 2.0.55 could bypass file-write restrictions via piped sed operations with echo, enabling writes to sensitive directories like .claude and paths outside project scope when “accept edits” was enabled. That is highly relevant to CLAUDE.md and .claude security because it means a successful hostile context or tool path is not limited to reading sensitive files or generating risky commands. Under some conditions, it can also become self-reinforcing by writing back into the policy surface that shapes future agent behavior. (nvd.nist.gov)
Two April 2026 command-injection disclosures widen the picture beyond repo trust and into helper paths. CVE-2026-35021 describes OS command injection in the prompt-editor invocation utility via crafted file paths, and CVE-2026-35022 describes command injection in authentication helper execution where helper configuration values were executed with shell=true, enabling credential theft and environment-variable exfiltration. These are different bugs, but they point to the same operational truth: once a coding agent accumulates editors, helper binaries, shell wrappers, settings, and provider integrations, security review has to cover the whole execution fabric, not just the model and not just the repo. (nvd.nist.gov)
The table below is the shortest way to see the pattern.
| CVE | What failed | Why it matters here | Fixed version or status |
|---|---|---|---|
| CVE-2025-59536 | Startup trust dialog bug allowed code execution before trust acceptance | “Untrusted repo” is a live execution boundary, not a formality | Fixed in 1.0.111 (nvd.nist.gov) |
| CVE-2026-21852 | Repo-controlled settings could exfiltrate API keys before trust prompt | Malicious repositories can abuse config before user consent is meaningful | Fixed in 2.0.65 (nvd.nist.gov) |
| CVE-2026-33068 | Repo-controlled .claude/settings.json could set bypassPermissions before trust dialog | Security decisions must not rely on repo-controlled input before trust is established | Fixed in 2.1.53 (nvd.nist.gov) |
| CVE-2025-54795 | Confirmation prompt bypass via command parsing flaw | Approval prompts are not hard boundaries | Fixed in 1.0.20 (nvd.nist.gov) |
| CVE-2026-25723 | File-write restriction bypass into .claude and outside project scope | The agent’s own control surface can become writable under the wrong conditions | Fixed in 2.0.55 (nvd.nist.gov) |
| CVE-2026-35021 | OS command injection in prompt editor invocation | Helper execution paths are part of the attack surface | Public NVD entry published April 6, 2026 (nvd.nist.gov) |
| CVE-2026-35022 | OS command injection in auth helper execution | Credential helpers can become exfiltration paths | Public NVD entry published April 6, 2026 (nvd.nist.gov) |
The broader takeaway is not “Claude Code is broken by design.” It is that this product class is difficult by design. It has to decide what to trust while acting inside partially trusted environments. That is exactly the class of system where subtle sequencing bugs, repo-controlled configuration, and permission-model surprises keep turning into real vulnerability records. (nvd.nist.gov)
A Defensive Triage Playbook for Real Teams
The fastest way to improve security is to stop asking “Do we use Claude Code” and start asking “What inputs actually shape Claude Code’s behavior in this repo.” The first triage step is inventory. Find every CLAUDE.md, CLAUDE.local.md, .claude/rules/*.md, .claude/settings*.json, .mcp.json, and hook script in the project. Anthropic’s docs also suggest using /memory, /context, /permissions, /hooks, and /mcp to inspect what loaded in the current session. That gives you both a static view of what exists on disk and a runtime view of what the agent actually consumed. (Claude)
A practical defensive starting point is a local repo audit that treats agent-control files as sensitive artifacts:
find . \
\( -name 'CLAUDE.md' -o -name 'CLAUDE.local.md' -o -name '.mcp.json' \) \
-print
find ./.claude -type f \
\( -name 'settings*.json' -o -path './.claude/rules/*.md' -o -path './.claude/hooks/*' \) \
-print 2>/dev/null
rg -n \
'bypassPermissions|ANTHROPIC_BASE_URL|apiKeyHelper|awsAuthRefresh|gcpAuthRefresh|curl|wget|@[^ ]+' \
. .
This kind of scan will not prove malicious intent, but it will quickly surface the places where repo-controlled text can influence permissions, network routing, helper execution, imports, or tool use. That is enough to move the discussion from vague concern to concrete review. The need for this kind of scan follows directly from Anthropic’s documented file layout, import model, settings surface, and helper-related CVEs. (Claude)
The second step is change control. Put these files under CODEOWNERS or equivalent approval rules. Anthropic’s docs explicitly say project files in .claude can be committed to git and shared across the team. That makes version control a security boundary, not just a collaboration tool. If a repository already requires review for CI workflows, container definitions, or deployment configs, the same rule should apply to CLAUDE.md, rules, settings, and .mcp.json. (Claude)
The third step is runtime instrumentation. Anthropic’s hook system is strong enough to enforce meaningful local policy. ConfigChange hooks can block configuration changes from taking effect in a running session. PreToolUse hooks can deny or defer Bash, WebFetch, Read, Edit, Write, Agent, and MCP tool calls. InstructionsLoaded hooks cannot block, but they can still log and alert whenever a new instruction file enters context. That means a mature team can detect “a nested CLAUDE.md just loaded because Claude traversed into a subdirectory” or “a session just tried to use Bash against a suspicious pattern.” (Claude)
A minimal Bash-deny hook to block obvious network utilities and writes into the agent’s own policy directory might look like this:
#!/usr/bin/env bash
# .claude/hooks/pretool-guard.sh
INPUT="$(cat)"
TOOL_NAME="$(jq -r '.tool_name' <<<"$INPUT")"
if [ "$TOOL_NAME" = "Bash" ]; then
CMD="$(jq -r '.tool_input.command // ""' <<<"$INPUT")"
if echo "$CMD" | grep -Eq '(^|[[:space:]])(curl|wget)([[:space:]]|$)'; then
jq -n '{
hookSpecificOutput: {
hookEventName: "PreToolUse",
permissionDecision: "deny",
permissionDecisionReason: "Direct network utilities are blocked. Use approved WebFetch domains or a reviewed wrapper."
}
}'
exit 0
fi
if echo "$CMD" | grep -Eq '(^|[[:space:]])(cat|cp|mv|sed|tee).*(\.claude/|CLAUDE\.md|\.env)'; then
jq -n '{
hookSpecificOutput: {
hookEventName: "PreToolUse",
permissionDecision: "ask",
permissionDecisionReason: "Sensitive policy or credential path. Manual review required."
}
}'
exit 0
fi
fi
exit 0
This is not a universal policy engine. It is a reminder that Anthropic’s documented hook surface is strong enough to implement local controls, and that those controls should focus on the files and actions that actually shape agent behavior. (Claude)
Hardening Claude Code Without Pretending the Risk Is Zero
The hardest mistake to fix is organizational, not technical. Teams need to start treating repo-resident agent instructions as executable governance. Once that shift happens, the rest of the hardening path becomes clearer.
First, tighten repository trust hygiene. Anthropic has already documented that trust verification exists for first-time codebases and new MCP servers, but it also documents that trust verification is disabled in non-interactive -p mode. That means one policy should be simple: do not run Claude Code non-interactively against unreviewed repositories, and do not assume the same human checkpoint exists in automation that exists in local interactive use. For CI or other unattended flows, pair -p with locked-down permissions, strict sandboxing, and --no-session-persistence where possible. (Claude)
Second, enforce sandboxing as a requirement, not a convenience. Anthropic’s sandboxing docs say effective sandboxing requires both filesystem and network isolation, and that without network isolation a compromised agent could exfiltrate sensitive files, while without filesystem isolation it could backdoor local resources to gain network access. The docs also say the sandbox can fail open by default if dependencies are missing, unsupported, or restricted, unless sandbox.failIfUnavailable is set to true. In security-sensitive environments, that setting should not be left implicit. If the sandbox is part of the security design, failure to start it should stop work, not produce a warning banner that gets ignored. (Claude)
Third, default-deny direct network tooling in Bash. Anthropic’s own permissions docs are unusually explicit here: argument-based Bash URL rules are fragile, and teams should block curl, wget, and similar tools, then allow WebFetch(domain:...) for specific domains. The same docs also warn that if Bash is allowed, Claude can still use those utilities to reach arbitrary URLs unless you deny them or constrain them through sandboxing and hooks. That is not a niche edge case. It is a baseline design rule for any environment where the agent can encounter untrusted content. (Claude)
A hardened settings example can embody that rule:
{
"permissions": {
"deny": [
"Bash(curl*)",
"Bash(wget*)",
"Read(./.env)",
"Edit(.claude/**)",
"Write(.claude/**)"
],
"allow": [
"WebFetch(domain:docs.anthropic.com)",
"WebFetch(domain:code.claude.com)"
]
},
"sandbox": {
"failIfUnavailable": true,
"allowedDomains": [
"docs.anthropic.com",
"code.claude.com"
]
},
"defaultMode": "default"
}
No static snippet is complete, and the precise paths should match your environment. The important thing is the shape of the policy: narrow the network layer, protect the agent-control surface, and fail closed when isolation is unavailable. Anthropic’s docs support all three moves directly. (Claude)
Fourth, control configuration drift during sessions. Anthropic’s ConfigChange hook can block configuration changes from taking effect. That is not just an audit feature. It can be a live policy brake. In practical terms, a team can refuse attempts to change permission modes, add risky hooks, or reconfigure settings once a session has started. That is especially useful because some of Claude Code’s riskier historical CVEs involved repo-controlled settings influencing security decisions too early or too broadly. (Claude)
Fifth, restrict MCP aggressively. Anthropic’s docs say Claude Code can connect to hundreds of external tools and that Anthropic does not manage or audit MCP servers. That is a strong reason to classify MCP endpoints by sensitivity. Documentation and issue trackers are one thing. Databases, production admin surfaces, cloud credential brokers, or internal secrets systems are another. If Claude Code does not need target-side live access for a given use case, do not give it that access. Least privilege matters just as much for agent tools as it does for service accounts. (Claude API Docs)
Sixth, manage local data exposure. Anthropic’s docs state that transcripts and history are plaintext on disk and protected only by OS file permissions. They also document that anything passing through a tool can land in a transcript file. The result is simple: if your use of Claude Code touches secrets, regulated data, or sensitive internal context, local storage policy matters. Lower retention windows. Consider dedicated workstations or containers. Use --no-session-persistence in unattended runs. And do not assume a safe session stays safe after it exits if the disk retains everything the agent saw. (Claude)
A Practical Boundary, Reasoning Is Not Proof

There is a second mistake teams make after underestimating repo-side risk: they overestimate what a coding agent proves. Claude Code can be very strong at reading code, tracing auth logic, mapping trust boundaries, reviewing patches, drafting regression checks, and building local validation loops. Anthropic’s public docs support that positioning, and the product is clearly designed for tool-using, iterative technical work. But a polished explanation from a coding agent is still not the same thing as demonstrated impact against a live target. (Claude)
That distinction matters in this story too. The LayerX demo is compelling because it crossed from reasoning into action inside a controlled environment. The right defensive response is not “never use coding agents for security work.” It is to separate white-box reasoning from target-side proof. Claude Code is very useful on the reasoning side: narrowing hypotheses, formalizing checks, reviewing changes, and understanding the local system around the code. It is weaker as a final evidence engine for real exposure, real exploitability, or real blast radius unless an independent verifier checks the claims. Anthropic’s own prompt-injection and security materials point in exactly that direction by emphasizing verifiers, approvals, and constrained real-world action. (anthropic.com)
That is also where a workflow-native pentest platform enters naturally, without turning the discussion into a sales page. Penligent’s public materials position the platform around running tasks, verifying findings, and producing reports against real targets, and one of its own Claude Code comparison pieces frames the split as white-box auditing and patch direction on one side versus black-box proof and re-verification on the other. That is a sensible operational divide. Use a coding agent where code access, local tools, and patch reasoning create leverage. Use a target-facing workflow when the question becomes “Can I prove this is real under current deployment conditions and preserve the evidence.” (penligent.ai)
What Mature Teams Should Change This Quarter
The simplest version of the answer is policy, review, and isolation.
Start by classifying the following as sensitive files: CLAUDE.md, CLAUDE.local.md, .claude/settings*.json, .claude/rules/*.md, .mcp.json, and any hook or helper scripts that Claude Code can invoke. Put them behind code ownership and change review. Review imports. Review nested instruction paths. Review MCP additions. Review permission-mode changes. Review any configuration that touches helper execution or network routing. The public CVE history already shows that these are not decorative files. They are security-relevant inputs. (Claude)
Then lock down runtime defaults. Deny generic network tools in Bash. Prefer WebFetch on approved domains. Turn on sandboxing and set it to fail closed. Use PreToolUse hooks for additional policy enforcement. Use ConfigChange hooks to stop runtime drift. Disable or tightly restrict bypass-permissions modes. And treat non-interactive use as a separate trust tier with stronger controls, not as “the same thing but automated.” Anthropic’s documentation already gives the building blocks. The real work is turning them into defaults. (Claude)
Finally, retrain the mental model. The repo is not passive anymore. In an agentic coding environment, it can carry instructions, rules, permissions, memory, tool integrations, and helper logic that materially affect what the agent sees and what it tries to do. That means repository trust is now partly execution trust, partly policy trust, and partly data trust. If teams keep reviewing only the code that ships to production while ignoring the agent-governance files that shape local and automated action, they are defending the wrong boundary. (Claude)
The Core Lesson
The LayerX disclosure is memorable because it used SQL injection as the demonstration path. The durable lesson is broader. CLAUDE.md is a visible example of something larger: repo-native policy has become executable context for agentic developer tools. Anthropic’s own docs show how much power lives inside that context, from persistent instructions and imports to settings, hooks, MCP servers, and local transcripts. The public CVE trail shows how often security problems in this product class appear exactly where trust is established, skipped, or misapplied. The right response is not panic and not denial. It is to treat the repo as part of the runtime, and to harden it accordingly. (Claude)
Further Reading and References
Anthropic’s official Claude Code overview, memory, security, permissions, hooks, sandboxing, MCP, and auto-mode materials are the best primary sources for how the product is supposed to behave, what files shape behavior, and what guardrails exist in the documented model. (Claude)
LayerX’s original “Vibe Hacking” disclosure is the core public source for the CLAUDE.md security-bypass narrative, the DVWA demonstration, the three threat scenarios, and the recommendation to treat CLAUDE.md like executable code. Hackread’s report is useful as a concise media summary of the same research. (LayerX)
The NVD entries for CVE-2025-59536, CVE-2026-21852, CVE-2026-33068, CVE-2025-54795, CVE-2026-25723, CVE-2026-35021, and CVE-2026-35022 are the most relevant public records for understanding how Claude Code trust, permissions, startup behavior, file boundaries, and helper execution have already produced real vulnerability disclosures. (nvd.nist.gov)
For related Penligent reading that stays close to this topic, the most relevant English pages are Claude Code Security Bypass Research, Claude AI for Pentest Copilot, Building an Evidence-First Workflow With Claude Code, and Claude Code Security and Penligent, From White-Box Findings to Black-Box Proof. The Penligent homepage is the broadest product reference if you need the general workflow context after reading those pieces. (penligent.ai)

