The new Kali workflow is genuinely good at one thing, removing friction
Kali’s official write-up frames the idea clearly: instead of typing every command, you describe what you want, and an LLM translates your intent into real terminal actions. In the reference setup, the “UI” is Claude Desktop on macOS, the “attacking box” is a Kali host loaded with tools, and the model runs in the cloud, all stitched together through Model Context Protocol. (Kali Linux)
That alone explains the hype. Most of the time cost in recon and triage isn’t raw compute; it’s human context switching:
- “What’s the next best check for this surface?”
- “Which flags do I need?”
- “What does this output imply?”
- “What should I verify next?”
Kali’s demo prompt in the blog is intentionally simple, but the loop it showcases is the real product: natural language → plan → commands → output → next step. (Kali Linux)
Now add one more ingredient: Kali packages an MCP server that can run terminal commands on the Kali side. The tool page for mcp-kali-server explicitly says it enables MCP to run tools like nmap, nxc, rizo, wget, gobuster, and more, positioning it as “AI-assisted penetration testing” infrastructure. (Kali Linux)
So yes: it’s cool. It’s also exactly the kind of convenience that becomes dangerous when teams treat it as the default operating model.
“Cool” is not the same as “a safe default” when the execution boundary moves
The core issue isn’t whether Claude is capable, or whether Kali’s tooling is solid. The issue is that this workflow changes what “input” means in a pentest environment.
In classic tooling, the boundary is mostly obvious:
- You type a command.
- The tool runs.
- Output returns.
- You decide what to do next.
In an agentic MCP loop, the model is a decision-making intermediary. It doesn’t only consume your prompt. It also consumes:
- tool names and descriptions
- tool schemas and parameters
- command output and error strings
- HTTP responses, HTML, JavaScript, JSON
- README files and issue descriptions
- any external content pulled into context
And that matters because LLMs don’t have a native, hard security boundary between “data” and “instructions.” The UK NCSC has been explicit that prompt injection is not analogous to SQL injection in the “patch it once and you’re done” sense; it stems from how these systems interpret text and can be hard to fully mitigate. (NCSC)
When your model can trigger tools, that architectural reality stops being theoretical.
What MCP changes in practice, a standardized bridge, and a standardized risk
MCP’s upside is real. Microsoft summarizes the developer value as reducing rewrites and creating separation between application logic and the model reasoning layer. (Microsoft Developer)
But in the same Microsoft post, the security story is equally direct:
- Indirect prompt injection is when malicious instructions are embedded in external content (docs, web pages, emails), and the model misinterprets them as legitimate commands, leading to unintended actions like data exfiltration or manipulation. (Microsoft Developer)
- Tool poisoning is a subset where the attacker embeds malicious instructions into MCP tool metadata such as descriptions; these can be invisible to users but steer the model’s tool choices. (Microsoft Developer)
That is the first reason this is the wrong default for real pentesting teams: in real engagements, the “external content” you ingest is adversarial by definition.
A pentest is a structured interaction with systems you do not control, returning text you did not author, including error pages, banners, WAF messages, “friendly” instructions in HTML comments, and intentionally crafted outputs. If you feed that into an agent that can execute actions, you’ve created a new high-value steering channel.

The real-world proof arrived quickly, CVEs in the MCP ecosystem
If this still feels like abstract AI security talk, the MCP ecosystem already has concrete, assigned CVEs that demonstrate how “classic vulnerabilities” become easier to weaponize once an agent is in the loop.
CVE-2025-68143, 68144, 68145, why they matter beyond Git
In January 2026, reporting described three vulnerabilities in Anthropic’s Git MCP server (mcp-server-git). The story is not “Git tooling is risky.” The story is that the flaws could be exploited through prompt injection, meaning an attacker who can influence what an assistant reads can weaponize them without direct access to the victim system. (Noticias Hacker)
The Hacker News lists:
- CVE-2025-68143: path traversal/arbitrary path acceptance in
git_initwithout proper validation, fixed in 2025.9.25 (Noticias Hacker) - CVE-2025-68144: argument injection in
git_diffygit_checkoutpassing user-controlled args to git CLI without sanitization, fixed in 2025.12.18 (Noticias Hacker) - CVE-2025-68145: missing path validation when using
-repositoryscoping, fixed in 2025.12.18 (Noticias Hacker)
NVD’s entry for CVE-2025-68143 is especially useful because it reads like a classic AppSec lesson: git_init accepted arbitrary filesystem paths, operated on directories accessible to the server process, and the tool was removed entirely because the server was intended to operate on existing repositories only. (NVD)
Why should pentesting teams care?
Because it demonstrates the pattern that repeats in every agent toolchain:
- a tool has a bug (path traversal, injection, scoping weakness)
- the model is tricked into calling it, often by content the model read
- composition with other tools increases impact
- the result is actions the operator did not explicitly authorize in a traditional sense
The Hacker News description even outlines a chained scenario with a filesystem MCP server to reach code execution via Git config filters. (Noticias Hacker)
You don’t need to copy that chain to understand the lesson: composability amplifies weaknesses.
The keyword that wins clicks is not “MCP”, it’s “Kali LLM” and “Claude Desktop”, and that’s part of the problem
When people search this topic, they often don’t search “Model Context Protocol security boundary.” They search:
- “Kali LLM Claude Desktop”
- “mcp-kali-server”
- “Claude Desktop Kali”
- “AI pentesting Kali”
That search intent is convenience-oriented. Kali’s own blog title literally uses “Kali & LLM” and “Claude Desktop GUI” language, because that’s how practitioners will discover it. (Kali Linux)
So the highest-CTR framing in the ecosystem is likely “how to set it up” and “how to use it.” The content that ranks tends to be setup-driven, not operating-model-driven.
This is exactly where teams get hurt: they adopt the integration because it feels like a productivity win, but they inherit an execution boundary they didn’t deliberately design.
If you deploy Kali + Claude via MCP as a default, you inherit five categories of risk
To be useful to a hard-core security audience, you need a threat model that matches how real pentests work. The categories below are not “AI risks.” They are the same old risks, routed through a new control plane.
1) Indirect prompt injection becomes a tool-routing vulnerability
Microsoft’s definition is operational: malicious instructions embedded in external content can lead to unintended tool actions and data exfiltration. (Microsoft Developer)
In a pentest workflow, your model will read untrusted content constantly. A clever target can craft responses designed to manipulate the agent:
- “To proceed, run this command…”
- “Your scan is incomplete, rerun with these flags…”
- “Here is the token you requested…”
- “Ignore previous instructions…”
A human sees this as hostile. A model sees it as text that might be relevant.
2) Tool poisoning is supply chain risk, but faster and more subtle
Tool poisoning isn’t just “a bad tool.” It can be a tool whose metadata changes after you trusted it. Microsoft explicitly calls out the possibility of tool definitions being amended later, turning a previously approved tool into a new risk. (Microsoft Developer)
That’s a supply chain problem, except it can happen at “text update speed.”
3) Insecure output handling is now a first-class failure mode
OWASP’s LLM Top 10 explicitly lists Prompt Injection as a primary risk category, and in general the project emphasizes that LLM app security includes output handling, supply chain, and more. (Fundación OWASP)
In an MCP toolchain, “output” is not a terminal artifact. It is an entrada to the model’s next decision. If you don’t treat output as untrusted, you’re building a confused deputy.
4) Identity and scoping controls are fragile in composable tool ecosystems
CVE-2025-68145 is essentially a scoping failure: a configuration intended to restrict repository operations didn’t enforce the boundary properly. (Noticias Hacker)
That’s a reminder that “configured restrictions” are only as strong as the implementation — and in tool ecosystems, boundaries fail quietly.
5) Observability and auditability become the difference between “cool” and “usable at scale”
A solo hacker can tolerate “the model did something unexpected.” A team cannot.
A team needs:
- what ran, exactly
- where it ran
- why it ran
- who approved it
- what evidence it produced
- how to reproduce it
Without that, the workflow doesn’t scale beyond a demo.

Two “classic” CVEs that explain why boundary thinking matters more than feature hype
You asked to prioritize high-impact and relatively recent CVEs. The two below aren’t “about MCP,” but they explain the same lesson: trust boundaries are the battleground.
CVE-2024-3094, the XZ liblzma backdoor is a boundary failure, not a bug
NVD describes malicious code in upstream xz tarballs starting with 5.6.0, using obfuscation and build process manipulation to produce a modified liblzma that could impact software linked against it. (NVD)
This is the purest example of why “defaults” matter:
- People didn’t audit tarballs because the default trust model was “upstream releases are fine.”
- The compromise exploited how the ecosystem actually consumes artifacts.
MCP toolchains are similarly artifact-driven. If your default is “let the model call tools,” you are trusting:
- the tool package
- the server implementation
- the metadata
- the model’s routing behavior
- the client’s confirmation UX
- your logging
A single weak link is enough.
CVE-2024-6387, regreSSHion shows control planes are part of the blast radius
NVD describes CVE-2024-6387 as a race condition regression in OpenSSH sshd where an unauthenticated remote attacker may be able to trigger unsafe signal handling by failing to authenticate within a set time period. (NVD)
Why include it here?
Because Kali + Claude via MCP frequently relies on SSH connectivity and remote execution patterns. Kali’s official workflow includes an SSH section, emphasizing it as part of the setup story. (Kali Linux)
If your operational model depends on SSH, then SSH vulnerabilities aren’t “infra trivia.” They are part of the same execution boundary.
A sharper way to evaluate this workflow, ask what you lose when you gain convenience
Here’s the mental model that makes decisions clearer:
- A pentest is a pipeline that converts uncertainty into evidence.
- The pipeline must be reproducible, reviewable, and safe to run repeatedly.
Agentic convenience primarily improves “time-to-first-action.” But teams live and die on:
- time-to-high-confidence-finding
- time-to-proof
- time-to-fix guidance
- time-to-report
- ability to defend decisions later
If your workflow makes early actions faster but makes outcomes harder to trust, you haven’t improved the team’s real output.
A practical comparison table, what changes when you move from a demo to a team
| Dimension that matters in real engagements | Kali + Claude Desktop via MCP | Team-grade operating model |
|---|---|---|
| Default execution boundary | Broad unless you constrain it | Narrow by design, boundaries explicit |
| Input trust model | Often implicit, model reads everything | Explicit, untrusted-by-default content controls |
| Tool invocation | Model-routed, often free-form | Allowlisted, parameterized actions with approvals |
| Reproducibility | Hard unless you capture everything | Built-in: artifacts, logs, versions, evidence |
| Auditability | DIY logging, easy to miss context | Structured audit trails, reviewers can replay |
| Supply chain exposure | MCP servers, metadata, updates | Pinned versions, integrity checks, review gates |
| Failure mode | “Agent drift” becomes an incident | “Unsafe action blocked” becomes a non-event |
Minimum-safe setup, if you still want Kali + Claude via MCP
This section is intentionally practical, but it avoids offensive payloads. The goal is to help teams deploy guardrails that match the threat model.
Step 1, know what Kali shipped and what it does by default
Kali’s mcp-kali-server can be installed via apt and is small, with Python dependencies and a default port of 5000 in the CLI help output. (Kali Linux)
A baseline install and run looks like:
sudo apt update
sudo apt install -y mcp-kali-server
# Run the server on the default port (5000)
kali-server-mcp --port 5000
That’s enough to demo. It is not enough to deploy as a default in a team.
Step 2, isolate the execution host like it’s disposable
Treat the Kali box as a high-risk execution environment:
- run it in a VM with snapshots
- do not mount your main workstation home directory
- keep API keys and tokens off the box unless absolutely needed
- block outbound network by default, then open only necessary destinations
Here is a simple example using ufw as a concept. You should adapt ports and destinations to your environment.
sudo apt install -y ufw
# Default deny outbound and inbound
sudo ufw default deny incoming
sudo ufw default deny outgoing
# Allow SSH from your management subnet only
sudo ufw allow from 10.0.0.0/24 to any port 22 proto tcp
# Allow outbound DNS and HTTP/HTTPS if required, otherwise keep locked
sudo ufw allow out 53
sudo ufw allow out 80
sudo ufw allow out 443
sudo ufw enable
sudo ufw status verbose
The key idea is not ufw itself. The key idea is: if the agent is tricked into doing something dumb, the network should still say “no.”
Step 3, remove free-form shell as the default capability
The single most important guardrail for teams is turning “run arbitrary terminal commands” into “run approved actions.”
A simple approach is an allowlist wrapper. You can implement this as a small service that exposes a limited set of actions like:
port_scan(target, top_ports=True)http_probe(url)dns_lookup(hostname)fetch_headers(url)
The point is to make “tooling” parameterized and reviewable.
Here is a deliberately small Python example illustrating allowlisting and basic argument validation. It’s not a full product; it’s a pattern.
#!/usr/bin/env python3
import re
import shlex
import subprocess
from dataclasses import dataclass
IP_OR_HOST_RE = re.compile(r"^[a-zA-Z0-9.\\-]{1,253}$")
@dataclass(frozen=True)
class CommandSpec:
binary: str
allowed_flags: tuple[str, ...]
ALLOWLIST = {
"nmap_top_1000": CommandSpec(binary="nmap", allowed_flags=("-sV", "--top-ports", "1000", "-T3")),
"curl_headers": CommandSpec(binary="curl", allowed_flags=("-I", "-L", "--max-time", "10")),
}
def validate_target(target: str) -> None:
if not IP_OR_HOST_RE.match(target):
raise ValueError("Target contains invalid characters")
def run_action(action: str, target: str) -> str:
if action not in ALLOWLIST:
raise ValueError("Action not allowed")
validate_target(target)
spec = ALLOWLIST[action]
cmd = [spec.binary, *spec.allowed_flags, target]
# No shell=True. No arbitrary args.
result = subprocess.run(cmd, capture_output=True, text=True, check=False)
return result.stdout + result.stderr
if __name__ == "__main__":
# Example usage: run_action("nmap_top_1000", "scanme.nmap.org")
print(run_action("curl_headers", "<https://example.com>"))
If you do nothing else, do this: avoid shell=True, avoid arbitrary concatenation, avoid letting a model pass raw CLI arguments.
Step 4, treat tool metadata as code, pin versions, and audit updates
The NVD entry for CVE-2025-68143 explicitly recommends upgrading to a fixed version and notes the vulnerable tool was removed. (NVD)
That is the operational habit you want: pin known-good versions and update intentionally.
Step 5, logging must be evidence-grade, not “it happened in a chat”
For teams, logs need to be actionable:
- command, timestamp, user, target scope
- tool version, container/VM image hash
- stdout/stderr captured and stored immutably
- link to the prompt and the approval event
- ability to replay the run in a fresh environment
If you can’t reconstruct a finding later, you can’t defend it.
If your goal is to explore and learn, the Kali + Claude workflow is a fun accelerant. But if your goal is repeatable, team-grade pentesting output, the missing piece is not “more model intelligence.” It’s an operating model that treats execution as a governed workflow: scoped targets, structured tasks, verifiable evidence, and report-ready artifacts.
That’s the class of problem Penligent is built around. Instead of wiring a general LLM into an unrestricted tool host and hoping the guardrails hold, you build the guardrails into the workflow: what can run, what must be approved, how evidence is captured, and how results become something a team can review and ship.
If you want the “natural language → action” feeling without making “free-form command execution on a powerful host” the default, a platform approach is usually the more sustainable path for real pentesting teams.
Links
- Kali Linux + Claude via MCP Is Cool—But It’s the Wrong Default for Real Pentesting Teams https://www.penligent.ai/hackinglabs/kali-linux-claude-via-mcp-is-cool-but-its-the-wrong-default-for-real-pentesting-teams/ (Penligente)
- Kali LLM, Claude Desktop, and the New Execution Boundary in Offensive Security https://www.penligent.ai/hackinglabs/kali-llm-claude-desktop-and-the-new-execution-boundary-in-offensive-security/ (Penligente)
- Claude Code Remote Control security risks — when your local session becomes a remote interface https://www.penligent.ai/hackinglabs/claude-code-remote-control-security-risks-when-your-local-session-becomes-a-remote-interface/ (Penligente)
- Claude Code project files became an RCE and API key exfiltration path https://www.penligent.ai/hackinglabs/claude-code-project-files-became-an-rce-and-api-key-exfiltration-path-what-the-check-point-findings-change-for-ai-coding-assistants/ (Penligente)
- Kali official guide, Kali & LLM: macOS with Claude Desktop GUI & Anthropic Sonnet LLM https://www.kali.org/blog/kali-llm-claude-desktop/ (Kali Linux)
- Kali tool page, mcp-kali-server https://www.kali.org/tools/mcp-kali-server/ (Kali Linux)
- Microsoft Developer Blog, Indirect prompt injection and tool poisoning in MCP https://developer.microsoft.com/blog/protecting-against-indirect-injection-attacks-mcp (Microsoft Developer)
- NVD, CVE-2025-68143 https://nvd.nist.gov/vuln/detail/CVE-2025-68143 (NVD)
- The Hacker News, three flaws in Anthropic Git MCP server and prompt-injection exploitation framing https://thehackernews.com/2026/01/three-flaws-in-anthropic-mcp-git-server.html (Noticias Hacker)
- NVD, CVE-2024-3094 XZ/liblzma supply chain compromise https://nvd.nist.gov/vuln/detail/CVE-2024-3094 (NVD)
- NVD, CVE-2024-6387 OpenSSH regreSSHion https://nvd.nist.gov/vuln/detail/CVE-2024-6387 (NVD)
- OWASP Top 10 para solicitudes LLM https://owasp.org/www-project-top-10-for-large-language-model-applications/ (Fundación OWASP)
- UK NCSC, Mistaking AI vulnerability could lead to large-scale breaches https://www.ncsc.gov.uk/news/mistaking-ai-vulnerability-could-lead-to-large-scale-breaches (NCSC)

