For a while, “AI in cyber security” was treated like a branding exercise. Vendors stapled a chatbot onto an alert queue, called it autonomous, and hoped nobody looked too closely. That stage is over. The current evidence from Microsoft, Google Cloud, IBM, NIST, OWASP, and MITRE all points to a harsher reality: AI is now meaningful in cyber security, but only when it is tied to concrete operational problems such as detection speed, investigation depth, identity abuse, governance, data protection, and validation of real attack paths. At the same time, those same sources are equally clear that AI expands the attack surface, accelerates attacker tradecraft, and introduces failure modes that do not map cleanly to older software-security playbooks. (NIST)
That is the right way to frame the subject. AI in cyber security is not one thing. Cisco defines it in practical terms as the use of algorithms and machine learning techniques to analyze large, complex data sets, identify patterns and anomalies, and recommend ways to reduce risk. NIST’s emerging Cyber AI Profile sharpens the picture further by separating the space into three risk domains: the cybersecurity of AI systems, AI-enabled cyber attacks, and AI-enabled cyber defense. That distinction matters because a team can be excellent at using AI inside the SOC and still be dangerously weak at securing the AI systems it has deployed. (Cisco)
The strongest public material currently surfacing around this topic keeps circling the same themes for a reason. Microsoft focuses on threat triage, phishing at scale, and autonomous assistance for defenders. Google emphasizes reducing analyst toil, accelerating investigation and detection engineering, and defending against identity-heavy campaigns. IBM leans into governance, access controls, data protection, and measurable savings from security AI when it is used extensively and controlled properly. OWASP, NIST, CISA, and MITRE add the part that vendor marketing often softens: prompt injection, data and model poisoning, supply chain risk, model-serving vulnerabilities, and new AI-native abuse cases are already real enough to require formal frameworks and shared taxonomies. (Google Cloud)
The result is that the conversation has matured. A serious team is no longer asking whether AI belongs in the stack. It is asking where AI produces a clear operational advantage, where humans must remain primary decision-makers, which AI workloads deserve the same hardening as any other critical service, and how to test both classic infrastructure and AI-enabled workflows under real adversarial pressure. That is the version of the topic worth publishing for security engineers, red teams, pentesters, and bug bounty practitioners.

What AI in cyber security actually means
At an engineering level, AI in cyber security now spans at least four different layers. The first is statistical detection and anomaly analysis, which existed long before the current generative wave. The second is analyst augmentation: natural-language querying, case summarization, log explanation, detection-rule generation, and threat-investigation assistance. The third is orchestration and automation, where AI helps correlate signals, prioritize incidents, recommend actions, and sometimes execute bounded workflows. The fourth is autonomous or semi-autonomous offensive and defensive action, where the system does more than summarize — it preserves state, uses tools, interacts with targets, and produces evidence that another engineer can verify. Public vendor and standards material increasingly treats these layers differently, because the risks and benefits are not interchangeable. (Google Cloud)
This is where a lot of public writing still goes wrong. A model that explains a SIEM alert in plain English is not equivalent to a system that can generate reliable detections. A model that writes a decent KQL query is not the same as one that can validate a suspicious identity path across logs, IAM policy, and cloud control-plane events. And an AI assistant that suggests a next step to a pentester is not the same as a platform that can maintain authenticated state, pivot across a workflow, prove exploitability, and produce a reproducible report. The difference is not semantic. It is operational. It determines whether the system reduces workload or merely changes its shape. (Microsoft Learn)
The most useful way to think about the space is to separate “language convenience” from “security capability.” Natural-language interfaces are valuable. They let junior analysts interrogate a data lake, translate suspicious scripts, or understand detection logic faster. Microsoft explicitly frames Security Copilot around investigating and remediating threats, building KQL queries, and analyzing suspicious scripts. Google similarly highlights natural-language querying, detection-rule creation, playbook building, and faster incident resolution in SecOps. Those are real gains. But they are not the entire story, and they should not be confused with autonomous security competence. (Microsoft Learn)
The same distinction applies on the offensive side. Today’s market still uses “AI pentest tool” loosely, but the technically relevant public material increasingly converges on a harder definition: a real system must carry out recognizable penetration-testing work, maintain context, handle stateful applications, reason over attack paths, and produce evidence rather than elegant summaries. That framing is especially important because many teams are now evaluating whether AI belongs only in detection and response, or whether it should also be trusted to validate exposure continuously in staging and production-like environments. (Penligente)
The table below is a practical way to separate the categories.
| Layer | What it does well | Where it fails |
|---|---|---|
| Statistical and ML detection | Finds anomalies, scores behavior, correlates high-volume signals | Struggles with context, business logic, and trustworthy explanations |
| Analyst copilot | Summarizes incidents, writes queries, explains scripts, reduces toil | Can hallucinate and favor polished language over precision |
| AI-assisted orchestration | Builds playbooks, correlates multi-step events, recommends response actions | Can overreach when permissions, data quality, and guardrails are weak |
| Autonomous validation and testing | Maintains workflow state, proves impact, generates evidence, retests continuously | Requires strong controls, bounded scope, logging, and human review |
That table is synthesis, but it maps closely to the way current major vendors and standards bodies describe the space: bounded assistant use is already mainstream; deeper autonomy is emerging; and governance, validation, and risk management become more important as the system gains more access, more memory, and more ability to act. (IBM)

Where AI already delivers real defensive value
The least controversial value of AI in cyber security is speed. Not “speed” in the vague product-marketing sense, but speed in the places where defenders are chronically overloaded: alert triage, investigation assistance, correlation of related events, and compression of repetitive analyst work. IBM describes AI-powered security as a way to improve the speed, accuracy, and productivity of security teams, and notes that AI-powered risk analysis can produce higher-fidelity summaries and accelerate alert investigations and triage. Google makes a similar case for SecOps, describing how Gemini can reduce toil, simplify interaction with security data, and help teams create detections, build playbooks, and investigate threats faster. (IBM)
Microsoft’s public numbers explain why this matters. In March 2025, Microsoft said it was detecting more than 30 billion phishing emails across 2024, processing 84 trillion threat signals per day, and seeing about 7,000 password attacks every second. Those numbers do not mean AI magically solves the problem. They do mean the old assumption that humans will manually inspect the majority of this surface is gone. At this scale, AI’s immediate value is not that it replaces analysts. It makes it possible for analysts to spend their time where human judgment actually matters: unusual incidents, cross-system ambiguity, edge-case access paths, and complex remediation. (Microsoft)
This is why the best security-AI deployments do not start with autonomy. They start with compression. Can the model summarize ten related alerts into one incident hypothesis? Can it write a first-pass query that a human tunes rather than starting from scratch? Can it translate a suspicious PowerShell or Bash fragment into readable intent? Can it surface the likely blast radius faster than an analyst manually jumping across tabs? Those are narrow wins, but they are operationally meaningful because they reclaim analyst hours without pretending the model is a final authority. Microsoft explicitly positions Security Copilot around this kind of work: triage, remediation guidance, query generation, and script analysis. (Microsoft Learn)
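The compression idea is easy to sketch without any model at all. The snippet below clusters related alerts by shared entity within a rolling time window, producing the "one incident hypothesis" an analyst or copilot would then reason about. The alert records, field names, and detection-rule labels are all illustrative assumptions, not any vendor's schema.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical alert records; field names and rule labels are illustrative.
ALERTS = [
    {"id": 1, "time": datetime(2025, 3, 1, 9, 0), "entity": "svc-backup", "rule": "impossible_travel"},
    {"id": 2, "time": datetime(2025, 3, 1, 9, 4), "entity": "svc-backup", "rule": "new_mfa_device"},
    {"id": 3, "time": datetime(2025, 3, 1, 9, 7), "entity": "svc-backup", "rule": "mass_download"},
    {"id": 4, "time": datetime(2025, 3, 1, 14, 0), "entity": "alice", "rule": "failed_login"},
]

def compress(alerts, window=timedelta(minutes=30)):
    """Cluster alerts that share an entity and fall within one rolling window."""
    by_entity = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a["time"]):
        by_entity[alert["entity"]].append(alert)
    incidents = []
    for entity, items in by_entity.items():
        cluster = [items[0]]
        for alert in items[1:]:
            if alert["time"] - cluster[-1]["time"] <= window:
                cluster.append(alert)
            else:
                incidents.append((entity, cluster))
                cluster = [alert]
        incidents.append((entity, cluster))
    return incidents

incidents = compress(ALERTS)  # svc-backup's three alerts become one hypothesis
```

The win is not the clustering itself, which is trivial; it is that the model now summarizes one coherent cluster instead of ten disconnected alerts.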
The next level of value is sequence detection. Many important attacks are not hidden because any one event is invisible. They are missed because the meaningful pattern is distributed across time, identities, APIs, services, and hosts. AWS’s GuardDuty Extended Threat Detection is a good example of how cloud defenders are operationalizing AI here. AWS describes AI and ML being used to correlate multiple signals into attack-sequence findings that can span privilege discovery, API manipulation, persistence activity, and exfiltration, with natural-language summaries and ATT&CK-aligned remediation guidance. That is exactly the kind of task where pattern recognition across sprawling telemetry is useful. (Amazon Web Services, Inc.)
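The core of sequence detection is an ordered-subsequence match over an identity's event stream, with unrelated noise allowed in between. The sketch below shows that shape under illustrative, loosely ATT&CK-flavored stage labels; it is not how GuardDuty is implemented, only the pattern the AWS description points at.

```python
# Flag an identity whose events cover a suspicious progression, in order.
SEQUENCE = ["discovery", "persistence", "exfiltration"]

def matches_sequence(events, sequence=SEQUENCE):
    """True if the event stream contains the stages as an ordered subsequence."""
    stages = iter(event["stage"] for event in events)
    # Membership tests consume the iterator, so order is enforced for free.
    return all(stage in stages for stage in sequence)

events = [
    {"stage": "discovery"},      # e.g. enumerating users and policies
    {"stage": "benign"},         # unrelated noise in between is fine
    {"stage": "persistence"},    # e.g. creating a new access key
    {"stage": "exfiltration"},   # e.g. unusual bulk object reads
]
```

No single event here is alarming on its own; the finding only exists at the sequence level, which is exactly why this class of correlation is worth automating.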
Identity defense is another area where AI is already justified, partly because the threat landscape has moved there so aggressively. Google Cloud highlighted that stolen credentials, phishing, brute-force, and other identity-based vectors accounted for 37 percent of successful breaches in 2024, citing Mandiant’s latest M-Trends findings. That number lines up with a broader industry shift away from malware-first thinking toward identity abuse, session theft, infostealers, and privilege misuse. In this environment, AI earns its keep when it helps security teams recognize the difference between a noisy login event and a meaningful credential-risk pattern tied to the user, device, session, and downstream behavior. (Google Cloud)
That identity shift also explains why AI is increasingly valuable outside the SOC dashboard itself. It belongs in access review, non-human identity governance, privileged-account analysis, and session anomaly detection. IBM’s 2025 breach-cost material emphasizes that organizations seeing AI-related incidents often lacked proper AI access controls, and it explicitly ties identity security and modern phishing-resistant authentication to practical risk reduction. If an organization adds AI agents, model-serving endpoints, or retrieval workflows without tightening identity boundaries, it is not modernizing security. It is multiplying the number of ways a compromised token can become a larger incident. (IBM)
A third mature use case is data security. This does not get the same attention as flashy model demos, but it matters more. IBM’s 2025 breach report frames the current moment as an “AI oversight gap,” highlighting that ungoverned AI systems are more likely to be breached and more costly when they are. The same report says 97 percent of organizations that reported an AI-related security incident lacked proper AI access controls, while 63 percent lacked AI governance policies to manage AI or prevent shadow AI. Those numbers explain why AI in cyber security cannot be reduced to detection models alone. It must include data discovery, classification, access controls, key management, and visibility into where AI workloads are actually running. (IBM)
The defensive use cases that keep showing up across major vendors are not random. They map to real pain points that have survived every hype cycle: too much telemetry, too many repetitive tasks, too much delay between signal and meaning, too little visibility into risky access, and too much difficulty connecting scattered events into one defensible incident narrative. AI helps most where the problem is volume plus pattern plus language. It helps least where the problem is trust, intent, and business context that exists only in people, change-management systems, or application-specific workflows.
The attacker benefits too, and that changes the baseline
Any honest article about AI in cyber security has to say this plainly: defenders are not the only side getting leverage. CrowdStrike’s 2025 Global Threat Report said GenAI-powered deception was rising, reported a 442 percent increase in vishing, and described broader growth in malware-free, identity-based attacks. Google Cloud’s threat-intelligence material likewise says state-backed actors and cybercriminals are integrating AI across the attack lifecycle, moving beyond simple productivity gains. Microsoft’s Digital Defense Report 2025 includes a case study in which a criminal network exploited stolen API keys to bypass AI safety controls and generate abusive AI-produced images. AI is not only a shield. It is a force multiplier for misuse, scale, deception, and experimentation. (CrowdStrike)
The easiest place to see this is social engineering. Phishing always depended on language quality, contextual plausibility, and scale. Generative models reduce the cost of all three. They help attackers write cleaner phishing emails, translate content naturally, mimic role-appropriate tone, localize messages, and iterate lures faster. When combined with deepfake audio or synthetic identity enrichment, the problem shifts from “spot the broken English email” to “recognize manipulation inside a message that looks entirely normal.” CrowdStrike’s focus on AI-powered deception and Google’s emphasis on AI-enhanced phishing and vishing are both signals that social engineering is becoming less syntactically obvious and more operationally convincing. (CrowdStrike)
But language quality is only part of the shift. The more consequential change is that models increasingly help attackers operationalize environment-specific actions. Google Threat Intelligence Group wrote in late 2025 that it had observed malware families using LLMs during execution, dynamically generating malicious scripts, self-modifying for evasion, and receiving commands from AI models rather than only from traditional command-and-control infrastructure. GTIG’s PROMPTSTEAL example is important not because every attacker now has an LLM inside malware, but because it shows the direction of travel: models are beginning to influence runtime behavior, not just pre-attack planning. (Google Cloud)
That matters for defenders because it compresses the time between reconnaissance and action. An attacker no longer needs perfect scripting skill to adapt commands to a new environment. An AI-assisted workflow can help generate discovery commands, interpret tool output, summarize likely privilege paths, and convert messy notes into repeatable actions. That does not make every attacker sophisticated. It does make mediocre attackers faster. And mediocre attackers at scale are a big problem. Google’s 2026 and 2025 security forecasts both warn that phishing, vishing, SMS fraud, and other social-engineering attacks are increasingly AI-enhanced, while AWS has already described real campaigns where AI helped a relatively unsophisticated actor exploit weak fundamentals at scale. (Google Cloud)
This is why “AI changed everything” is both wrong and directionally useful. Wrong, because attackers still win through familiar failures: weak credentials, over-privileged access, exposed management surfaces, unsafe deserialization, broken session handling, and poor asset visibility. Useful, because AI helps attackers discover and chain those weaknesses faster, and helps them look more convincing while doing it. AWS’s write-up on AI-augmented attacks against FortiGate devices makes exactly that point: the campaign still depended on exposed interfaces, weak credentials, and single-factor authentication. AI did not replace fundamentals. It accelerated abuse of bad fundamentals. (Amazon Web Services, Inc.)
The practical lesson is that AI in cyber security does not replace conventional security engineering. It raises the cost of not doing it. If your environment is already brittle, AI increases the attacker’s return on that brittleness. If your environment is well controlled, AI can help you defend it more efficiently. The same technology amplifies whoever has better operational discipline.

AI systems are now part of the attack surface
The biggest mistake in the current market is treating AI only as a security tool rather than also as a security subject. NIST’s Cyber AI Profile makes the distinction explicit by naming the cybersecurity of AI systems as a risk domain in its own right, separate from AI-enabled attacks and AI-enabled defense. NIST’s Generative AI Profile adds that generative AI can create risks that are novel or that intensify traditional software risks, and it organizes risk management across governance, mapping, measurement, and management over the AI lifecycle. That language is not abstract policy filler. It is a practical warning that AI systems do not merely inherit normal software risks. They can enlarge them, connect them, or create new paths for them to matter. (NCCoE)
CISA, NSA, FBI, and partner agencies made the same point in their joint guidance on deploying AI systems securely. Their 2024 guidance says rapid adoption makes AI capabilities valuable targets for malicious actors, including attackers who may seek to co-opt deployed AI systems for malicious ends. The guidance is explicit that organizations should apply traditional IT best practices to AI systems, but it also stresses threat models, governance, model-source review, validation before deployment, and strict access controls and API security with least privilege and defense in depth. In other words, an AI deployment is not just another application rollout. It is a privileged system that often touches sensitive data, external content, and automated actions at the same time. (U.S. Department of War)
That is why the secure-AI-development guidance from the UK NCSC and international partners remains useful even in 2026. It structures AI security around four lifecycle areas: secure design, secure development, secure deployment, and secure operation and maintenance. It specifically calls out threat modeling, supply-chain security, incident management, protecting infrastructure and models from compromise, and making it easy for users to do the right thing. Those are boring phrases until you remember what modern AI systems actually do: call tools, consume untrusted content, store prompts, handle embeddings, manage tokens, and sometimes take actions in production systems. Once you see the full workflow, the need for lifecycle discipline stops looking optional. (NCSC)
OWASP’s 2025 Top 10 for LLMs and GenAI applications helps translate that lifecycle view into risks engineers can work with. The project highlights prompt injection, sensitive information disclosure, supply-chain risk, data and model poisoning, improper output handling, excessive agency, system prompt leakage, vector and embedding weaknesses, misinformation, and unbounded consumption. Those risks matter because they describe where modern AI applications routinely blur boundaries that older application-security models assumed were clearer: instruction versus data, retrieval versus trust, model output versus executable action, and access to information versus authority to act on it. (OWASP GenAI Security Project)
The risk most engineers now recognize first is prompt injection. OWASP defines it as a vulnerability in which user inputs alter model behavior or output in unintended ways, including bypassing guidelines, enabling unauthorized access, or influencing critical decisions. What makes prompt injection serious is not that a model says something silly. It is that a model can be manipulated through data it is supposed to process, then pass that manipulation into tools, APIs, or downstream workflows. Once a retrieval pipeline, email parser, browser agent, ticket bot, or SOC assistant begins to trust model reasoning without enough isolation and validation, instruction and content become dangerously entangled. (OWASP GenAI Security Project)
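Two of the standard mitigations can be sketched in a few lines: mark retrieved content as data with explicit delimiters, and enforce a tool allowlist outside the model so that model output can never expand it. All names, patterns, and the delimiter format below are illustrative assumptions, and the regex is only a cheap pre-filter, not a real defense on its own.

```python
import re

# Crude signature filter; real deployments layer isolation and human review.
SUSPICIOUS = re.compile(
    r"ignore (all|previous) instructions|reveal the system prompt",
    re.IGNORECASE,
)

def wrap_untrusted(doc: str) -> str:
    # Delimiters keep a visible boundary between instructions and data.
    return f"<untrusted_document>\n{doc}\n</untrusted_document>"

def looks_injected(doc: str) -> bool:
    return bool(SUSPICIOUS.search(doc))

# The allowlist lives outside the model and is enforced by ordinary code.
ALLOWED_TOOLS = {"search_tickets", "summarize_incident"}

def authorize_tool_call(tool_name: str) -> bool:
    # Whatever the model "decides", only allowlisted tools ever execute.
    return tool_name in ALLOWED_TOOLS
```

The important line is the last one: the authorization decision is deterministic code, so a successful injection can at worst request an allowed tool, not invent a new one.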
Excessive agency is the next risk that deserves more attention than it gets. OWASP frames it as the problem of giving an LLM-based system too much ability to act. In practice, this is where many “agentic AI” deployments become fragile. An assistant that can read a message is one thing. An agent that can open files, call tools, pull containers, edit tickets, invoke shell commands, approve access changes, or operate across SaaS systems is something else entirely. As soon as the system can act, not just advise, the question becomes: what is the authorization boundary, where is the approval checkpoint, how is intent verified, and what happens when the model is wrong for reasons that look plausible. (OWASP GenAI Security Project)
System prompt leakage and vector or embedding weaknesses further prove that AI systems do not fit neatly inside old categories. A leaked system prompt may reveal hidden instructions, business rules, tool affordances, or defense assumptions. Weaknesses in embeddings or retrieval layers can corrupt the knowledge the model sees, poison ranking, or smuggle malicious content into trusted contexts. NIST’s GenAI Profile is useful here because it insists that organizations think about risk by lifecycle stage, by model and system level, by application level, and by ecosystem level. That broader lens is necessary because many AI failures do not live in one place. They emerge from the interaction between data, model, retrieval, permissions, and user workflow. (OWASP GenAI Security Project)
MITRE ATLAS helps make the same point from a threat-informed-defense angle. ATLAS exists as a knowledge base of adversary tactics and techniques for AI-enabled systems, and MITRE’s 2025 Secure AI work says the adoption of AI introduces expanded threat landscape and new vulnerabilities, requiring incident sharing, verifiable AI vulnerability discovery, and AI red-teaming methods. This is one of the clearest signs that AI security has left the “interesting research topic” phase. Mature defenders now need common language for threats to AI systems and malicious use of AI in cyber operations, not just generic advice to “monitor your models.” (MITRE ATLAS)
A useful way to interpret the standards material is this: AI systems are not fragile because they are magical. They are fragile because they combine several historically hard problems into one place. They combine privileged access, complex supply chains, ambiguous instructions, external data, probabilistic behavior, and sometimes autonomous action. That makes them worth treating like critical infrastructure even when they arrive in the organization through a “productivity” budget line rather than a security or platform-engineering one.
The OWASP-style risk map in practical terms
| Risk area | What it looks like in production | Why it matters |
|---|---|---|
| Prompt injection | Hidden or explicit instructions in documents, tickets, websites, emails, logs, or user messages alter model behavior | Can turn untrusted content into unauthorized actions or data leakage |
| Sensitive information disclosure | Prompts, context windows, retrieval results, logs, or outputs expose secrets or regulated data | Can leak credentials, PII, trade secrets, or system instructions |
| Supply chain | Models, plugins, agent tools, registries, Python packages, model repositories, and container images are compromised or weakly vetted | Lets attackers enter through dependencies rather than the front door |
| Data and model poisoning | Training, fine-tuning, or retrieval corpora are manipulated | Corrupts future outputs and may introduce backdoors or systematic bias |
| Excessive agency | Agents can act too broadly without strong checks | Converts model mistakes into operational incidents |
| System prompt leakage | Hidden instructions are exposed | Reveals design assumptions, internal rules, and possible bypass routes |
| Vector and embedding weaknesses | Retrieval ranking or knowledge injection behaves unexpectedly | Lets bad data distort what the model “knows” at decision time |
| Unbounded consumption | Queries or workflows drive runaway usage | Creates cost, availability, and denial-of-service risks |
That table is a direct engineering reading of the OWASP 2025 LLM risk set plus the lifecycle and governance framing from NIST and the secure-deployment guidance from government partners. The important point is not memorizing the labels. It is realizing that these risks sit at the seams — between model and tool, prompt and policy, data and action, retrieval and trust. (OWASP GenAI Security Project)

The AI stack is shipping ordinary bugs with extraordinary consequences
One of the healthiest changes in the conversation over the last year is that security engineers have stopped treating AI infrastructure as intellectually separate from software security. The recent CVE record around model-serving stacks makes the real lesson obvious: many AI systems fail in very familiar ways. Authentication bypass. Command injection. Unsafe deserialization. Unvalidated input. Dangerous defaults on exposed network sockets. The novelty is not the vulnerability class. The novelty is the blast radius when the vulnerable service sits in front of model management, tool execution, sensitive prompts, or distributed inference infrastructure. (NVD)
Consider Ollama MCP Server CVE-2025-15063. NVD describes it as a command-injection remote code execution vulnerability in the execAsync implementation, exploitable without authentication because user-supplied input was not properly validated before being used in a system call. There is nothing mystical about that bug class. It is the kind of issue security teams have dealt with for years. But in an AI context, the implications are broader because MCP-style systems often exist exactly to connect models to tools and execution paths. A bug there does not merely crash a UI. It can transform an orchestration layer into an attacker-controlled bridge. (NVD)
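The bug class is easy to restate in Python terms, independent of the actual execAsync code. The difference is between interpolating untrusted input into a shell string and passing it as a single validated argument. The `ollama pull` command shape and the allowlist pattern below are illustrative assumptions, not the patched implementation.

```python
import re

user_input = "llama3; cat /etc/passwd"        # hostile input riding on a model name

unsafe_cmd = f"ollama pull {user_input}"      # a shell would execute both commands
safe_argv = ["ollama", "pull", user_input]    # argv form: input stays one argument

# Assumed shape for legitimate model names; validate before any system call.
VALID_MODEL = re.compile(r"^[\w.\-:/]+$")

def validated_model_name(name):
    if not VALID_MODEL.fullmatch(name):
        raise ValueError(f"rejected model name: {name!r}")
    return name
```

Even with the argv form, validation still matters: the input may be safe for the shell but still hostile to whatever consumes it downstream.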
Ollama’s CVE-2025-63389 tells a different but equally important story. NVD describes a critical authentication-bypass issue in API endpoints before and including v0.12.3, where multiple model-management endpoints were exposed without required authentication, enabling unauthorized operations. Again, the root lesson is familiar: if your control plane is reachable and not properly authenticated, you do not have a control plane. In an AI setting, that can mean unauthorized model pulls, model changes, manipulation of serving behavior, or indirect access to broader infrastructure assumptions that teams mistakenly think are “internal.” (NVD)
The token-exposure issue tracked as CVE-2025-51471 reinforces how subtle implementation details can become security boundaries. NVD says Ollama 0.6.7 allowed cross-domain token exposure and access-control bypass through a malicious realm value in a WWW-Authenticate header returned by /api/pull. On paper, that sounds narrower than command injection. In practice, it is exactly the sort of authentication-surface bug that matters in environments where models, registries, and pull workflows form a trusted supply chain. If the chain is weak at the point where credentials are negotiated or reused, the model layer becomes another place for session abuse. (NVD)
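The control this bug class calls for is small but easy to skip: before presenting a token to whatever realm a WWW-Authenticate header advertises, pin the realm's host against a trust list. The hostname below is a made-up placeholder, not Ollama's actual registry configuration.

```python
from urllib.parse import urlparse

# Hosts we are willing to send registry credentials to (illustrative).
TRUSTED_REALM_HOSTS = {"registry.example.internal"}

def realm_is_trusted(realm_url):
    """Reject any advertised realm that is not HTTPS on a pinned host."""
    parsed = urlparse(realm_url)
    return parsed.scheme == "https" and parsed.hostname in TRUSTED_REALM_HOSTS
```

The general lesson is that server-supplied authentication metadata is attacker-influenced input, and token destinations deserve the same allowlisting as redirect targets.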
The vLLM family of recent CVEs is even more revealing because it shows how performance-oriented AI infrastructure can inherit classic unsafe-deserialization and unsafe-loading problems. CVE-2025-32444 describes remote code execution in vLLM deployments using Mooncake because pickle-based serialization was used over unsecured ZeroMQ sockets listening on all network interfaces. CVE-2025-29783 covers a closely related Mooncake distributed-host RCE path through unsafe deserialization over ZMQ/TCP. CVE-2024-11041 documents RCE in MessageQueue.dequeue() due to direct pickle.loads on received socket data. None of these issues require a new theory of AI risk. They require remembering that “distributed inference stack” is still software, still exposed to hostile inputs, and still subject to the oldest rule in the book: do not deserialize untrusted data. (NVD)
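Why pickle over a network socket equals remote code execution is worth seeing once. A pickle payload names a callable and its arguments, and `pickle.loads` invokes it during deserialization, with no "execute" API ever called. The sketch below uses the harmless `abs` where a real exploit would use an `os.system`-style callable, and contrasts it with a schema-checked JSON message, the boring safe channel for queue traffic.

```python
import json
import pickle

class Payload:
    def __reduce__(self):
        # pickle will call abs(-7) during loads(); an attacker substitutes
        # a dangerous callable, which is exactly how these RCEs work.
        return (abs, (-7,))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)   # the attacker-chosen callable has already run

# A schema-checked JSON message carries data only, never callables.
def parse_message(raw):
    msg = json.loads(raw)
    if set(msg) != {"op", "size"} or msg["op"] not in {"enqueue", "dequeue"}:
        raise ValueError("malformed message")
    return msg
```

The message schema here is invented for illustration; the point is that JSON plus validation bounds what a hostile peer can express, while pickle bounds nothing.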
CVE-2025-62164 makes the picture more modern. NVD says vLLM versions from 0.10.2 to before 0.11.1 contained a memory-corruption issue in the Completions API because user-supplied prompt embeddings were loaded with torch.load() without sufficient validation, and a PyTorch 2.8.0 behavior change around sparse tensor integrity checks made out-of-bounds writes possible, leading to denial of service and potentially remote code execution. This is a particularly valuable example because it shows how AI systems accumulate risk through the interaction of model-serving logic, framework defaults, and input pathways that security teams may not yet be monitoring with the same rigor as classic upload or RPC surfaces. (NVD)
The right takeaway from these CVEs is not “AI is insecure.” The right takeaway is that AI infrastructure deserves the same threat modeling, asset inventory, exposure review, patch discipline, and attack-surface reduction that any internet-reachable or privileged service deserves — and probably more. Government guidance on AI deployment security explicitly says to review the source of AI models and supply-chain security, validate the AI system before deployment, and enforce strict access controls and API security. Those recommendations read almost like a direct answer to the classes of bugs now showing up in model-serving ecosystems. (U.S. Department of War)
Here is a condensed CVE map that security teams can use for prioritization.
| CVE | Component | Core issue | Security lesson |
|---|---|---|---|
| CVE-2025-15063 | Ollama MCP Server | Command injection, unauthenticated RCE | Tool-bridge layers need strict input validation and exposure control |
| CVE-2025-63389 | Ollama | Authentication bypass on API endpoints | Model-management planes must be authenticated and segmented |
| CVE-2025-51471 | Ollama | Cross-domain token exposure | Registry and pull workflows are identity boundaries, not convenience features |
| CVE-2025-32444 | vLLM with Mooncake | Pickle over unsecured ZeroMQ, RCE | Never trust internal-only assumptions in distributed inference |
| CVE-2025-29783 | vLLM with Mooncake | Unsafe deserialization over ZMQ/TCP | Model-serving clusters need network scoping and serialization hygiene |
| CVE-2024-11041 | vLLM | pickle.loads in MessageQueue, RCE | Queue and IPC paths need the same review as external APIs |
| CVE-2025-62164 | vLLM | Unvalidated torch.load() path, memory corruption, possible RCE | Embedding and model-input pathways are security-critical parsing surfaces |
That list is more than a patch queue. It is a design warning. If your organization is adopting local models, agent runtimes, retrieval systems, or distributed inference, “AI security” is not only about prompt injection. It is also about the very old and very real engineering work of closing control-plane exposure, eliminating unsafe deserialization, hardening internal services, and refusing to treat model-serving software as trusted simply because it is part of the AI stack. (NVD)
What a real AI security program looks like
The most common mistake organizations make is buying AI capability before assigning AI accountability. The 2024 joint government guidance on secure AI deployment says the person accountable for AI-system cybersecurity should be the same person accountable for the organization’s cybersecurity generally. That is a deceptively strong recommendation. It means the AI team does not get to run a shadow security model with separate assumptions, separate exceptions, and separate exposure decisions just because the workload is “innovative.” If the AI system touches sensitive data, external inputs, or automated actions, it belongs inside the same control discipline as the rest of the estate. (U.S. Department of War)
The second pillar is lifecycle threat modeling. NCSC’s secure-AI-development guidance makes threat modeling a core design activity, and the CISA-NSA deployment guidance tells organizations to require the primary developer to provide a threat model for the system. That matters because AI threats cross boundaries that normal application teams often split across different owners: data provenance, model provenance, retrieval quality, plugin trust, output handling, runtime permissions, and post-deployment monitoring. If nobody models those interactions, everyone assumes someone else is doing it. (NCSC)
The third pillar is access control. This sounds ordinary until you apply it to modern agentic systems. The least-privilege model for an AI assistant is almost never the least-privilege model for the person using it. If the agent can browse, run tools, retrieve internal documents, or call APIs, then the permissions it receives should be scoped explicitly for the workflow, not inherited casually from a broad user role. IBM’s 2025 breach data is especially useful here because it ties AI-related incidents to missing AI access controls and weak governance, while government deployment guidance explicitly tells organizations to enforce strict API security and defense in depth. (IBM)
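One way to make "scoped explicitly for the workflow" concrete is a deny-by-default tool allowlist per agent workflow. The sketch below is illustrative — the workflow and tool names are assumptions, not a real product's API:

```python
# Hypothetical task-scoped permission model for AI agents: each
# workflow gets an explicit tool allowlist rather than inheriting
# the human user's broad role. Deny by default.
ALLOWED_TOOLS = {
    "triage-assistant": {"search_logs", "lookup_ticket"},    # read-only
    "remediation-agent": {"search_logs", "restart_service"}, # scoped write
}

def authorize_tool_call(workflow: str, tool: str) -> bool:
    """Allow a tool call only if the workflow's scope names it.
    Unknown workflows get no tools at all."""
    return tool in ALLOWED_TOOLS.get(workflow, set())

print(authorize_tool_call("triage-assistant", "search_logs"))
print(authorize_tool_call("triage-assistant", "restart_service"))
```

The design point is the default: an agent that hits this check with an unlisted tool fails closed, which is exactly the opposite of inheriting a user role where everything not forbidden is allowed.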
Fourth comes data discipline. The 2025 AI Data Security guidance from NSA, FBI, CISA, and partners says data security spans the AI system lifecycle and warns that if an attacker can manipulate the data, the attacker can also manipulate the decision logic of an AI-based system. That is the correct mental model. Training data, fine-tuning data, retrieval corpora, prompt templates, tool outputs, evaluation datasets, and feedback loops are not passive inputs. They are parts of the system’s behavior. Security teams that protect the model weights while ignoring the retrieval store or prompt cache are protecting the shell and neglecting the nervous system. (U.S. Department of War)
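A small piece of that data discipline can be mechanical: pin a content hash for every document in a retrieval corpus so silent tampering — the manipulation risk the joint guidance describes — becomes detectable. This is a generic integrity sketch, not a feature of any specific retrieval product:

```python
import hashlib

# Fingerprint retrieval documents so poisoned or edited content
# fails verification before it enters a model's context.
def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Manifest built at ingestion time (doc_id -> expected hash).
manifest = {"runbook-001": fingerprint("Rotate credentials quarterly.")}

def verify(doc_id: str, text: str, manifest: dict) -> bool:
    """True only if the document still matches its pinned hash."""
    return fingerprint(text) == manifest.get(doc_id)

print(verify("runbook-001", "Rotate credentials quarterly.", manifest))
print(verify("runbook-001", "Rotate credentials never.", manifest))
```

Hashing does not solve provenance on its own — it only proves the corpus has not changed since someone vouched for it — but that is precisely the property a poisoning attacker needs to defeat quietly.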
The fifth pillar is validation before and after deployment. Government guidance says AI systems should be validated before deployment; NIST’s GenAI profile emphasizes pre-deployment testing and incident disclosure as primary considerations; MITRE’s Secure AI work highlights the need for verifiable AI vulnerability discovery and AI red teaming. Put together, those sources imply something straightforward: it is not enough to evaluate model quality, user delight, or latency. A real AI security program needs pre-release adversarial testing, post-release incident review, prompt-injection and output-handling tests, dependency review, exposure scanning, and continuous validation as models, connectors, and workflows change. (U.S. Department of War)
The sixth pillar is logging and observability that actually reflect the AI workflow. Traditional application logs are necessary, but they are not sufficient. Teams need to know which prompt or retrieved content produced a given action, which model version answered, which tool calls were attempted, which identity was used, which policy checks passed or failed, and what external content entered the context. IBM’s emphasis on visibility into all AI deployments, including shadow AI, and better observability for vulnerabilities and anomalies is important here because AI systems often fail through context corruption and permission misuse, not only through process crashes or obvious alerts. (IBM)
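As a sketch of what "logs that reflect the AI workflow" could mean in practice, the record below carries enough lineage to answer which model version acted, under which identity, with which tool, and what external content entered the context. The field names are illustrative assumptions, not a standard schema:

```python
import json
import uuid
import datetime

# One structured audit line per agent action, shippable to a SIEM.
def audit_record(model_version, identity, tool, policy_passed, context_sources):
    return {
        "event_id": str(uuid.uuid4()),
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,      # which model answered
        "identity": identity,                # which identity acted
        "tool_call": tool,                   # which tool was attempted
        "policy_passed": policy_passed,      # which policy checks passed
        "context_sources": context_sources,  # external content in context
    }

rec = audit_record("llm-2025-06", "svc-agent-01", "search_logs",
                   True, ["kb://runbook-001"])
line = json.dumps(rec)  # emit as one structured log line
```

The value shows up at investigation time: when an agent misbehaves through context corruption rather than a crash, this record is what lets an analyst walk backward from the action to the retrieved content that induced it.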
Seventh, organizations need to stop treating offensive validation as separate from AI governance. If a team is deploying AI agents that can act in business systems, then continuous validation is no longer a “red team luxury.” It becomes part of change control. The same way cloud teams learned that posture dashboards are not enough without attack-path testing, AI teams are learning that model evaluations are not enough without workflow-abuse testing. In that sense, AI security is forcing a merger between application security, identity security, data governance, and offensive validation.
The following operating model is a useful way to organize work.
| Program area | Minimum expectation | Strong expectation |
|---|---|---|
| Governance | AI inventory, assigned owner, policy coverage | Central review of agents, tools, connectors, and shadow AI |
| Identity | SSO and MFA for consoles, scoped tokens | Task-scoped permissions, non-human identity controls, passkeys where relevant |
| Data | Classification and access controls | Provenance tracking, retrieval-store review, prompt and output retention policy |
| Supply chain | Dependency review and patching | Model registry controls, signed artifacts where possible, connector risk review |
| Validation | Pre-deployment test and config review | Adversarial testing, continuous retest, workflow abuse validation |
| Observability | App logs and API logs | Model versioning, prompt-action lineage, policy and tool-call audit trails |
That table is synthesis, but it is a direct extension of the control themes repeated by NIST, CISA-NSA guidance, IBM’s breach research, OWASP’s LLM risks, and MITRE’s threat-informed AI work. The point is not to invent a new bureaucracy around AI. It is to stop letting AI bypass the disciplines that already exist for high-risk systems. (NIST)

Detection content that security teams can actually use
The broad advice above only matters if it turns into controls and detections. For many teams, the fastest win is to start watching the AI control plane like any other sensitive management surface. That means model-pull endpoints, model-create routes, plugin or tool-registration changes, external calls from model-serving processes, and shell execution spawned by serving runtimes or orchestration layers.
A practical KQL example for spotting suspicious external access to AI management endpoints might look like this:
```kql
// Detect internet-sourced access to sensitive AI model management routes
AzureDiagnostics
| where RequestUri_s has_any ("/api/pull", "/api/create", "/api/delete", "/api/tags")
| where not(ipv4_is_in_any_range(ClientIP_s, dynamic([
    "10.0.0.0/8",
    "172.16.0.0/12",
    "192.168.0.0/16"
])))
| summarize count(), FirstSeen=min(TimeGenerated), LastSeen=max(TimeGenerated)
    by ClientIP_s, RequestUri_s, UserAgent_s, Resource, Host_s
| order by count_ desc
```
That query is intentionally simple. It does not “solve” AI security. It gives defenders a starting point for a class of risk that recent Ollama-related CVEs make painfully concrete: model-management endpoints and adjacent auth flows are sensitive. They should not be treated like harmless convenience APIs. The exact route names and log schema will vary, but the detection pattern is durable: isolate control-plane actions, distinguish internal from external origin, and investigate anything unusual quickly. The reasoning behind that detection is consistent with the recent NVD records for Ollama control-plane issues and the government guidance on strict access controls and API security for AI systems. (NVD)
For endpoint or EDR telemetry, a second useful pattern is process-spawn detection around model-serving software. In many environments, a model server should not suddenly start launching shells, downloaders, or script interpreters.
```yaml
title: Suspicious Child Process Spawned by AI Serving Runtime
id: 4d89c0f0-f8e2-4b1d-8f41-c3a9e17b5d20
status: experimental
logsource:
    category: process_creation
detection:
    parent_image:
        ParentImage|endswith:
            - '\ollama.exe'
            - '/ollama'
            - '\python.exe'
            - '/python'
            - '/vllm'
    child_image:
        Image|endswith:
            - '\cmd.exe'
            - '\powershell.exe'
            - '/bash'
            - '/sh'
            - '/curl'
            - '/wget'
    condition: parent_image and child_image
fields:
    - Image
    - ParentImage
    - CommandLine
    - ParentCommandLine
level: high
```
This rule is also basic by design. You would tune it for your own runtime, packaging style, and expected behavior. The point is to acknowledge that once AI runtimes are in the environment, they deserve the same child-process scrutiny you already apply to office apps, web servers, scripting hosts, and remote-management tools. That mindset becomes more important, not less, as models gain tool use and orchestration capabilities.
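The same parent/child heuristic can be expressed outside Sigma for teams that post-process raw process-creation telemetry. This sketch assumes Linux-style paths and illustrative field names; it is a starting point to tune, not a finished detection:

```python
# Flag shells or downloaders spawned by a model-serving runtime.
SERVING_PARENTS = ("ollama", "vllm", "python")
SUSPICIOUS_CHILDREN = {"cmd.exe", "powershell.exe", "bash", "sh", "curl", "wget"}

def suspicious(event: dict) -> bool:
    """True when a serving-runtime parent launches a shell or downloader.
    Expects Linux-style '/' separated paths in the event fields."""
    parent = event["parent_image"].rsplit("/", 1)[-1].lower()
    child = event["image"].rsplit("/", 1)[-1].lower()
    return (any(parent.startswith(p) for p in SERVING_PARENTS)
            and child in SUSPICIOUS_CHILDREN)

print(suspicious({"parent_image": "/usr/local/bin/ollama",
                  "image": "/bin/bash"}))
print(suspicious({"parent_image": "/usr/sbin/nginx",
                  "image": "/bin/sh"}))
```

In environments where Python legitimately spawns shells, the `python` entry will be noisy; scoping it to the specific serving process names you actually run is the first tuning pass.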
Where offensive validation fits, and why it matters more in the AI era
The hardest lesson in modern security is that observability is not proof. A model can look aligned in testing and still be dangerous in a real workflow. A dashboard can show low critical-vulnerability counts while a high-value identity path remains exploitable. A GenAI assistant can pass functional QA and still be vulnerable to indirect prompt injection through retrieved content. That is why AI in cyber security is now pulling offensive validation closer to day-to-day engineering. Defenders need ways to verify not only whether a model answers safely, but whether the surrounding workflow resists abuse under real conditions. (OWASP Gen AI Security Project)
This is also where a platform like Penligent fits naturally. Penligent’s recent public technical writing keeps returning to one idea: the value of AI is not in summarizing vulnerabilities beautifully, but in handling the hard middle of security work — preserving context, reasoning about attack paths, validating exploitability, and producing evidence another engineer can reproduce. Its “AI in Security” and “AI Pentest Tool” articles both argue that the future belongs to systems that move from raw signal to defensible proof, not to scanners with a conversational wrapper. That is a fair way to describe the next phase of the market. (Penligent)
There is also a natural bridge between threat intelligence and offensive validation. Penligent’s article on threat-intelligence knowledge graphs with LLMs argues that LLM-driven graph construction can help connect threats, vulnerabilities, and TTPs faster, but also says that intelligence alone is not enough and needs to be paired with penetration testing and continuous validation. That logic is stronger than it may look at first glance. The more AI helps defenders compress raw information into hypotheses, the more important it becomes to test whether those hypotheses correspond to real, reachable risk in the environment. (Penligent)
The practical implication is simple. If your organization is using AI to prioritize risk, it should also use repeatable validation to prove which risks are reachable. If your organization is deploying AI agents that can act, it should test those workflows adversarially. And if your organization is buying AI for the SOC, it should make sure the output of that system connects to changes that can actually be verified, remediated, and retested.
What security leaders should do next
Most organizations do not need a moonshot AI-security program in the next quarter. They need a disciplined one. First, inventory every AI system that is deployed, piloted, or quietly used through shadow workflows. Include chat assistants, copilots, retrieval systems, code assistants, local model servers, inference clusters, browser agents, and anything that can call tools or touch sensitive data. IBM’s 2025 breach findings make clear that weak visibility and governance are still basic failure points. (IBM)
Second, separate “AI used for security” from “AI that must be secured.” Put them on different review tracks if necessary, but do not merge the risk assumptions. A SecOps assistant that reads logs has a different threat model from a model-serving cluster exposed to internal developers, and both are different from an autonomous agent with tool access into production systems. NIST’s Cyber AI Profile is helpful precisely because it forces this three-way distinction among AI-enabled defense, AI-enabled attack, and the security of AI systems themselves. (NCCoE)
Third, raise the standard for access control and network exposure around AI infrastructure immediately. Recent NVD records around Ollama and vLLM show why. Assume model-management APIs, orchestration layers, agent runtimes, and inference backplanes are security-critical services. Review which ones are reachable, how they authenticate, what they can pull, what they can execute, and which identities they inherit. (NVD)
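That exposure review can start as something as simple as a scripted pass over a service inventory. The sketch below flags AI control-plane services that are reachable beyond localhost without authentication; the inventory format and service names are assumptions for illustration:

```python
# Illustrative exposure review: an AI service reachable off-host
# with no authentication is a critical finding.
services = [
    {"name": "ollama-api",      "bind": "0.0.0.0",   "auth": False},
    {"name": "vllm-backplane",  "bind": "127.0.0.1", "auth": False},
    {"name": "model-registry",  "bind": "0.0.0.0",   "auth": True},
]

def risky(svc: dict) -> bool:
    """Reachable beyond loopback AND unauthenticated."""
    return svc["bind"] != "127.0.0.1" and not svc["auth"]

findings = [s["name"] for s in services if risky(s)]
print("critical exposure:", findings)
```

A real review would also cover what each service can pull, what it can execute, and which identities it inherits, but the unauthenticated-and-exposed combination is the one the recent Ollama records show attackers find first.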
Fourth, test the system that exists, not the model you intended to deploy. Prompt injection, excessive agency, system prompt leakage, and output-handling issues do not show up only in model benchmarks. They emerge when the model meets retrieval, tools, secrets, users, and production data. That is why OWASP’s 2025 risk list and NIST’s GenAI profile are still so useful: they anchor security work in how AI applications behave in context, not in abstract model capability alone. (OWASP Gen AI Security Project)
Finally, do not let AI become an excuse for abandoning fundamentals. Attackers are using AI. That is true. But the most damaging AI-assisted attacks still lean on exposed interfaces, poor authentication, credential theft, weak segmentation, unsafe software patterns, and incomplete monitoring. AI changes the tempo and the shape of the fight. It does not repeal security engineering.

