CVE-2025-4517 PoC Without Weaponizing It: Proving the Tarfile Extraction Boundary Breaks in Real Automation

Why this CVE shows up in real pipelines more than people expect

When engineers search for “cve-2025-4517 poc”, they’re rarely looking for a party trick. They’re trying to answer a very specific operational question:

“If someone hands my automation a tarball, can it make my process write files outside the directory I thought was safe?”

That is exactly the class of bug CVE-2025-4517 represents: an extraction-boundary failure in Python’s standard library tarfile module when extracting untrusted tar archives with TarFile.extractall() or TarFile.extract() using the filter= parameter set to "data" or "tar". The NVD record is explicit about the affected condition and version range: only Python 3.12 and later are affected, because earlier versions do not include the extraction filter feature. (NVD)

If you’ve been around long enough, you’ll recognize the family resemblance: this is the “Zip Slip” idea applied to tar semantics, with modern twists like symlinks and realpath behavior. But the reason this one keeps appearing in serious conversations is not because it’s exotic. It’s because tar extraction is everywhere:

CI/CD jobs fetching and unpacking source artifacts
internal build tools that unpack plugin bundles
ML workflows unpacking datasets, model weights, or cached artifacts
dependency tools handling sdists and tar-based distributions
“agentic” automation that downloads, unpacks, and processes archives at speed

The security failure mode isn’t complicated: you thought “extracting to /tmp/job-123/” meant the archive contents could only land inside that directory. CVE-2025-4517 is about breaking that assumption under certain extraction patterns. (NVD)
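
As a first triage step, the version condition above can be checked from inside the interpreter a pipeline actually uses. The sketch below is illustrative only: the function name tarfile_filter_status is made up for this example, and the 3.14 default-filter detail comes from the Python tarfile documentation rather than from this article. It reports whether the interpreter falls in the range NVD describes; it cannot tell you whether that interpreter has already been patched.

```python
import sys
import tarfile


def tarfile_filter_status() -> dict:
    """Summarize the facts that decide whether the NVD trigger condition can apply."""
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        # NVD: only Python 3.12 and later are in the affected range.
        "in_nvd_range": sys.version_info >= (3, 12),
        # Extraction filters exist on this interpreter if data_filter is present.
        "has_extraction_filters": hasattr(tarfile, "data_filter"),
        # Per the tarfile docs, 3.14 made "data" the default filter, so calls
        # that omit filter= also go through the data filter there.
        "default_filter_is_data": sys.version_info >= (3, 14),
    }


if __name__ == "__main__":
    for key, value in tarfile_filter_status().items():
        print(f"{key}: {value}")
```

Running this in each CI base image gives a quick inventory of which runners are even candidates for the issue; patch status still has to come from the vendor or distro advisory.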

What people actually click in SERPs for this CVE

Google Search Console click-through numbers are not available for this analysis, but the phrasing that dominates high-signal coverage, and that repeats across vendor databases and advisories, can be inferred with reasonable confidence. The titles and snippets that appear to win clicks from engineers cluster around these patterns:

“Python tarfile path traversal” (clear class name, instantly scannable) (SentinelOne)
“arbitrary file write” (high-severity wording, immediately communicates impact) (Red Hat Customer Portal)
“TarFile.extractall filter=data/tar” (precise trigger; engineers click because it matches their code) (NVD)
“supply chain risk / CI/CD” (why it matters beyond a local script) (Linux Security)
“mitigation / safe extraction / how to check” (actionable, operational intent) (Gist)

Those phrases map cleanly to user intent. Most readers don’t want drama. They want a proof they can run safely, then a remediation plan that doesn’t break pipelines.

This article is written to match that intent: defensive validation, exposure triage, mitigation, and monitoring, without turning into an exploit drop.

Ground truth: what CVE-2025-4517 is and what it is not

What it is:

A vulnerability in CPython’s tarfile module where extracting an untrusted tar archive using TarFile.extractall() or TarFile.extract() with filter="data" or filter="tar" can allow archive members to result in filesystem actions outside the intended destination directory (i.e., beyond the extraction boundary). The NVD summary and affected-condition text are the canonical baseline. (NVD)

Red Hat’s CVE entry characterizes the issue in the same practical terms: a flaw in CPython tarfile allowing writes outside the extraction directory when extracting untrusted archives. (Red Hat Customer Portal)

Google’s security advisory (GHSA) for a tarfile realpath-related issue describes the tested impact as allowing file reads and writes outside the destination path on Linux and macOS—consistent with the same extraction-boundary failure class engineers care about. (GitHub)

What it is not:

It is not automatically “remote code execution” by itself.
It does not magically grant privileges you don’t already have.
The impact is bounded by the permissions of the process doing the extraction.

But that “bounded by process permissions” line is exactly why it matters: your CI runner often has write permissions to places you really don’t want an attacker controlling.
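
To make "bounded by process permissions" concrete, it helps to list what the extracting process can actually write to. The following is a minimal sketch: the function name and the candidate paths are hypothetical examples, not locations named by any advisory, and os.access only reflects the real user and group IDs at the moment of the call.

```python
import os
import tempfile
from pathlib import Path


def writable_by_this_process(candidates: list[str]) -> dict[str, bool]:
    """Report which existing paths the current process is allowed to write to."""
    report = {}
    for raw in candidates:
        path = Path(raw).expanduser()
        # A missing path is reported as not writable rather than raising.
        report[str(path)] = path.exists() and os.access(path, os.W_OK)
    return report


if __name__ == "__main__":
    # Hypothetical locations worth checking on a runner; adjust to your environment.
    candidates = [tempfile.gettempdir(), "~", "/etc", "/usr/local/bin"]
    for location, writable in writable_by_this_process(candidates).items():
        label = "WRITABLE" if writable else "not writable/missing"
        print(f"{label:<22} {location}")
```

Run it as the same user, and in the same container, as the extraction step. Anything reported as writable outside the job workspace is part of the blast radius of a boundary break.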

The practical risk model: how boundary breaks become real incidents

Think in terms of what an attacker can do if they can cause controlled writes outside your extraction directory, as the same user your job is running under.

Here are realistic outcomes that do not require fantasy privileges:

Configuration poisoning: Overwriting config files that your pipeline reads later in the same job. This can redirect outputs, alter build flags, or change “where artifacts are published.”
Workspace contamination: Writing into your repo workspace or build cache to influence what gets built or tested.
Credential or token targeting: Dropping files in locations that get automatically read, uploaded, or cached.
Persistence inside long-lived runners: On shared or misconfigured runners, file writes can plant state that impacts subsequent jobs.

You don’t need to imagine a single-click RCE chain. In supply chain security, small primitives compose.

This is also why some vulnerability databases mention the supply-chain angle explicitly: tar archives are a packaging and distribution primitive, and many automation systems unpack them routinely. (Linux Security)

The defensive PoC philosophy: prove exposure, don’t publish a weapon

This article includes a PoC. Here’s the line we’ll hold:

We will build a benign archive that attempts to write a harmless marker file outside the destination directory using traversal-like member names.
We will extract into a temporary directory under a controlled test root.
We will only check whether a file ends up outside the intended directory—no payloads, no persistence locations, no sensitive paths, no bypass tricks.

That gives defenders what they need: a reproducible “yes/no” proof.

Defensive PoC: safe boundary-break validation script

What this does

Creates a test root like ./cve_2025_4517_lab/
Creates an extraction destination ./cve_2025_4517_lab/dest/
Creates a tarball containing a member name that attempts to escape into ./cve_2025_4517_lab/escaped/
Runs tarfile.extractall(..., filter="data") and checks whether the marker appears outside dest/

Note: The NVD record specifically calls out extraction with filter="data" or filter="tar" as the affected condition. (NVD)

#!/usr/bin/env python3
"""
Defensive validation for CVE-2025-4517-style extraction boundary failures.

Goal:
- Safely test whether tar extraction can write outside the intended destination directory
  under the affected extraction patterns (extractall/extract + filter="data"/"tar").

This script:
- Uses a local lab directory only
- Writes a harmless marker file
- Avoids sensitive paths and any weaponization
"""

import io
import tarfile
from pathlib import Path

LAB = Path("cve_2025_4517_lab").resolve()
DEST = LAB / "dest"
ESCAPED = LAB / "escaped"
TAR_PATH = LAB / "test.tar"

MARKER_NAME = "../escaped/marker.txt"  # boundary escape attempt
MARKER_CONTENT = b"boundary-check\n"

def reset_lab():
    # Recreate the lab directories and remove leftovers from earlier runs
    for p in [DEST, ESCAPED]:
        p.mkdir(parents=True, exist_ok=True)
    if TAR_PATH.exists():
        TAR_PATH.unlink()
    marker = ESCAPED / "marker.txt"
    if marker.exists():
        marker.unlink()

def build_tar():
    # Build a tar with a traversal-like member name that tries to escape DEST into ESCAPED
    with tarfile.open(TAR_PATH, "w") as tf:
        ti = tarfile.TarInfo(name=MARKER_NAME)
        ti.size = len(MARKER_CONTENT)
        # Keep permissions boring
        ti.mode = 0o644
        tf.addfile(ti, fileobj=io.BytesIO(MARKER_CONTENT))

def extract_with_filter(filter_value: str):
    # Extract into DEST with a specified filter
    with tarfile.open(TAR_PATH, "r") as tf:
        tf.extractall(path=DEST, filter=filter_value)

def check_result():
    # Report whether the marker landed outside DEST, inside DEST, or nowhere
    escaped_marker = ESCAPED / "marker.txt"
    in_dest = DEST / "marker.txt"
    return escaped_marker.exists(), in_dest.exists(), escaped_marker

if __name__ == "__main__":
    results = {}
    for f in ["data", "tar"]:
        try:
            # Start each filter mode from a clean lab and a fresh archive
            reset_lab()
            build_tar()
            extract_with_filter(f)
            escaped, in_dest, escaped_path = check_result()
            results[f] = {"escaped": escaped, "in_dest": in_dest, "escaped_path": str(escaped_path)}
        except Exception as e:
            # A refusal by the filter shows up here and is recorded, not hidden
            results[f] = {"error": repr(e)}

    print("Lab root:", LAB)
    print("Results:")
    for k, v in results.items():
        print(f"  filter={k}: {v}")

How to interpret outputs

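If escaped=True, you have direct evidence that the extraction boundary failed for that filter mode in your environment.
If an exception is raised, that may still be “good news” depending on the nature of the exception: some mitigations intentionally hard-fail on unsafe members. A tarfile.OutsideDestinationError, for example, means the filter refused the member, which is the documented behavior for a plain ../ member name.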

This is the minimal “proof of risk” that satisfies most security reviews: a controlled file-write outside the intended directory.

Find it in your codebase: what to grep and what to review

Most teams discover exposure because an engineer greps their build tooling and finds tar extraction sprinkled everywhere. Start with this:

# Direct tarfile usage
rg -n "import\s+tarfile|tarfile\.open\(|extractall\(|extract\(" .

# Focus on the affected trigger patterns called out by NVD
rg -n "extractall\([^)]*filter\s*=\s*[\"'](data|tar)[\"']" .
rg -n "extract\([^)]*filter\s*=\s*[\"'](data|tar)[\"']" .

Then classify each finding into one of these buckets:

Bucket	Typical location	Risk level	What to do
Untrusted input	CI artifact downloads, plugin bundles, “import dataset”, agent downloads	High	Replace with safe extraction wrapper + sandbox/least privilege
Trusted internal artifacts	build outputs signed/attested, internal release system	Medium	Add signature/attestation checks + still use safe extraction
Local dev utilities	scripts run by engineers manually	Low to medium	Still patch, but prioritize CI paths first

This is the “why engineers click ‘how to check if vulnerable’” part: the real risk is almost always in automation.
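
Regex searches miss calls that span several lines. A syntax-aware pass is one way to complement them; the sketch below uses Python's ast module to find extract/extractall calls that pass a literal filter="data" or filter="tar". The function name is hypothetical, and the approach has known gaps: it matches any object's extract/extractall method, and it does not see filters passed through a variable or set via the TarFile.extraction_filter attribute.

```python
import ast

AFFECTED_FILTERS = {"data", "tar"}


def find_filtered_extractions(source: str, filename: str = "<string>") -> list[tuple[int, str, str]]:
    """Return (line, method, filter) for extract/extractall calls with a literal filter."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        # Only method-style calls such as tf.extractall(...)
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in {"extract", "extractall"}:
            continue
        for kw in node.keywords:
            # Keep keyword arguments of the form filter="data" / filter="tar"
            if kw.arg == "filter" and isinstance(kw.value, ast.Constant):
                if kw.value.value in AFFECTED_FILTERS:
                    findings.append((node.lineno, node.func.attr, kw.value.value))
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "import tarfile\n"
        "with tarfile.open('bundle.tar') as tf:\n"
        "    tf.extractall(\n"
        "        'out',\n"
        "        filter='data',\n"
        "    )\n"
    )
    for line, method, filter_name in find_filtered_extractions(sample):
        print(f"line {line}: {method}(filter={filter_name!r})")
```

Feeding each .py file in a repository through this function gives a list of call sites to sort into the buckets above. Treat an empty result as "nothing found by this heuristic", not as proof of absence.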

Mitigation: what actually fixes the class of bug

NVD’s record points engineers toward the tarfile extraction filter documentation and clarifies the affected condition. (NVD)

In practice, you want layered mitigation:

Upgrade Python / consume distro patches: Track remediation via your platform vendor (e.g., Red Hat) or your base images. Red Hat’s CVE page is a common anchor for enterprise patch status. (Red Hat Customer Portal) Also expect scanners (e.g., Tenable/Nessus) to flag patched packages at the distro layer. (Tenable®)
Never extract untrusted tarballs with raw extractall: Even if patched, archive extraction is a historically sharp edge. Make “safe extraction” a policy, not a one-off fix.
Use a safe extraction wrapper that enforces a realpath prefix check: If you can’t upgrade immediately, this is your “band-aid that actually works.”

Here is a hardened extraction approach you can drop into automation. It denies members whose final resolved path escapes the destination:

from __future__ import annotations
import os
import tarfile
from pathlib import Path

class UnsafeTarMember(Exception):
    pass

def safe_extract_tar(tf: tarfile.TarFile, dest: str | os.PathLike) -> None:
    dest_path = Path(dest).resolve()

    # Validate every member before anything is written to disk
    members = tf.getmembers()
    for member in members:
        member_name = member.name

        # Reject absolute paths early
        if Path(member_name).is_absolute():
            raise UnsafeTarMember(f"absolute path not allowed: {member_name}")

        target_path = (dest_path / member_name).resolve()

        # Enforce that the extracted path stays under dest_path.
        # Comparing path components (not string prefixes) also accepts a "./"
        # member, which resolves to the destination directory itself.
        if target_path != dest_path and dest_path not in target_path.parents:
            raise UnsafeTarMember(f"path escape attempt: {member_name} -> {target_path}")

        # Optional: tighten on symlinks/hardlinks depending on your threat model
        if member.issym() or member.islnk():
            # Many real-world archive exploits rely on link tricks
            raise UnsafeTarMember(f"links not allowed: {member_name}")

    # If everything passes, extract only the members that were checked.
    # filter="data" is kept as a second layer; it needs a Python with extraction filters.
    tf.extractall(path=dest_path, members=members, filter="data")

You’ll notice we also block symlinks/hardlinks by default. That’s opinionated, but it matches how teams treat untrusted archives in CI. If you need links for legitimate use, you can allow them after additional checks, but do it consciously.

Run extraction with least privilege: Even a “file write outside dest” is much less scary if the process can’t write anywhere sensitive.

Monitoring: how to catch the bad outcomes you actually care about

A lot of security guidance stops at “upgrade and move on.” But in large orgs, you’ll want detection too—because the most realistic failure is “someone forgot one pipeline.”

What to monitor

Unexpected file writes outside designated working directories during build or import steps
Extraction errors that indicate blocked unsafe members (these become a signal of probing)
Creation of files in “should never change during build” locations

Here’s a simple operational pattern:

Define an allowlist of writable roots for CI jobs: workspace/, tmp/, cache/
Alert on writes outside these during steps that perform extraction
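
The allowlist pattern above can be sketched as a small check over file-write paths taken from whatever telemetry you already collect. The function name and the example paths are hypothetical. The comparison uses os.path.commonpath so that a sibling such as workspace-old is not mistaken for a child of workspace, and it assumes all paths live on the same filesystem root (on Windows, mixing drives would raise ValueError).

```python
import os


def outside_allowlist(written_paths: list[str], allowed_roots: list[str]) -> list[str]:
    """Return the written paths that do not fall under any allowlisted root."""
    roots = [os.path.realpath(root) for root in allowed_roots]
    flagged = []
    for raw in written_paths:
        # realpath collapses ../ segments and follows symlinks that exist.
        real = os.path.realpath(raw)
        # commonpath compares whole path components, not string prefixes.
        if not any(os.path.commonpath([real, root]) == root for root in roots):
            flagged.append(raw)
    return flagged


if __name__ == "__main__":
    # Hypothetical writable roots and observed writes for one CI job.
    roots = ["/builds/workspace", "/builds/tmp", "/builds/cache"]
    observed = [
        "/builds/workspace/src/app.py",
        "/builds/workspace/../secrets/token",
        "/builds/workspace-old/leftover.txt",
        "/builds/cache/pip/wheel.whl",
    ]
    for path in outside_allowlist(observed, roots):
        print("ALERT: write outside allowlist:", path)
```

Returning the original, unresolved path in the alert keeps the evidence intact: a ../ segment in the reported path is itself a useful signal.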

Minimal SIEM field mapping table

Telemetry source	Useful fields	What to flag
Endpoint process/file events	process name, command line, file path, parent process	Python process writing outside workspace shortly after reading a `.tar`
CI job logs	step name, artifact URL, extraction path, error text	“blocked member”, “path escape”, unexpected files created
Container runtime logs	mount points, writes to mounted volumes	Writes to mounted secrets/config volumes during extract steps
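
For the CI job log row, a first-pass detector can be as simple as a pattern match over log lines. This sketch assumes your logs contain either the messages raised by the safe extraction wrapper shown earlier in this article or the class names of the standard tarfile filter exceptions; the function name is hypothetical, and the pattern list should be extended with whatever your own tooling prints.

```python
import re

# Message fragments from this article's wrapper plus stdlib tarfile filter
# exception names. Extend this list to match your own tooling's wording.
BLOCKED_MEMBER_PATTERN = re.compile(
    r"path escape attempt|links not allowed|absolute path not allowed|"
    r"OutsideDestinationError|LinkOutsideDestinationError|"
    r"AbsolutePathError|AbsoluteLinkError|SpecialFileError"
)


def flag_log_lines(lines: list[str]) -> list[tuple[int, str]]:
    """Return (line_number, text) for log lines that look like blocked members."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if BLOCKED_MEMBER_PATTERN.search(line)
    ]


if __name__ == "__main__":
    log = [
        "step=unpack artifact=bundle.tar dest=/builds/workspace/vendor\n",
        "UnsafeTarMember: path escape attempt: ../x -> /builds/x\n",
        "step=unpack status=failed\n",
    ]
    for number, text in flag_log_lines(log):
        print(f"log line {number}: {text}")
```

A hit is a reason to look at where the archive came from, since a blocked member in an artifact you did not expect to be hostile is the probing signal described above.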

This kind of detection gives you confidence you’re not just “patched,” but actually safe in practice.

Related CVEs and why you should treat archives as an attack surface

Archive parsing vulnerabilities are a repeating story. Even when the exact bug differs, the pattern is stable:

“The archive member name isn’t just a name; it’s a filesystem operation.”
“Links and path normalization can turn a safe-looking extraction into a write-where-you-want primitive.”

If your security program already has a “Zip Slip” mental model, CVE-2025-4517 is the tarfile version of the same lesson, made more relevant by modern automation.
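
One way to act on "a member name is a filesystem operation" is to look at the operations before performing any of them. The sketch below lists each member's name, type, and link target without extracting; describe_members and the demo archive are made up for this example, and the ".." and leading-slash flags are simple string checks, not a substitute for the resolved-path checks described in the mitigation section.

```python
import io
import tarfile


def describe_members(archive: io.BytesIO) -> list[dict]:
    """List each member's name, type, and link target without extracting anything."""
    rows = []
    with tarfile.open(fileobj=archive, mode="r") as tf:
        for member in tf.getmembers():
            if member.issym():
                kind = "symlink"
            elif member.islnk():
                kind = "hardlink"
            elif member.isdir():
                kind = "dir"
            elif member.isreg():
                kind = "file"
            else:
                kind = "other"  # devices, FIFOs, and similar special members
            rows.append({
                "name": member.name,
                "kind": kind,
                "linkname": member.linkname or None,
                "has_dotdot": ".." in member.name.split("/"),
                "is_absolute": member.name.startswith("/"),
            })
    return rows


if __name__ == "__main__":
    # Build a small in-memory archive to inspect.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tf:
        readme = tarfile.TarInfo(name="docs/readme.txt")
        readme.size = 2
        tf.addfile(readme, io.BytesIO(b"hi"))
        link = tarfile.TarInfo(name="current")
        link.type = tarfile.SYMTYPE
        link.linkname = "../shared"
        tf.addfile(link)
    buffer.seek(0)
    for row in describe_members(buffer):
        print(row)
```

A listing like this is cheap to attach to a ticket or CI log, and it makes link members and traversal-style names visible to reviewers who never open the archive themselves.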

For additional context on archive and extraction issues as a broad theme, the Alpha-Omega archive security paper discusses multiple archive-related vulnerabilities and patterns across ecosystems. (Alpha Omega)

If you treat CVE-2025-4517 as a pipeline risk rather than a one-off bug, the hard part becomes: “How do I prove I’m not exposed across dozens of repos and runners?”

That’s a workflow problem, not just a patch problem.

Penligent can plug into this kind of work in two ways:

First, during exposure triage, you can use an AI-assisted security workflow to inventory and prioritize where archive extraction happens in your automation, especially the code paths that match the NVD trigger pattern (extractall/extract with filter="data" or "tar"). The goal isn’t a vague “AI scan,” but a concrete set of findings you can hand to platform owners: where extraction occurs, what input trust boundary it assumes, and what privilege context it runs under.

Second, during defensive validation, you can standardize safe test harnesses like the boundary-check PoC above and run them against staging environments. Evidence-first reporting matters here: security reviews move faster when you can show “here is the file that landed outside the intended directory,” instead of arguing about hypotheticals.

Checklist you can hand to an on-call platform engineer

Identify Python 3.12+ usage in CI images and runners (start with base images and lockfiles). (NVD)
Find tarfile extraction call sites, especially extractall/extract with filter="data" or "tar". (NVD)
Run the defensive boundary-check PoC in a lab environment that mirrors CI.
Patch/upgrade via vendor channels where possible (track distro advisories). (Red Hat Customer Portal)
Add a safe extraction wrapper and deny links for untrusted archives.
Enforce least privilege and restrict writable directories in CI.
Add monitoring for out-of-workspace writes during extraction steps.

That sequence is what most teams actually do when they take this seriously.

References

https://www.penligent.ai/hackinglabs/cve-2025-4517-poc-without-weaponizing-it-defensive-validation-patch-lines-and-the-tarfile-trap-inside-automation/ (Penligent)

https://www.penligent.ai/hackinglabs/virustotal-in-incident-response-how-to-identify-malware-fast-and-pivot-without-leaking-data/ (Penligent)

https://www.penligent.ai/hackinglabs/cve-2026-2441-the-chrome-css-zero-day-that-starts-inside-the-sandbox-and-rarely-ends-there/ (Penligent)

https://nvd.nist.gov/vuln/detail/CVE-2025-4517 (NVD)

https://access.redhat.com/security/cve/cve-2025-4517 (Red Hat Customer Portal)

https://docs.python.org/3/library/tarfile.html#tarfile-extraction-filter (NVD)

https://github.com/google/security-research/security/advisories/GHSA-hgqp-3mmf-7h8f (GitHub)

https://gist.github.com/sethmlarson/52398e33eff261329a0180ac1d54f42f (Gist)

https://www.sentinelone.com/vulnerability-database/cve-2025-4517/ (SentinelOne)
