
CVE-2026-4372, the Transformers Config Bug That Broke Model Loading

A model load should not silently become a shell.

That is the practical lesson behind CVE-2026-4372, a Hugging Face Transformers vulnerability involving malicious model configuration, internal attention implementation fields, and a code path that could execute attacker-controlled Python code even when the user did not intentionally opt into remote model code. NVD describes the issue as a critical remote code execution vulnerability in Transformers versions before 5.3.0, triggered when a victim loads a crafted model with APIs such as AutoModelForCausalLM.from_pretrained(). The malicious repository can use config.json to set _attn_implementation_internal to an attacker-controlled Hugging Face Hub repository, causing the library to download and execute code from that repository with the victim process’s privileges. NVD also states that this path bypasses the intended protection of trust_remote_code.(NVD)

The wording matters. This is not just another “do not run untrusted code” warning. Hugging Face already has an explicit flag for that boundary. Transformers documentation says trust_remote_code defaults to False, and that users should only set it to True for repositories they trust and whose code they have read, because code from the Hub will execute locally.(Hugging Face) CVE-2026-4372 is serious because it undermined that mental model. A user could avoid enabling remote code and still hit a path where configuration influenced code loading.

The safer way to think about the issue is this: machine learning model repositories are software supply-chain inputs, not passive data blobs. A repository can contain weights, configuration, tokenizer files, custom Python modules, model cards, and metadata that changes how libraries behave. In modern AI systems, model loading often happens in CI jobs, notebooks, evaluation workers, fine-tuning pipelines, inference servers, and internal experimentation platforms. If those environments have API tokens, cloud credentials, mounted datasets, build secrets, or production network access, the difference between “load a model” and “execute attacker code” is operationally huge.

The core facts

| Item | Conservative reading |
| --- | --- |
| CVE | CVE-2026-4372 |
| Affected project | Hugging Face Transformers |
| Primary vulnerable behavior | A malicious config.json can influence internal kernel loading and lead to Python code execution |
| Key field called out by NVD | _attn_implementation_internal |
| API path called out by NVD | AutoModelForCausalLM.from_pretrained() |
| Security boundary affected | The expected protection of trust_remote_code=False |
| Fix | Upgrade to Transformers 5.3.0 or later |
| NVD status | NIST had not yet provided its own CVSS score when the NVD entry was captured, while the CNA score listed by NVD was 7.8 High |
| Weakness categories listed by NVD | CWE-502 and CWE-1066 |
| Publication timeline | NVD lists the CVE as published on May 24, 2026 and last modified on June 4, 2026 |

NVD’s affected version range is broad: versions before 5.3.0 are marked as vulnerable. Pluto Security’s technical analysis narrows the practical exploitation conditions, stating that the vulnerable path existed from Transformers 4.56.0 through the 5.2.x line and required the kernels package to be installed. That distinction is important for triage. Treat the official version boundary as the patching rule. Use the narrower technical condition to prioritize investigation and exposure review, not to delay upgrading.(NVD)

Why model loading is a security boundary

Transformers made model use simple for good reasons. The normal developer experience is intentionally clean: pick a model identifier, call from_pretrained(), let the library download the model weights and configuration, and start inference or fine-tuning. Hugging Face documentation describes from_pretrained() as the standard way to load pretrained weights and configuration from the Hub.(Hugging Face)

That convenience is also why the boundary is security-sensitive. Loading a model may involve more than parsing static tensors. It can pull files from a remote repository, deserialize metadata, choose architecture classes, initialize model components, select attention implementations, import custom modules, and cache artifacts locally. Each of those steps can become part of the trust chain.

The difference between a safe artifact and an executable software input is not obvious to every developer who touches AI workflows. A data scientist may see a model ID in a notebook and think the risk is mainly model quality. A platform engineer may see an evaluation job and think the risk is cost. A red teamer sees something else: a remote identifier passed into a loader that can change runtime behavior.

Hugging Face’s own documentation acknowledges this boundary. For custom models, the docs warn users to take extra precautions, note that Hub files are malware-scanned, and still tell users to avoid executing malicious code. They also recommend pinning a specific revision as an added security measure when loading models with custom code.(Hugging Face)

That advice remains correct, but CVE-2026-4372 shows why a single flag cannot be the only guardrail. A model-loading security design has to assume the configuration plane itself can be hostile. It should validate keys, constrain internal dispatch decisions, restrict sources, limit network access, and run with minimal secrets.

What made CVE-2026-4372 different

Many malicious model risks depend on a user explicitly agreeing to run model repository code. In the ordinary custom-code workflow, a user sets trust_remote_code=True, and that flag should represent a conscious decision to run Python code from a model repository.

CVE-2026-4372 is different because the attack path did not depend on that explicit decision in the way users would expect. NVD states that a crafted config.json could cause arbitrary Python code to be downloaded and executed from an attacker-controlled Hugging Face Hub repository while bypassing trust_remote_code.(NVD)

That breaks an important security promise at the usability layer. Security flags are not only technical controls; they are also communication tools. They tell developers when a behavior crosses a trust boundary. If a developer sees no trust_remote_code=True in a codebase, they may assume the pipeline never runs remote model code. CVE-2026-4372 shows why that assumption was not safe for affected Transformers versions.

The issue sits at the intersection of three design areas:

| Design area | Why it matters |
| --- | --- |
| Configuration deserialization | config.json controls model behavior and can influence internal attributes |
| Kernel dispatch | High-performance model paths can involve specialized kernels and dynamic loading logic |
| Trust signaling | Users rely on trust_remote_code to distinguish static model loading from remote code execution |

A vulnerability at this intersection is easy to underestimate. It does not look like a classic web RCE. There is no exposed /admin endpoint, no SQL injection, no obvious reverse shell in the vulnerable project. Instead, the exploit path hides in a common AI workflow: “load this model and run evaluation.”

How the vulnerable path worked

How a Malicious config.json Can Trigger Code Execution

The vulnerable behavior can be understood without turning it into a copy-paste exploit.

First, a malicious actor prepares a model repository. The important object is not necessarily the model weights. The important object is the configuration file. NVD specifically calls out the malicious use of config.json and the _attn_implementation_internal field.(NVD)

Second, a victim loads the model using a normal Transformers API. NVD names AutoModelForCausalLM.from_pretrained() as the relevant path. In real environments, the same pattern can appear inside notebooks, evaluation scripts, internal model registries, training pipelines, or third-party tools that wrap Transformers.(NVD)

Third, affected Transformers versions deserialize configuration in a way that lets the hostile field influence internal model behavior. The GitHub patch for the issue is revealing because it shows exactly what changed. Before the fix, configuration keyword arguments were generically applied to the configuration object. The patch adds a guard that avoids deserializing specific problematic internal fields, including _attn_implementation_internal and _experts_implementation_internal.(GitHub)

Fourth, the model-loading flow reaches kernel-loading logic. The patch also changed the kernel repository trust logic. The old matching behavior accepted a broader pattern for kernel repositories; the patched code restricts default trust to kernels-community and requires trust_remote_code for other repositories. A maintainer comment in the pull request states that the project kept default trust limited to its own published kernels and allowed others only with trust_remote_code.(GitHub)
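The two changes described in the third and fourth steps can be pictured with a minimal sketch. This is not the Transformers source and not the actual patch. The function names, the constants, and the assumption that a kernel repository is identified by a plain org/name string are all hypothetical, chosen only to illustrate the idea of skipping internal fields during deserialization and trusting a single organization by default.

```python
# Illustrative only: hypothetical names, not the Transformers implementation.
BLOCKED_CONFIG_FIELDS = {
    "_attn_implementation_internal",
    "_experts_implementation_internal",
}
TRUSTED_KERNEL_ORG = "kernels-community"


def apply_config_fields(config_obj, fields):
    """Set deserialized fields on a config object, skipping blocked internal ones.

    Returns the list of keys that were skipped so the caller can log them.
    """
    skipped = []
    for key, value in fields.items():
        if key in BLOCKED_CONFIG_FIELDS:
            skipped.append(key)
            continue
        setattr(config_obj, key, value)
    return skipped


def kernel_repo_allowed(repo_id, trust_remote_code=False):
    """Allow the trusted organization by default; anything else needs the flag."""
    org, sep, name = repo_id.partition("/")
    if sep and name and org == TRUSTED_KERNEL_ORG:
        return True
    return bool(trust_remote_code)
```

The point of the sketch is the shape of the control: hostile metadata never reaches the internal attribute, and the trust decision compares the exact organization name instead of matching a loose pattern.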

The high-level chain looks like this:

| Stage | Attacker-controlled input | Victim action | Security impact |
| --- | --- | --- | --- |
| Model selection | A malicious or compromised model repository | Victim chooses or automation pulls a model ID | The remote repository enters the trust boundary |
| Configuration load | Crafted config.json | Transformers parses model configuration | Internal attributes may be set from hostile metadata |
| Dispatch decision | _attn_implementation_internal value | Loader selects an implementation path | Code-loading behavior can be redirected |
| Kernel retrieval | Attacker-controlled repository reference | Library downloads implementation code | Python code enters the local runtime |
| Execution | Python module code | Victim process imports or executes it | Code runs with local process privileges |

This is why the vulnerability is best treated as a model supply-chain execution bug. The attacker does not need to exploit memory corruption in the victim’s CPU or GPU stack. The attacker abuses trusted automation around model retrieval and initialization.

Why the CVSS vector can look less dramatic than the real workflow risk

NVD’s entry lists the CNA vector as CVSS:3.0/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H, with a 7.8 High score. The local attack vector and required user interaction can make the issue sound less urgent than a network-exposed unauthenticated service RCE.(NVD)
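To see how much of the 7.8 comes from the two "softening" metrics, the base score can be recomputed from the vector. The sketch below uses the published CVSS v3.0 metric weights and base-score formula for the scope-unchanged case only. The second vector is a hypothetical comparison (same impact, network attack vector, no user interaction) and is not a score anyone assigned to this CVE.

```python
import math

# CVSS v3.0 metric weights (scope unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def base_score(vector: str) -> float:
    """Compute the CVSS v3.0 base score for a scope-unchanged vector string."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts.get("S") != "U":
        raise ValueError("this sketch only handles scope unchanged (S:U)")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    # CVSS rounds up to one decimal place.
    return math.ceil(min(impact + exploitability, 10) * 10) / 10


print(base_score("CVSS:3.0/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 7.8
print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

The impact sub-score is identical in both cases. Only the exploitability term changes, which is why the label alone says little about what executed code can reach.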

For AI teams, that reading can be misleading. In a modern model pipeline, “user interaction” may be an automated process accepting a model identifier from a queue, a benchmark list, a pull request, a notebook cell, or a model registry update. “Local attack vector” may still mean attacker-controlled code runs inside a cloud-hosted GPU worker with access to tokens, datasets, internal endpoints, and expensive compute.

The risk depends less on the CVSS label and more on how model loading is wired into the organization.

A local research laptop with no secrets, no sensitive datasets, and only pinned internal models has a very different risk profile from a shared evaluation platform that automatically pulls arbitrary Hugging Face models submitted by users. A CI job that tests third-party models with long-lived cloud credentials is more dangerous than a locked-down container with no outbound network and no mounted secrets.

Security teams should ask four questions:

| Question | Why it matters |
| --- | --- |
| Can untrusted users influence model IDs or repository URLs? | Turns a dependency bug into an attacker-reachable workflow |
| Does the model-loading environment contain secrets? | Determines post-execution blast radius |
| Is outbound network access allowed? | Enables second-stage payloads, exfiltration, and command retrieval |
| Are model loads logged with repository, commit, and version data? | Determines whether investigation is possible after patching |

The presence of Transformers alone does not mean compromise. The combination of old Transformers, risky model intake, sensitive execution context, and weak logging is what creates serious exposure.

Affected conditions and how to prioritize

The safest remediation rule is simple: upgrade Transformers to 5.3.0 or later anywhere the library is installed. Hugging Face’s v5.3.0 release notes list a kernel-related security fix and reference the PR that fixed the issue.(GitHub)

Prioritization is still necessary because large organizations may have Transformers in many places: production inference services, developer laptops, Docker images, notebooks, Airflow jobs, benchmark harnesses, internal platforms, and research repositories. Some installations load only fixed internal artifacts. Others pull from public model hubs on every run.

Use this table to triage:

| Environment | Exposure level | Why |
| --- | --- | --- |
| Public-facing service that accepts user-provided model IDs | Critical | Attackers may be able to influence the exact model loaded |
| Shared model evaluation platform pulling public Hub repositories | High | Model intake is part of normal workflow |
| CI pipeline that benchmarks external model repositories | High | Automation may load untrusted models without human inspection |
| Notebook environment with cloud tokens and old Transformers | High | Manual model loading can still expose valuable credentials |
| Production inference service using one internal pinned model | Medium | Risk depends on whether the model artifact path can change |
| Offline research environment with no secrets and fixed model cache | Lower | Execution impact is constrained, but still patch |
| Patched environment on Transformers 5.3.0 or later | Lower | Known vulnerable path is fixed, but general model supply-chain risk remains |

Pluto Security’s analysis adds a useful nuance: the vulnerable path it analyzed required the kernels package to be installed and was introduced in the 4.56.0 line.(Pluto Security) Do not use that detail as an excuse to keep old versions. Use it to decide which logs, images, and environments deserve immediate inspection.
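The two-tier rule (official boundary for patching, narrower condition for inspection priority) is easy to encode. The sketch below is a minimal illustration: the function names and the three return labels are hypothetical, and the version parser only reads the leading numeric release segment, so a development build such as 5.3.0.dev0 would be treated as 5.3.0 and should be reviewed by hand.

```python
import re

FIXED_IN = (5, 3, 0)          # patching boundary, per NVD
PATH_INTRODUCED = (4, 56, 0)  # narrower condition, per Pluto Security's analysis


def release_tuple(version: str) -> tuple:
    """Parse the leading numeric release segment, e.g. '4.57.1' -> (4, 57, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        raise ValueError(f"unparseable version: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad short versions such as '5.3' to three components.
    return tuple(parts + [0] * (3 - len(parts)))


def triage(transformers_version: str, kernels_installed: bool) -> str:
    """Return a coarse priority label for one Python environment."""
    v = release_tuple(transformers_version)
    if v >= FIXED_IN:
        return "patched"
    if v >= PATH_INTRODUCED and kernels_installed:
        return "upgrade and inspect first"
    return "upgrade"
```

Every result below 5.3.0 still says "upgrade". The narrower condition only changes which environments get their logs and caches reviewed first.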

Detection and validation without running the model

Safe Validation Workflow for CVE-2026-4372

Do not validate CVE-2026-4372 by loading a suspicious model. That repeats the dangerous action. Start with dependency inventory, static model artifact checks, cache inspection, code search, and runtime telemetry.

Check installed Transformers and related packages

Run this inside each relevant Python environment:

python - <<'PY'
import importlib.metadata as md

for package in ["transformers", "kernels", "huggingface_hub", "torch"]:
    try:
        print(f"{package}=={md.version(package)}")
    except md.PackageNotFoundError:
        print(f"{package}: not installed")
PY

For pip-based environments:

python -m pip show transformers kernels huggingface_hub torch
python -m pip freeze | grep -iE '^(transformers|kernels|huggingface[-_]hub|torch)=='

For Conda environments:

conda list | grep -E 'transformers|kernels|huggingface_hub|pytorch'

For containers, inspect both the build file and the final image. Many vulnerable environments survive because the source repository was patched but a long-lived Docker image still contains an old dependency.

docker run --rm your-image:tag python - <<'PY'
import importlib.metadata as md
for package in ["transformers", "kernels"]:
    try:
        print(package, md.version(package))
    except md.PackageNotFoundError:
        print(package, "not installed")
PY

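If you use lockfiles, search them directly. Match on the package name rather than transformers==, because poetry.lock and Pipfile.lock record the version on a separate line from the name (widen the -A context if the version line is further down):

grep -n -i -A 2 'transformers' requirements*.txt pyproject.toml poetry.lock Pipfile.lock conda*.yml 2>/dev/null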

This is not enough by itself. Model-loading code is often embedded in notebooks, scripts, internal platforms, or third-party tools. The dependency check tells you where the vulnerable library may exist. It does not tell you whether risky models were loaded.

Search for model-loading call sites

Search for common load patterns:

grep -R --line-number \
  -E "from_pretrained\\(|AutoModel|AutoModelForCausalLM|AutoTokenizer|pipeline\\(" \
  ./src ./notebooks ./scripts 2>/dev/null

For Python projects, a small AST-based scan reduces noise:

import ast
from pathlib import Path

TARGET_NAMES = {
    "from_pretrained",
    "AutoModel",
    "AutoModelForCausalLM",
    "AutoTokenizer",
    "pipeline",
}

for path in Path(".").rglob("*.py"):
    try:
        tree = ast.parse(path.read_text(encoding="utf-8"))
    except Exception:
        continue

    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = ""
            if isinstance(node.func, ast.Attribute):
                name = node.func.attr
            elif isinstance(node.func, ast.Name):
                name = node.func.id

            if name in TARGET_NAMES:
                print(f"{path}:{node.lineno}: call to {name}")

This does not prove exploitation. It gives you a map of places where model-loading behavior exists and where input sources should be reviewed.
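The AST scan above only sees .py files, while a lot of model loading lives in notebooks. A minimal sketch for .ipynb files follows. It reads each notebook as plain JSON and never executes a cell. The function names and the regular expression are illustrative assumptions, and a pattern match is noisier than the AST approach.

```python
import json
import re
from pathlib import Path

# Illustrative pattern: direct from_pretrained(...) and pipeline(...) calls.
LOAD_PATTERN = re.compile(r"\bfrom_pretrained\s*\(|\bpipeline\s*\(")


def scan_notebook(path: Path):
    """Return (cell_index, line_number, text) for matching lines in code cells."""
    try:
        notebook = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return []  # unreadable or not valid JSON
    cells = notebook.get("cells", []) if isinstance(notebook, dict) else []
    hits = []
    for cell_index, cell in enumerate(cells):
        if not isinstance(cell, dict) or cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        if isinstance(source, str):  # source may be a string or a list of lines
            source = source.splitlines()
        for line_number, line in enumerate(source, start=1):
            if LOAD_PATTERN.search(line):
                hits.append((cell_index, line_number, line.strip()))
    return hits


def scan_tree(root: Path):
    """Yield (notebook_path, cell_index, line_number, text) under a directory."""
    for notebook_path in root.rglob("*.ipynb"):
        for cell_index, line_number, text in scan_notebook(notebook_path):
            yield notebook_path, cell_index, line_number, text
```

Iterating over scan_tree(Path(".")) and printing each tuple gives the same kind of map as the .py scan, with the cell index in place of a file line number.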

Scan model repositories and caches for dangerous configuration keys

The most useful defensive static check is to look for the internal fields that the patch restricted. The GitHub fix explicitly avoids deserializing _attn_implementation_internal and _experts_implementation_internal, which makes them high-value indicators during review.(GitHub)

A quick shell scan:

grep -R --line-number \
  -E '"_attn_implementation_internal"|"_experts_implementation_internal"' \
  ~/.cache/huggingface ./models ./checkpoints 2>/dev/null

A safer JSON-aware scanner:

import json
from pathlib import Path

RISKY_KEYS = {
    "_attn_implementation_internal",
    "_experts_implementation_internal",
}

SEARCH_ROOTS = [
    Path.home() / ".cache" / "huggingface",
    Path("./models"),
    Path("./checkpoints"),
]

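def find_risky_keys(node, found=None):
    # Walk nested dicts and lists as well as the top level. Checking nested
    # sub-configs is a defensive assumption, not a documented exploit path.
    found = set() if found is None else found
    if isinstance(node, dict):
        found.update(RISKY_KEYS.intersection(node.keys()))
        for value in node.values():
            find_risky_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            find_risky_keys(item, found)
    return found

def scan_config(path: Path):
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except Exception:
        return  # unreadable or invalid JSON; skip instead of guessing

    found = sorted(find_risky_keys(data))
    if found:
        print(f"[!] {path}")
        print(f"    risky keys: {', '.join(found)}")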

for root in SEARCH_ROOTS:
    if not root.exists():
        continue
    for config in root.rglob("config.json"):
        scan_config(config)

This script intentionally does not import Transformers, load the model, execute repository code, or resolve remote references. It treats configuration as untrusted input and reads it as JSON only.

Review downloaded repository provenance

A model cache path alone is not enough. You need model ID, revision, download time, and the process that initiated the load. Hugging Face caches can preserve repository-like directory names, but enterprise teams should not depend on cache naming as their only audit trail. Build explicit logging into model intake:

from datetime import datetime, timezone
from transformers import AutoConfig

def logged_config_load(model_id: str, revision: str | None = None):
    print({
        "event": "model_config_load",
        "model_id": model_id,
        "revision": revision,
        "time": datetime.now(timezone.utc).isoformat(),
    })

    return AutoConfig.from_pretrained(
        model_id,
        revision=revision,
        trust_remote_code=False,
    )

This example loads only configuration, not the model. In production, the logging should happen before any remote model artifact is fetched, and the model ID should come from an allowlist or approval workflow, not raw user input.

Add a CI gate for vulnerable versions

A build should fail if it resolves Transformers below your minimum accepted version.

from importlib.metadata import version, PackageNotFoundError
from packaging.version import Version

MIN_TRANSFORMERS = Version("5.3.0")

try:
    installed = Version(version("transformers"))
except PackageNotFoundError:
    raise SystemExit("transformers is not installed")

if installed < MIN_TRANSFORMERS:
    raise SystemExit(
        f"Blocked: transformers {installed} is below required {MIN_TRANSFORMERS}"
    )

print(f"OK: transformers {installed}")

Wire it into CI:

name: dependency-security-check

on:
  pull_request:
  push:

jobs:
  check-transformers:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: python -m pip install -r requirements.txt
      - run: python security/check_transformers_version.py

Use this as a guardrail, not the only control. A repository can pass CI while an old notebook server, Docker image, or scheduled job remains exposed.

Runtime signals worth hunting

If an environment loaded untrusted or semi-trusted models before patching, static checks are not enough. You need to look for signs of unexpected code retrieval and execution.

| Signal | Why it matters | Limitations |
| --- | --- | --- |
| Python process connects to an unexpected Hugging Face repository | The vulnerable path involves remote repository retrieval | Normal model loading also contacts the Hub |
| New files appear under Hugging Face cache after a suspicious job | Indicates repository content was downloaded | Cache may be cleaned or shared across jobs |
| Python imports modules from cache paths | May show code execution from downloaded artifacts | Requires endpoint telemetry or audit hooks |
| Outbound connections after model load | May indicate second-stage payload retrieval or exfiltration | Many ML jobs legitimately download dependencies |
| Cloud API calls from notebook or GPU worker identities | Credentials may have been exposed to executed code | Requires cloud audit logs |
| New processes spawned by Python during model load | May indicate post-execution activity | Not all payloads spawn child processes |
| Unexpected reads of environment variables or credential files | Malicious code often harvests secrets first | Requires EDR, auditd, or language-level instrumentation |

On Linux, teams with auditd can monitor sensitive credential paths in shared ML hosts:

sudo auditctl -w /home -p r -k home_read_watch
sudo auditctl -w /var/run/secrets -p r -k container_secret_watch
sudo auditctl -w /root/.aws -p r -k aws_credential_watch

This is noisy and should be tuned. It is more useful in a short investigation window than as a permanent broad rule.

For containerized model loading, log network egress at the job or namespace level. If you cannot explain why a model evaluation job contacted a domain after loading a model, treat it as suspicious until proven otherwise.
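For the "Python imports modules from cache paths" signal, language-level instrumentation can be sketched with the standard library's sys.addaudithook. The example below records every "exec" audit event whose code object comes from a file under a watched directory, which covers modules that the import system executes from disk. The function name and the sink list are hypothetical. Audit hooks cannot be removed once installed, and code already running in the process could work around them, so treat this as evidence gathering and not as containment.

```python
import os
import sys


def install_exec_watch(watched_dirs, sink):
    """Record file paths of code executed from any of the watched directories.

    watched_dirs: iterable of directory paths (for example a model cache).
    sink: a list that receives the co_filename of each matching code object.
    """
    roots = tuple(os.path.realpath(d) + os.sep for d in watched_dirs)

    def hook(event, args):
        # Only the "exec" event matters here; return quickly for everything else.
        if event != "exec" or not args:
            return
        filename = getattr(args[0], "co_filename", None)
        if isinstance(filename, str) and os.path.realpath(filename).startswith(roots):
            sink.append(filename)

    sys.addaudithook(hook)
```

In a model-loading job, the hook would be installed before the first call into the loader, with the watched directory set to the job's Hugging Face cache, and the sink contents written to the job log afterwards.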

Remediation

The first fix is dependency upgrade. Install Transformers 5.3.0 or newer across every environment that can load models. NVD explicitly recommends upgrading to 5.3.0 or later, and the v5.3.0 release references a kernel-related security fix.(NVD)

python -m pip install --upgrade "transformers>=5.3.0"

For projects that declare dependencies in a requirements file, raise the minimum version:

transformers>=5.3.0

For Poetry:

poetry add "transformers>=5.3.0"
poetry lock

For Conda, use the channel and package strategy your organization already trusts, then verify the installed version from inside the environment:

python - <<'PY'
import transformers
print(transformers.__version__)
PY

Patch the dependency, rebuild containers, restart long-running workers, and clear stale runtime environments. A notebook server that keeps an old kernel alive can remain exposed even after the repository lockfile is updated.

Patch first, then reduce blast radius

Upgrading fixes the known vulnerable code path. It does not solve every model supply-chain risk. The next step is to make model loading less powerful.

A safer model-loading environment should have:

| Control | Goal |
| --- | --- |
| No long-lived secrets in environment variables | Prevent immediate credential theft |
| No production cloud role attached to evaluation jobs | Reduce impact of arbitrary code execution |
| Read-only dataset mounts when possible | Limit tampering and data destruction |
| No write access to source repositories | Prevent supply-chain persistence |
| Restricted outbound network | Block second-stage downloads and exfiltration |
| Model allowlist | Prevent arbitrary repository selection |
| Revision pinning | Prevent silent changes under a trusted model name |
| Separate cache per trust zone | Avoid cross-contamination between untrusted and trusted jobs |

Hugging Face documentation encourages pinning a specific revision when loading custom models, which is good practice beyond this CVE.(Hugging Face) A model name by itself is a moving pointer. A commit hash is a more stable artifact reference.

A controlled loading wrapper should reject raw user-provided model IDs unless they match an approved pattern:

from dataclasses import dataclass

@dataclass(frozen=True)
class ApprovedModel:
    model_id: str
    revision: str

APPROVED_MODELS = {
    "internal/llm-eval-baseline": ApprovedModel(
        model_id="internal/llm-eval-baseline",
        revision="0123456789abcdef0123456789abcdef01234567",
    ),
    "vendor/safety-classifier": ApprovedModel(
        model_id="vendor/safety-classifier",
        revision="abcdef0123456789abcdef0123456789abcdef01",
    ),
}

def resolve_model(user_choice: str) -> ApprovedModel:
    try:
        return APPROVED_MODELS[user_choice]
    except KeyError:
        raise ValueError(f"Model is not approved: {user_choice}")

Then use the approved model object:

from transformers import AutoModelForCausalLM, AutoTokenizer

approved = resolve_model("internal/llm-eval-baseline")

tokenizer = AutoTokenizer.from_pretrained(
    approved.model_id,
    revision=approved.revision,
    trust_remote_code=False,
)

model = AutoModelForCausalLM.from_pretrained(
    approved.model_id,
    revision=approved.revision,
    trust_remote_code=False,
)

The wrapper should live in a shared internal package, not as copy-pasted notebook code. If every team creates its own loader, policy drift is guaranteed.
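An allowlist only pins anything if its revisions are real commit hashes and not branch names such as main. A small check that can run in CI against the allowlist is sketched below. It assumes Hub revisions are 40-character lowercase hexadecimal commit hashes, and the function names are hypothetical.

```python
import re

# Assumption: a pinned revision is a full 40-character hexadecimal commit hash.
COMMIT_HASH = re.compile(r"[0-9a-f]{40}")


def is_pinned_revision(revision) -> bool:
    """True only for a full commit hash; branch and tag names are rejected."""
    return isinstance(revision, str) and COMMIT_HASH.fullmatch(revision) is not None


def unpinned_entries(revisions_by_model: dict) -> list:
    """Return the model IDs whose revision is not a full commit hash."""
    return sorted(
        model_id
        for model_id, revision in revisions_by_model.items()
        if not is_pinned_revision(revision)
    )
```

Failing the build when unpinned_entries() returns anything keeps a moving pointer from slipping into the approved set during a routine edit.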

Do not treat Safetensors as a complete fix

Safetensors is valuable. Hugging Face documentation describes Safetensors as a safer and faster alternative where available, and the Hub security documentation explains why pickle-based formats are dangerous: Python pickle can execute arbitrary code during loading through imports, builtins, and object construction behavior.(Hugging Face)

But CVE-2026-4372 is not simply “pickle is unsafe.” The issue involves configuration-driven control flow and kernel-loading behavior. A model can use Safetensors for weights and still include hostile configuration or other risky repository content. Safetensors reduces one class of deserialization risk. It does not make a model repository inert.

A practical policy should say:

| Artifact | Security rule |
| --- | --- |
| Weights | Prefer Safetensors over pickle-based formats |
| Configuration | Validate keys and reject unexpected internal fields |
| Custom code | Require explicit approval, code review, and pinned revision |
| Tokenizers | Treat tokenizer files and supporting code as part of the artifact |
| Repository metadata | Log source, revision, author, and approval status |
| Runtime | Isolate model loading even for apparently safe artifacts |

This is the model-supply-chain equivalent of saying a signed package is better than an unsigned one, but a signed package should still be installed in a controlled environment with least privilege.
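The weights and custom-code rows of that policy can be checked statically by looking at file names alone, without opening any weight file. The sketch below walks a downloaded repository directory and groups files by extension. The extension sets are illustrative assumptions and not an exhaustive or official mapping, and an extension is only a hint about the real format.

```python
from pathlib import Path

# Illustrative extension lists; adjust to the formats your teams actually use.
PICKLE_LIKE = {".bin", ".pt", ".pth", ".pkl", ".pickle", ".ckpt"}
CODE_LIKE = {".py", ".so", ".sh"}


def classify_artifacts(root: Path) -> dict:
    """Group files under root by coarse risk category, using relative paths."""
    report = {"safetensors": [], "pickle_like": [], "code": [], "other": []}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        relative = path.relative_to(root).as_posix()
        if suffix == ".safetensors":
            report["safetensors"].append(relative)
        elif suffix in PICKLE_LIKE:
            report["pickle_like"].append(relative)
        elif suffix in CODE_LIKE:
            report["code"].append(relative)
        else:
            report["other"].append(relative)
    return report
```

A repository with entries under pickle_like or code is not necessarily malicious. It is one that needs the explicit approval and review steps listed in the table before anything loads it.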

Secure model intake for teams that pull from the Hub

Secure Model Intake Pipeline for AI Teams

Many AI teams cannot simply ban public models. Research, benchmarking, red teaming, fine-tuning, and compatibility testing often require external model intake. The answer is not to stop using the Hub. The answer is to make model intake explicit.

A secure intake process should include:

| Step | Required evidence |
| --- | --- |
| Request | Who requested the model, why it is needed, and where it will run |
| Source review | Model ID, author, repository age, files present, and whether custom code exists |
| Revision pin | Exact commit hash selected for testing |
| Static scan | Configuration keys, file types, pickle presence, custom Python files |
| Sandbox test | Load in an isolated environment with no secrets and controlled egress |
| Approval | Human review for use beyond sandbox |
| Promotion | Copy approved artifact into an internal registry or cache |
| Retest | Recheck after dependency upgrades or model revision changes |

Hugging Face Hub provides security features such as malware scanning, token controls, commit signing support, and other repository-level protections. Its documentation also states that files are scanned with ClamAV and that security scanning is not a substitute for user responsibility.(Hugging Face)

That split responsibility is the right model. The platform can scan and flag known problems. The consuming organization still has to decide which artifacts to run, where to run them, and what credentials are exposed.

A minimal sandbox runner might look like this:

docker run --rm \
  --network none \
  --read-only \
  --tmpfs /tmp \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -e HF_HOME=/tmp/hf-cache \
  -v "$PWD/model-test:/work:ro" \
  python:3.11-slim \
  python /work/static_model_review.py

For a real model load test, you may need network access to fetch artifacts. In that case, do the fetch step separately, store the artifact by digest or pinned revision, then test offline. Avoid giving the loading process direct access to cloud metadata services, production databases, SSH keys, package publishing tokens, or internal control planes.

What to do if an untrusted model was loaded before patching

If an exposed environment loaded untrusted or unknown model repositories before the upgrade, treat it as a potential code execution event. The response should be proportionate, but it should not be casual.

Start with containment:

1. Stop the affected notebook, worker, or service.
2. Preserve logs and cache directories before cleanup.
3. Record installed package versions and container image digests.
4. Identify model IDs and revisions loaded during the exposure window.
5. Rotate credentials available to the process.
6. Review cloud, network, and endpoint logs for post-load activity.
7. Rebuild the environment from a clean patched base image.

Do not start by deleting everything. Deleting cache directories before preserving evidence may destroy the only record of which repositories were loaded.
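Step 4 can start from the cache itself, read-only, before anything is cleaned up. The sketch below assumes the standard Hugging Face Hub cache layout, where each repository is a directory named like models--org--name containing a snapshots directory with one subdirectory per commit hash. The function name is hypothetical. Repository names that themselves contain "--" cannot be recovered unambiguously, and a directory modification time is only a rough hint about when a snapshot was written, not proof of when a model was loaded.

```python
from datetime import datetime, timezone
from pathlib import Path


def inventory_hub_cache(hub_dir: Path) -> list:
    """List cached repositories and snapshot revisions without modifying anything."""
    records = []
    for repo_dir in sorted(hub_dir.glob("*--*")):
        snapshots = repo_dir / "snapshots"
        if not snapshots.is_dir():
            continue
        # Directory names look like "models--org--name".
        repo_type, _, rest = repo_dir.name.partition("--")
        repo_id = rest.replace("--", "/")
        for snapshot in sorted(snapshots.iterdir()):
            if not snapshot.is_dir():
                continue
            mtime = datetime.fromtimestamp(snapshot.stat().st_mtime, tz=timezone.utc)
            records.append({
                "repo_type": repo_type,
                "repo_id": repo_id,
                "revision": snapshot.name,
                "snapshot_mtime_utc": mtime.isoformat(),
            })
    return records
```

Run against a preserved copy of the cache (by default under ~/.cache/huggingface/hub unless the environment overrides it), the output gives a first list of repository and revision pairs to compare with job logs from the exposure window.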

Prioritize credential rotation based on what the process could access:

| Credential type | Why rotate |
| --- | --- |
| Hugging Face tokens | Malicious code may access private models or publish artifacts |
| Cloud access keys | GPU workers often run with storage or compute permissions |
| GitHub tokens | Notebooks and CI jobs often pull private repositories |
| Package registry tokens | Build environments may have publish rights |
| Database credentials | Evaluation jobs sometimes access internal datasets |
| Slack or webhook tokens | Useful for data exfiltration and persistence |

If the environment had a cloud instance role, inspect cloud audit logs for unusual API calls. If it had access to object storage, check reads and writes. If it ran in Kubernetes, inspect service account permissions, pod logs, network policy, and mounted secrets.

The absence of obvious malware files is not proof of safety. A payload can read environment variables and exfiltrate them without persistence. A payload can run only in memory. A payload can trigger a single outbound request and exit. Incident review should focus on what the process could do, not only what files remain.

Building a safer loader policy

A mature loader policy has three parts: input control, execution control, and evidence.

Input control answers: which model can be loaded?

Execution control answers: what can the loading process access?

Evidence answers: can we reconstruct what happened?

A practical policy can be expressed as code and infrastructure:

model_security_policy:
  minimum_transformers_version: "5.3.0"
  allow_public_hub_models: false
  require_revision_pin: true
  require_static_config_scan: true
  reject_config_keys:
    - "_attn_implementation_internal"
    - "_experts_implementation_internal"
  allow_pickle_weights: false
  require_safetensors_when_available: true
  require_custom_code_review: true
  runtime:
    network: "restricted"
    secrets: "none"
    filesystem: "read-only where possible"
    cache_scope: "per trust zone"
  logging:
    model_id: true
    revision: true
    caller: true
    dependency_versions: true
    container_digest: true

This kind of policy is easier to enforce if model loading is centralized. If every team directly calls from_pretrained() from notebooks and scripts, security becomes advisory. If teams use an internal loader package or model gateway, controls can become defaults.
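A centralized loader can turn part of that policy into a check that runs before any model is fetched. The sketch below is a minimal illustration under stated assumptions: the policy is held as a plain dictionary, the approved_models field is a hypothetical addition that does not appear in the policy above, and config is a dictionary that was already parsed as plain JSON without importing Transformers.

```python
# Hypothetical in-memory form of the policy; "approved_models" is an added
# illustrative field, not part of the policy shown in the text.
POLICY = {
    "allow_public_hub_models": False,
    "require_revision_pin": True,
    "reject_config_keys": [
        "_attn_implementation_internal",
        "_experts_implementation_internal",
    ],
    "approved_models": {"internal/llm-eval-baseline"},
}


def policy_violations(policy: dict, model_id: str, revision, config: dict) -> list:
    """Return human-readable reasons a load request should be refused."""
    violations = []
    if not policy["allow_public_hub_models"] and model_id not in policy["approved_models"]:
        violations.append(f"model not approved: {model_id}")
    if policy["require_revision_pin"] and not revision:
        violations.append("revision pin is missing")
    rejected = sorted(set(policy["reject_config_keys"]) & set(config))
    if rejected:
        violations.append("rejected config keys: " + ", ".join(rejected))
    return violations
```

An empty list means the request may proceed to the isolated loading step. Anything else is logged with the caller and refused, which is what makes the policy a default instead of advice.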

For organizations validating AI-adjacent attack surfaces, automated security workflows are useful when they preserve scope, evidence, and human approval. A platform such as Penligent can sit in that controlled validation layer for dependency inventory, retesting, evidence capture, and report generation around issues like model-loading RCE, while patching and isolation remain the primary controls. Penligent’s related technical write-up on NVIDIA Merlin RCE and SafeTensors is relevant because it treats ML artifacts as software supply-chain inputs rather than inert files.(Penligent)

The key is not automation for its own sake. The key is repeatability. CVE-2026-4372 is the kind of issue that appears in many places at once: notebooks, old Docker images, research scripts, CI runners, batch jobs, and internal tools. A manual one-time check misses stragglers. A controlled repeatable workflow finds them again after the first fix.

Related CVEs and incidents that clarify the pattern

CVE-2026-4372 is part of a broader class of AI supply-chain failures where model or framework loading crosses into code execution. Two comparisons are especially useful.

CVE-2025-32434, when a safety flag was not enough

PyTorch CVE-2025-32434 is relevant because it also involved a model-loading safety expectation that did not hold. The official PyTorch advisory describes a critical issue where torch.load with weights_only=True could still lead to remote code execution, affecting PyTorch 2.5.1 and earlier and fixed in 2.6.0.(GitHub)

The parallel is not that the bugs are identical. The parallel is the failure mode: a user-facing safety control was expected to prevent arbitrary code execution, but a vulnerable loading path still allowed it. In PyTorch, the issue centered on unsafe loading behavior around serialized artifacts. In CVE-2026-4372, the issue involved Transformers configuration and kernel-loading behavior. Both teach the same defensive lesson: “safe mode” flags reduce risk only if every path behind them preserves the intended boundary.

| Vulnerability | Component | Safety expectation | Failure pattern | Primary fix |
| --- | --- | --- | --- | --- |
| CVE-2026-4372 | Hugging Face Transformers | trust_remote_code=False should prevent remote model code execution | Malicious configuration influenced internal code-loading behavior | Upgrade Transformers to 5.3.0+ |
| CVE-2025-32434 | PyTorch | weights_only=True should constrain unsafe loading | torch.load path could still result in code execution | Upgrade PyTorch to 2.6.0+ |

Security teams should use both cases to update review checklists. Do not only ask, “Did the developer set the safe flag?” Ask, “Does the installed version actually enforce the safe flag across all relevant loading paths?”
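The "does the installed version enforce the flag" question can be answered from inside the runtime itself. The following is a minimal sketch using only the standard library. The version floors are the fixed versions quoted in this article, and the helper names are invented for this example. It compares numeric release segments only, so a pre-release such as `5.3.0rc1` would be reported as fixed; a stricter check could use a dedicated version-parsing library instead.

```python
import re
from importlib import metadata

# Fixed-in versions as stated in this article for the two CVEs.
MINIMUM_FIXED = {"transformers": (5, 3, 0), "torch": (2, 6, 0)}


def release_tuple(version):
    """Parse the leading numeric release, e.g. '2.6.0+cu121' -> (2, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_fixed(package, version):
    """True if the version's numeric release is at or above the fixed floor."""
    floor = MINIMUM_FIXED[package]
    release = release_tuple(version)
    # Pad short versions so that '5.3' compares equal to (5, 3, 0).
    padded = release + (0,) * (len(floor) - len(release))
    return padded >= floor


def audit_installed():
    """Report the status of each tracked package in the current environment."""
    report = {}
    for package in MINIMUM_FIXED:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        status = "ok" if is_fixed(package, installed) else "VULNERABLE"
        report[package] = f"{installed} ({status})"
    return report
```

Running `audit_installed()` inside the final container image, the notebook kernel, and the CI runner, rather than reading a requirements file, is what makes the answer evidence instead of intent.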

The Open OSS privacy filter incident, a different path to the same supply-chain problem

In May 2026, HiddenLayer reported a malicious Hugging Face model repository that posed as an OpenAI privacy filter project and shipped a loader intended to fetch and execute infostealer malware on Windows systems. The incident was not the same mechanism as CVE-2026-4372, but it shows the same ecosystem risk: model repositories can be used to distribute executable behavior under the appearance of ordinary AI artifacts.(hiddenlayer.com)

The difference matters. CVE-2026-4372 was a vulnerability in how Transformers handled malicious configuration and kernel-loading logic. The privacy-filter incident was a malicious repository abusing user trust and repository presentation. One is a framework bug. The other is supply-chain deception. Both hit the same operational point: teams that pull external models need intake controls, sandboxing, revision pinning, and logging.

That is why model security should not be delegated entirely to framework flags or platform scanning. Those controls help, but they do not replace basic software supply-chain discipline.

Common mistakes that leave teams exposed

Patching the app but not the notebook fleet

Many organizations patch production services first and leave research infrastructure behind. That is understandable but dangerous. Notebook environments often have broader secrets than production services because they are used for experimentation. They may include cloud credentials, GitHub tokens, private dataset access, and writable storage.

If a notebook server loaded public models with an old Transformers version, include it in the exposure review.

Checking only source repositories

Dependency state lives in more places than Git. Docker images, Conda environments, cached virtualenvs, long-running Kubernetes pods, managed notebook images, and old CI runners can preserve vulnerable versions.

A good fix ticket should not say “updated requirements.txt.” It should say:

Updated dependency lockfile.
Rebuilt runtime image.
Verified installed version inside final image.
Restarted workers.
Checked notebooks and batch jobs.
Scanned model caches.
Reviewed logs for untrusted model loads.

Assuming trust_remote_code=False ends the conversation

trust_remote_code=False remains an important default, but CVE-2026-4372 exists precisely because the intended boundary was bypassed in affected versions. Keep the flag. Upgrade the library. Add source controls and sandboxing.

Confusing weight safety with repository safety

Using Safetensors is a good control for weight deserialization risk. It does not validate config.json, repository Python files, tokenizer behavior, dynamic loading logic, or the environment in which the model runs.

Letting model IDs come from untrusted input

Any service that accepts a model ID from a user and passes it directly into from_pretrained() should be treated as a high-risk design. This includes internal tools. Internal users can make mistakes, and internal systems can be compromised.

Use allowlists, approval workflows, and revision pins.
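One way to keep raw user input away from the loader is to accept only short aliases and resolve them server-side to an approved repository and pinned revision. The sketch below assumes a hypothetical in-memory registry; the alias, repository name, and commit SHA are placeholders, and a real deployment would back this with whatever approval workflow the organization already uses.

```python
from typing import NamedTuple


class PinnedModel(NamedTuple):
    repo_id: str
    revision: str  # full commit SHA recorded at approval time


# Hypothetical registry for illustration; the entry below is a placeholder,
# not a real repository or commit.
APPROVED = {
    "support-classifier": PinnedModel(
        "example-org/support-classifier",
        "0123456789abcdef0123456789abcdef01234567",
    ),
}


class ModelNotApproved(Exception):
    """Raised when a requested alias has no approved, pinned entry."""


def resolve_model(alias, registry=APPROVED):
    """Map a user-supplied alias to an approved (repo_id, revision) pair."""
    if not isinstance(alias, str) or alias not in registry:
        raise ModelNotApproved(f"{alias!r} is not an approved model alias")
    return registry[alias]
```

The service would then call `from_pretrained(pinned.repo_id, revision=pinned.revision)` with the resolved values, so a string such as `attacker/evil-model` from a request body never reaches the loader.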

Running suspicious models to see what happens

Do not test an unknown model by loading it in your normal environment. That is the behavior the attacker wants. Static review comes first. If execution is necessary, use an isolated sandbox with no secrets, controlled network access, and disposable infrastructure.
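The "no secrets" part of that advice can be partially illustrated with a child process that starts from a scrubbed environment. This is only a sketch of one layer: it does nothing about network access, mounted files, or instance credentials, so it is not a sandbox on its own. The passthrough list is an assumption about what a child interpreter needs to start.

```python
import os
import subprocess
import sys

# Hypothetical allowlist of variables a child interpreter may need to start.
# Every other variable, including tokens and cloud keys, is dropped.
PASSTHROUGH = ("PATH", "SYSTEMROOT", "LANG", "LC_ALL", "TMPDIR", "TEMP", "TMP")


def scrubbed_env(environ=None):
    """Return a copy of the environment containing only allowlisted names."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in PASSTHROUGH if name in env}


def run_without_secrets(argv, timeout=60):
    """Run a command with the scrubbed environment and capture its output."""
    return subprocess.run(
        argv,
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
```

For example, `run_without_secrets([sys.executable, "static_review.py", model_dir])` would run a hypothetical review script without inheriting the parent's tokens. Container or VM isolation with restricted egress is still needed before anything from an unknown repository is actually executed.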

Practical hardening checklist

The following checklist is intentionally operational. It is written for teams that need to close the issue, not just understand it.

| Priority | Action | Evidence to keep |
| --- | --- | --- |
| Immediate | Upgrade Transformers to 5.3.0 or later | Package version output from every runtime |
| Immediate | Identify installations that have the kernels package installed | Environment inventory |
| Immediate | Search for from_pretrained() call sites | Repository scan results |
| Immediate | Scan model caches for risky internal config keys | Scanner output and reviewed paths |
| High | Rebuild Docker images and notebook bases | Image digests and build logs |
| High | Rotate credentials exposed to untrusted model-loading jobs | Rotation tickets and audit notes |
| High | Restrict model loading to approved IDs and pinned revisions | Policy and allowlist |
| Medium | Add CI gate for vulnerable dependency versions | CI logs |
| Medium | Add runtime logging for model ID, revision, caller, and dependency version | Log samples |
| Medium | Separate caches by trust zone | Infrastructure change record |
| Long term | Move model intake into a central service or loader packageArchitecture documentation |

A security ticket for CVE-2026-4372 should not close with “upgraded package” unless the organization has verified where the package actually runs and whether untrusted models were loaded before the upgrade.
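The runtime logging item in the checklist could look something like the following sketch, which emits one structured record per model load. The logger name, field names, and function names are assumptions made for this example. The container digest is left out because there is no portable way to read it from inside a process; a deployment would typically inject it.

```python
import json
import logging
from datetime import datetime, timezone
from importlib import metadata

logger = logging.getLogger("model_loader.audit")  # hypothetical logger name


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def build_load_record(model_id, revision, caller,
                      packages=("transformers", "torch", "kernels", "safetensors")):
    """Assemble the fields needed to reconstruct a model load later."""
    return {
        "event": "model_load",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,
        "revision": revision,
        "caller": caller,
        "dependency_versions": {name: installed_version(name) for name in packages},
    }


def log_model_load(model_id, revision, caller):
    """Emit the record as a single JSON line and return it to the caller."""
    record = build_load_record(model_id, revision, caller)
    logger.info(json.dumps(record, sort_keys=True))
    return record
```

Writing the record before the load is attempted means the evidence survives even if the load itself crashes or is hijacked.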

Frequently asked questions

Is CVE-2026-4372 a remote code execution vulnerability?

  • Yes, but the practical trigger is model loading rather than a classic exposed network endpoint.
  • NVD describes it as a critical remote code execution vulnerability in Hugging Face Transformers before 5.3.0.
  • The attacker needs the victim or an automated workflow to load a malicious model repository.
  • The impact can still be severe if the loading environment has secrets, cloud permissions, internal network access, or sensitive datasets.(NVD)

Does trust_remote_code=False protect against CVE-2026-4372?

  • Not reliably in affected Transformers versions.
  • The core issue is that malicious configuration could reach a code-loading path despite the user not opting into remote code execution.
  • Keep trust_remote_code=False as a default, but do not treat it as sufficient on old versions.
  • Upgrade to Transformers 5.3.0 or later and add model source controls.(NVD)

Which Transformers versions should be upgraded?

  • Upgrade any Transformers installation below 5.3.0.
  • NVD marks versions before 5.3.0 as affected.
  • Pluto Security’s analysis narrows the practical vulnerable path to the 4.56.0 through 5.2.x range with the kernels package installed, but the safe operational rule is still to upgrade to 5.3.0 or later.
  • Verify the installed version inside containers, notebooks, CI runners, and long-running services, not only in the source repository.(NVD)

Do Safetensors prevent CVE-2026-4372?

  • No, not by themselves.
  • Safetensors helps reduce risk from unsafe weight deserialization.
  • CVE-2026-4372 involves configuration-driven behavior and kernel-loading logic, so weight format alone is not a complete control.
  • Use Safetensors where possible, but also validate configuration, restrict model sources, pin revisions, and sandbox model loading.(Hugging Face)

How can I check whether my environment is exposed?

  • Check whether Transformers is installed below 5.3.0.
  • Check whether the kernels package is installed.
  • Search code for from_pretrained(), AutoModelForCausalLM, AutoTokenizer, and pipeline() call sites.
  • Scan Hugging Face caches and local model directories for _attn_implementation_internal and _experts_implementation_internal.
  • Review logs for untrusted model IDs loaded before patching.
  • Prioritize environments that loaded public or user-supplied model repositories.
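The cache-scanning step above can be sketched as a static pass that parses config.json files as plain JSON and never imports Transformers or any model code. The function names are invented for this example. The two key names are the ones discussed in this article, and a hit is a prompt for human review, not proof of compromise.

```python
import json
from pathlib import Path

RISKY_KEYS = {"_attn_implementation_internal", "_experts_implementation_internal"}


def find_risky_keys(node, path=""):
    """Yield dotted paths of risky keys anywhere in a parsed JSON document."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key in RISKY_KEYS:
                yield here
            yield from find_risky_keys(value, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_risky_keys(item, f"{path}[{index}]")


def scan_tree(root):
    """Map each flagged config.json under root to the risky key paths it contains."""
    findings = {}
    for config_path in Path(root).rglob("config.json"):
        try:
            data = json.loads(config_path.read_text(encoding="utf-8"))
        except (OSError, ValueError) as exc:
            # Unreadable or malformed configs are reported, not silently skipped.
            findings[str(config_path)] = [f"unreadable: {exc}"]
            continue
        hits = list(find_risky_keys(data))
        if hits:
            findings[str(config_path)] = hits
    return findings
```

Pointing `scan_tree` at the Hugging Face cache directory used by the environment, and at any local model directories, produces the "scanner output and reviewed paths" evidence. Nested configurations are walked recursively because a key could sit inside a sub-configuration rather than at the top level.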

Should I delete all Hugging Face cache directories?

  • Do not delete caches before preserving evidence if you are investigating possible exposure.
  • Cache directories may help identify which repositories and revisions were loaded.
  • After evidence collection, clearing caches can be useful to remove stale artifacts and force clean downloads from approved sources.
  • Separate future caches by trust zone so untrusted testing artifacts do not mix with production-approved models.

What should I do if an untrusted model was loaded before patching?

  • Stop the affected job, notebook, or service.
  • Preserve logs, cache contents, dependency versions, and container image identifiers.
  • Identify which model repositories and revisions were loaded.
  • Rotate any credentials available to the process.
  • Review endpoint, cloud, and network logs for suspicious activity after model load time.
  • Rebuild from a clean image with Transformers 5.3.0 or later.
  • Add model allowlisting, revision pinning, and sandboxing before resuming untrusted model evaluation.

What matters after the patch

CVE-2026-4372 should change how teams think about model loading. The lasting issue is not a single internal field name. The lasting issue is that AI artifact loading has become a code execution boundary in many organizations, while too many workflows still treat it like downloading a dataset.

Upgrade Transformers. Rebuild every runtime that can load models. Check caches and logs before deleting evidence. Rotate exposed credentials when untrusted models were loaded. Pin model revisions. Reject arbitrary model IDs from user input. Run external model evaluation in isolated environments with minimal secrets and controlled egress.

The next bug may not use _attn_implementation_internal. It may not involve kernels. It may not be in Transformers. The durable defense is to treat model repositories as software supply-chain inputs, with the same seriousness applied to packages, containers, plugins, and CI scripts.
