The Panopticon of Metadata: A Definitive Guide to Ghunt V2, Google API Reverse Engineering, and Agentic OSINT

In the modern adversarial landscape, “Privacy” is an operational failure, and “Anonymity” is merely a temporary state of uncorrelation. For the elite Red Teamer or AI Security Engineer, the goal of Open Source Intelligence (OSINT) is not to find information that is hidden, but to weaponize information that is public.

Ghunt has transcended its origins as a simple Python script to become the industry-standard framework for interrogating the Google ecosystem. It does not exploit a bug; it exploits the fundamental architecture of connected services. By dissecting the API calls between a user’s device and Google’s servers—specifically leveraging the Play Services, Maps, and People APIs—Ghunt transforms a sterile email address into a dynamic “Pattern of Life” graph.

This comprehensive technical dossier will deconstruct the internal mechanics of Ghunt V2, provide production-grade Python automation strategies, analyze its role in the exploitation of high-impact vulnerabilities like CVE-2026-21858 and CVE-2025-68613, and demonstrate how Penligent.ai is automating this entire kill chain through Agentic AI.

Part I: The Mechanics of Extraction (Under the Hood)

To understand how to use Ghunt effectively, one must understand what it is doing on the wire. Ghunt is, at its core, an authenticated API client that emulates legitimate Google applications.

1. The Credential Context: Anatomy of a Session

Ghunt does not function anonymously. It requires a valid session state to bypass Google’s “Consent Bump” and anti-scraping heuristics. The “magic” lies in specific cookies, primarily:

  • __Secure-1PSID & __Secure-3PSID: These are the primary identifiers for a persistent Google session. They grant access to the “Personal Search” context.
  • oauth_token: Often extracted to authorize calls to the Google People API without triggering a full OAuth consent screen.

When you execute ghunt login, you are essentially exporting this cryptographic state from your browser context into a Python httpx session. The tool then manages the X-Goog-AuthUser headers to switch between accounts, mimicking the behavior of a user toggling between personal and corporate profiles in Chrome.

2. Decoding the Matrix: Protocol Buffers (Protobuf)

Unlike standard REST APIs that return clean JSON, many internal Google endpoints (especially Maps and Play Store) return data serialized in Protocol Buffers (Protobuf).

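  • The Challenge: Protobuf is a binary format. Without the .proto definition files (which Google keeps private), a decoder can still recover field numbers and wire types, but not field names or what the values mean, so the raw payload is effectively unreadable.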
  • The Ghunt Solution: Ghunt’s developers have reverse-engineered the wire format. The tool includes a custom parser that recursively deserializes these binary streams.
    • Example: When querying a user’s Maps contributions, Ghunt receives a Protobuf blob. It strips the wire headers, identifies field tags (e.g., field 1 is the Review ID, field 2 is the Timestamp), and reconstructs a readable object. This ability to parse raw, internal binary traffic is what separates Ghunt from basic HTML scrapers.
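
The schema-less parsing described above can be sketched in a few lines of Python. This is a minimal illustration of the public Protobuf wire format, not Ghunt's actual parser; the function names and the sample bytes (a made-up review ID in field 1 and a made-up timestamp in field 2) are assumptions for the example.

```python
from typing import Iterator, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this varint
            return result, pos
        shift += 7


def iter_fields(buf: bytes) -> Iterator[Tuple[int, int, Union[int, bytes]]]:
    """Yield (field_number, wire_type, raw_value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_no, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_no, wire_type, value


# Hypothetical payload: field 1 = "rev-42" (bytes), field 2 = 1700000000 (varint)
SAMPLE = b"\x0a\x06rev-42\x10\x80\xe2\xcf\xaa\x06"

for field_no, wire_type, value in iter_fields(SAMPLE):
    print(field_no, wire_type, value)
```

A length-delimited value may itself be a nested message, so a recursive parser would try `iter_fields` again on those bytes and fall back to treating them as a string if decoding fails. Assigning names such as "Review ID" to field numbers still has to be done by hand, by comparing decoded values against what the application displays.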

Part II: The Modular Attack Surface

Ghunt V2 operates on a modular architecture. Each module targets a specific data silo within the Google infrastructure.

Module A: The Email Pivoting Engine (ghunt email)

This is the most common entry point. You feed it a target email address, and it queries the internal endpoint https://clients6.google.com/rpc.

  • GAIA ID Resolution: The module resolves the email to a GAIA ID (Google Accounts and ID Administration). This 21-digit integer is the immutable core of a user’s identity. Even if the user changes their email address or display name, the GAIA ID persists.
  • Service Enumeration: It checks for the existence of public profiles on Photos, Maps, and Calendar.
  • Cloud IAM Leakage: Crucially, it can identify if the user has an active footprint in Google Cloud Platform (GCP). For a Red Teamer, knowing a target is a “GCP Admin” immediately prioritizes them for phishing campaigns targeting gcloud credentials.
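
Because the GAIA ID outlives email and display-name changes, it is the natural primary key for any store of findings. The sketch below shows that design choice only; `IdentityIndex`, its methods, and the sample ID are hypothetical names invented for this example and are not part of Ghunt.

```python
import re
from collections import defaultdict
from typing import Dict, List

# The article describes the GAIA ID as a 21-digit integer.
GAIA_ID_RE = re.compile(r"[0-9]{21}")


def is_plausible_gaia_id(value: str) -> bool:
    """Format check only: 21 ASCII digits. It does not prove the ID exists."""
    return GAIA_ID_RE.fullmatch(value) is not None


class IdentityIndex:
    """Keys observations by GAIA ID so an email change does not split a record."""

    def __init__(self) -> None:
        self._emails: Dict[str, List[str]] = defaultdict(list)

    def observe(self, gaia_id: str, email: str) -> None:
        if not is_plausible_gaia_id(gaia_id):
            raise ValueError(f"not a 21-digit GAIA ID: {gaia_id!r}")
        email = email.lower()
        if email not in self._emails[gaia_id]:
            self._emails[gaia_id].append(email)

    def emails_for(self, gaia_id: str) -> List[str]:
        return list(self._emails.get(gaia_id, []))


# Made-up ID for illustration: "1", nineteen zeros, "1" (21 digits in total).
EXAMPLE_ID = "1" + "0" * 19 + "1"

index = IdentityIndex()
index.observe(EXAMPLE_ID, "old.name@example.com")
index.observe(EXAMPLE_ID, "new.name@example.com")
print(index.emails_for(EXAMPLE_ID))
```

Both addresses land in the same record, which is the point of keying on the identifier rather than on the email string.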

Module B: The Geolocation Trilateration (ghunt maps)

This module aggregates “Reviews” and “Photos” posted by the target.

  • Timestamp Analysis: By correlating the timestamps of reviews with the geocoordinates of the businesses reviewed, Ghunt builds a velocity map.
    • Scenario: A target reviews a coffee shop in San Francisco at 09:00 and a restaurant in Palo Alto at 12:00. This places them in the Bay Area that morning and shows they covered roughly 45 km in three hours, which is consistent with travel by car or train.
  • Confidence Scoring: Ghunt V2 calculates a confidence score for the user’s “Home Base” by clustering review density.
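
The arithmetic behind a "velocity map" is a great-circle distance divided by elapsed time. The sketch below works through the San Francisco to Palo Alto scenario using the standard haversine formula. The coordinates are approximate city centres chosen for illustration, the function names are hypothetical, and nothing here reflects how Ghunt itself computes or scores locations.

```python
import math
from datetime import datetime
from typing import Tuple

EARTH_RADIUS_KM = 6371.0

# (latitude, longitude, timestamp) for one observed review
Sighting = Tuple[float, float, datetime]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implied_speed_kmh(first: Sighting, second: Sighting) -> float:
    """Average speed needed to get from the first sighting to the second."""
    hours = (second[2] - first[2]).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("second sighting must be later than the first")
    return haversine_km(first[0], first[1], second[0], second[1]) / hours


# Approximate city-centre coordinates; the date is arbitrary.
san_francisco = (37.7749, -122.4194, datetime(2026, 1, 5, 9, 0))
palo_alto = (37.4419, -122.1430, datetime(2026, 1, 5, 12, 0))

distance = haversine_km(*san_francisco[:2], *palo_alto[:2])
speed = implied_speed_kmh(san_francisco, palo_alto)
print(f"{distance:.1f} km apart, implied average speed {speed:.1f} km/h")
```

An implied speed far above what any vehicle could manage is a signal that the timestamps reflect when a review was posted rather than when the visit happened, so the figure is better treated as a plausibility check than as a track.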

Module C: Device Telemetry & The Play API (ghunt play)

Perhaps the most dangerous module for hardware exploitation. By querying the Google Play library, Ghunt lists the devices linked to the account.

  • Data Extracted: Manufacturer (Samsung, Pixel, Xiaomi), Model Number (SM-S918B), and “Last Seen” timestamp.
  • Weaponization: This is not just trivia. It is the prerequisite for hardware-specific Remote Code Execution (RCE).

Part III: Advanced Engineering – Automating the Hunt

Manual CLI usage does not scale to enterprise reconnaissance. Below is a Python implementation that wraps Ghunt in an asynchronous class, suitable for integration into a larger pipeline.

The GhuntAutomator Class

```python
import asyncio
import json
import logging
from typing import Dict, Optional

# NOTE: the module paths and names below are kept as given in this article.
# Check them against the GHunt version you have installed before running.
from ghunt.api import GHuntAPI
from ghunt.objects import Target
from ghunt.utils import get_httpx_client

# Configure structured logging for the pipeline
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")


class GhuntAutomator:
    def __init__(self):
        self.client = get_httpx_client()
        self.api = GHuntAPI(self.client)
        self.is_authenticated = False

    async def authenticate(self) -> bool:
        """
        Initializes the session using stored credentials.
        Assumes cookies are already serialized in the default path.
        """
        try:
            self.is_authenticated = await self.api.login()
            if self.is_authenticated:
                logging.info("[+] Successfully authenticated to Google Internal APIs.")
                return True
            logging.error("[-] Authentication failed. Check cookie validity.")
            return False
        except Exception as e:
            logging.critical(f"[-] Fatal Auth Error: {e}")
            return False

    async def scan_target(self, email: str) -> Optional[Dict]:
        if not self.is_authenticated:
            logging.warning("[-] Cannot scan: Not authenticated.")
            return None

        logging.info(f"[*] Initiating Deep Scan for: {email}")

        try:
            target = Target(self.api, email)
            found = await target.hunt()

            if not found:
                logging.info(f"[-] Target {email} not found or hidden.")
                return None

            # 1. Base Intelligence
            intel_package = {
                "gaia_id": target.person.gaia_id,
                "name": target.person.name,
                "profile_pic": target.person.profile_pic,
                "last_active": target.person.last_active_service,
                "devices": [],
                "locations": [],
            }

            # 2. Extract Device Telemetry (Critical for CVE mapping)
            if target.person.devices:
                for device in target.person.devices:
                    intel_package["devices"].append({
                        "model": device.model,
                        "last_seen": str(device.last_seen),
                    })

            # 3. Extract Geolocation Data from Maps
            if target.person.maps_contribs:
                for review in target.person.maps_contribs[:5]:  # Top 5 recent reviews
                    intel_package["locations"].append({
                        "venue": review.venue_name,
                        "rating": review.rating,
                        "timestamp": str(review.date),
                    })

            return intel_package

        except Exception as e:
            logging.error(f"[-] Error during scan logic: {e}")
            return None

    async def cleanup(self):
        # Close the shared HTTP client so connections are released
        await self.client.aclose()


# Execution Harness
async def main():
    automator = GhuntAutomator()
    try:
        if await automator.authenticate():
            result = await automator.scan_target("target@example.com")  # placeholder address
            if result:
                print(json.dumps(result, indent=4))
    finally:
        # Always release the client, even if authentication or the scan fails
        await automator.cleanup()


if __name__ == "__main__":
    asyncio.run(main())
```

This code is structured for pipeline use: it includes error handling, JSON-serializable output, and a modular design. It can be containerized and triggered by a message queue (like Kafka or RabbitMQ) in a larger distributed system.

Part IV: The Kill Chain – Intersecting OSINT with CVEs

The true value of Ghunt appears when its output is overlaid with the CVE (Common Vulnerabilities and Exposures) landscape of 2026.

1. The Hardware Vector: CVE-2026-21858 (CoreAudio RCE)

Context: CVE-2026-21858 is a critical remote code execution vulnerability in the media processing engine of Android 15 and earlier. It allows an attacker to execute arbitrary code by sending a malformed audio file via MMS or WhatsApp.

  • The Problem: The exploit is hardware-dependent. Sending the wrong payload to the wrong architecture (e.g., sending an Exynos payload to a Snapdragon device) causes a crash, alerting the user.
  • The Ghunt Solution: By running ghunt play, the attacker retrieves the exact model (e.g., “Pixel 7 Pro”). They cross-reference this with the CVE-2026-21858 compatibility matrix.
  • The Result: The attacker sends only a payload matched to the target's hardware, avoiding the crash that would alert the user. Ghunt provides the targeting data for the exploit.

2. The Identity Vector: CVE-2025-68613 (OAuth Token Leak)

Context: CVE-2025-68613 involves a flaw in how certain third-party integrations handle Google OAuth refresh tokens.

  • The Problem: Attackers need to know which integrations a user has authorized. Spraying blind exploits is noisy.
  • The Ghunt Solution: Ghunt’s “SpiderDAL” (Digital Asset Links) and “Photos” modules can often infer third-party connections. If Ghunt shows the user is posting photos from a specific third-party app, the attacker knows a valid OAuth token exists for that integration.
  • The Result: The attacker initiates the CVE-2025-68613 flow specifically against the app identified by Ghunt, stealing the persistent refresh_token and achieving permanent account takeover (ATO).

Part V: The Future – From Scripts to Agentic AI (Penligent.ai)

While the Python script above is powerful, it represents the “Manual Era” of security. It requires a human to interpret the JSON, look up the CVE, and craft the payload. In 2026, speed is the deciding factor. This is where Agentic AI takes over.

Penligent.ai, the world’s first Agentic AI Hacker, absorbs the logic of tools like Ghunt into a cognitive architecture. It does not just “run” Ghunt; it “understands” the output.

The Cognitive Loop of a Penligent Agent:

  1. Observation (The Sensor): The Agent autonomously deploys a Ghunt-like sensor against a list of corporate emails.
  2. Orientation (The Context):
    • Raw Data: “User A uses a Samsung S24.”
    • Knowledge Retrieval: The Agent queries its internal vector database of vulnerabilities. It matches “Samsung S24” + “Android 15” to CVE-2026-21858.
  3. Decision (The Strategy): The Agent calculates the path of least resistance.
    • Option A: Phishing email? (Success rate 12%).
    • Option B: Mobile RCE via CVE-2026-21858? (Success rate 95%).
    • Selection: The Agent selects Option B.
  4. Action (The Validation): The Agent generates a Safe Proof-of-Concept (PoC) artifact. It does not exploit the device maliciously; it validates the vulnerability exists (e.g., by checking if the device accepts the malformed header) and logs the finding.

This is the difference between a tool and a teammate. Penligent automates the reasoning, not just the execution. For a security team, this means finding the critical path to compromise in minutes, not days.

Part VI: Defensive Countermeasures & OpSec

If you are Red Teaming, you must protect your infrastructure. If you are Blue Teaming, you must blind the attacker.

For the Operator (OpSec)

  • Dockerization is Mandatory: Never run Ghunt on your host OS. Google leaves “breadcrumbs” in the file system. Use the official Docker image: docker run -v $(pwd)/resources:/usr/src/app/resources mxrch/ghunt login
  • Residential Proxies: Datacenter IPs (AWS, DigitalOcean) are flagged instantly by Google. You must tunnel Ghunt traffic through high-reputation Residential Proxies to mimic normal user traffic.
  • Account Aging: Do not use a brand-new Gmail account for reconnaissance. Google trusts accounts with history. Use “seasoned” burner accounts.

For the Defender (Mitigation)

  • The “Block” List: You cannot block Ghunt directly, as it looks like legitimate traffic. You must reduce the data surface.
  • Enterprise Policy:
    1. Disable “Web & App Activity”: Enforce this via Google Workspace Admin policies for all corporate users. This stops the collection of Map reviews and location history.
    2. Shadow IT Auditing: Regularly run Ghunt against your own executives. If you find their personal devices listed, they are mixing personal and corporate identities, which gives an attacker a path from a personal account to corporate access.
    3. Endpoint Isolation: Ensure that mobile devices accessing corporate data are managed (MDM). If a device is identified as vulnerable to CVE-2026-21858, the MDM should automatically revoke its access tokens until patched.
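
The endpoint isolation rule can be expressed as a small policy check over a device inventory. The sketch below is a minimal illustration under stated assumptions: `ManagedDevice`, `devices_to_quarantine`, the inventory rows, and the patch-level cutoff are all hypothetical, and a real deployment would read this data from the MDM vendor's own API and trigger token revocation there.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class ManagedDevice:
    owner: str
    model: str
    patch_level: date  # security patch level reported by the device
    managed: bool      # True if the device is enrolled in MDM


def devices_to_quarantine(
    inventory: List[ManagedDevice], minimum_patch_level: date
) -> List[ManagedDevice]:
    """Return devices that should lose access: unmanaged, or patched before the cutoff."""
    return [
        device
        for device in inventory
        if not device.managed or device.patch_level < minimum_patch_level
    ]


# Hypothetical inventory and cutoff, for illustration only.
INVENTORY = [
    ManagedDevice("alice", "Pixel 7 Pro", date(2026, 2, 5), managed=True),
    ManagedDevice("bob", "SM-S918B", date(2025, 11, 1), managed=True),
    ManagedDevice("carol", "Pixel 7 Pro", date(2026, 2, 5), managed=False),
]
CUTOFF = date(2026, 1, 1)

for device in devices_to_quarantine(INVENTORY, CUTOFF):
    print(f"revoke tokens: {device.owner} ({device.model})")
```

Treating unmanaged devices the same as unpatched ones is a deliberate choice here: a device outside MDM cannot report a trustworthy patch level, so it fails closed.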

Conclusion

Ghunt V2 is a testament to the fact that in a hyper-connected world, we are all leaking data. For the security engineer, it provides the granularity needed to simulate sophisticated threat actors.

However, the manual correlation of OSINT data with complex vulnerability chains like CVE-2026-21858 and CVE-2025-68613 is becoming unsustainable for human teams. The future lies in platforms like Penligent.ai, which fuse the precision of tools like Ghunt with the reasoning power of Agentic AI, allowing us to identify and close these exposure windows at machine speed.
