Vulnerability Report: model load vulnerable to arbitrary code execution via untrusted file deserialization

Open ybdesire opened this issue 7 months ago • 1 comments

Description

The code snippets below, from fastchat/model/apply_delta.py, are vulnerable to CWE-502: Deserialization of Untrusted Data. The vulnerability exists because torch.load is called without weights_only=True, which leaves full pickle deserialization enabled on PyTorch versions where that parameter defaults to False (releases before 2.6).

(1) https://github.com/lm-sys/FastChat/blob/f475817683152f40eed36e72ae0bd9f976816fc4/fastchat/model/apply_delta.py#L37

(2) https://github.com/lm-sys/FastChat/blob/f475817683152f40eed36e72ae0bd9f976816fc4/fastchat/model/apply_delta.py#L90

(3) https://github.com/lm-sys/FastChat/blob/f475817683152f40eed36e72ae0bd9f976816fc4/fastchat/model/apply_delta.py#L97

(4) https://github.com/lm-sys/FastChat/blob/f475817683152f40eed36e72ae0bd9f976816fc4/fastchat/model/apply_delta.py#L102

PyTorch's torch.load function loads serialized tensors or models. Unless weights_only=True is in effect, it relies on Python's pickle module, so loading untrusted data such as a malicious pickle file can execute arbitrary code during deserialization. In the linked code, torch.load is called several times, in the split_files and apply_delta_low_cpu_mem functions. In split_files, state_dict = torch.load(file_path) is called without the weights_only=True parameter. Similarly, in apply_delta_low_cpu_mem, state_dict = torch.load(base_file) and delta_state_dict = torch.load(delta_file) are called without it.
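
The underlying mechanism can be reproduced with the standard library alone. The sketch below is a benign illustration and not code from FastChat: the class name Payload is hypothetical, and the payload only evaluates an arithmetic expression where a real attacker would run a shell command.

```python
import pickle


class Payload:
    """Hypothetical stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair in the file and
        # calls callable(*args) when the file is loaded.
        return (eval, ("6 * 7",))


# What the attacker writes to disk in place of a real checkpoint.
malicious_bytes = pickle.dumps(Payload())

# What the victim does: a plain load. No Payload object comes back;
# the loader has already called eval() on the attacker's string.
result = pickle.loads(malicious_bytes)
print(result)  # prints 42
```

The victim never calls eval directly. The call is chosen by whoever wrote the file, which is why the file's origin matters more than the code that loads it.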

This lack of the weights_only=True parameter means that if an attacker manages to manipulate the data being loaded (the file_path, base_file, or delta_file), they could potentially inject malicious code. This code would then be executed when the torch.load function deserializes the data, leading to serious security issues like unauthorized access, data leakage, or system compromise. To mitigate this vulnerability, the weights_only=True parameter should be added to all torch.load calls in the relevant code sections.
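
To show what a restricted load buys, here is a minimal standard-library sketch of an allowlisting unpickler, following the pattern in the Python pickle documentation. It is an analogy for the idea behind weights_only=True, not PyTorch's implementation: the names AllowlistUnpickler, restricted_loads, SAFE_BUILTINS, and Payload are hypothetical, and the allowlist is an assumption chosen for this example.

```python
import builtins
import io
import pickle

# Assumption: a tiny allowlist chosen for this sketch only.
SAFE_BUILTINS = {"dict", "list", "tuple", "set", "frozenset"}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not on the allowlist."""

    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize data while rejecting non-allowlisted globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class Payload:
    """Hypothetical malicious object: asks the loader to call eval()."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


# Plain containers and numbers need no globals, so they load normally.
plain = restricted_loads(pickle.dumps({"layer.weight": [0.1, 0.2]}))

# The payload needs builtins.eval, which is not allowlisted.
try:
    restricted_loads(pickle.dumps(Payload()))
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print("rejected:", exc)
```

In FastChat itself the requested change is smaller: pass the parameter at each call site, for example torch.load(file_path, weights_only=True).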

Exploit Steps

Prerequisites

- The attacker has access to the system where the vulnerable code from the linked files is running, either over the network if the application is publicly accessible or through a compromised internal system.
- The attacker can manipulate the input files (e.g., file_path, base_file, or delta_file) that the vulnerable torch.load calls read.

Steps

Prepare Malicious Pickle Data
- The attacker first creates a malicious pickle file. When deserialized, this file executes arbitrary commands. For example, it could open a reverse shell to the attacker's machine or exfiltrate sensitive data.
Manipulate the Input
- The attacker then replaces a legitimate input file (e.g., the one passed as file_path, base_file, or delta_file) with the malicious pickle file. If the application fetches the file from a specific location or through a specific mechanism, the attacker must make sure that torch.load ends up reading the malicious file.
Trigger the Vulnerability
- Once the input is in place, the attacker waits for the application to call torch.load. Normal usage can trigger this, such as the application loading a model or other serialized data.
- When a vulnerable call (e.g., state_dict = torch.load(file_path) without the weights_only=True parameter) reads the malicious pickle file, deserialization executes the arbitrary code the file defines.
Exploit the System
- After the code has run, the attacker can use the compromised system to steal sensitive data, install malware, or gain unauthorized access to other parts of the system. For example, if the malicious code opens a reverse shell, the attacker can use that shell to interact with the system and carry out further attacks.

Apr 07 '25 14:04 ybdesire
