VLMEvalKit
[Bug] VRAM is not released when using multiple models
Hi, thanks for your work on building this evaluation kit. I recently used it to reproduce Qwen2.5-VL-3B-Instruct and Qwen2.5-VL-7B-Instruct. I constructed a config with these two models and one dataset, shown below. However, the VRAM allocated by the Qwen2.5-VL-3B-Instruct model does not seem to be released after its evaluation finishes: as the nvidia-smi output shows, the memory usage is roughly the sum of the 3B and 7B models.
config:
{
  "model": {
    "Qwen2.5-VL-3B-Instruct-edge": {
      "class": "Qwen2VLChat",
      "model_path": "Qwen/Qwen2.5-VL-3B-Instruct",
      "min_pixels": 3136,
      "max_pixels": 802816,
      "use_custom_prompt": false
    },
    "Qwen2.5-VL-7B-Instruct-edge": {
      "class": "Qwen2VLChat",
      "model_path": "Qwen/Qwen2.5-VL-7B-Instruct",
      "min_pixels": 3136,
      "max_pixels": 802816,
      "use_custom_prompt": false
    }
  },
  "data": {
    "MMMU_DEV_VAL": {
      "class": "MMMUDataset",
      "dataset": "MMMU_DEV_VAL"
    }
  }
}
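For reference, I ran this config through run.py; the invocation below is a sketch of how that would look, assuming the config is saved as config.json and passed via the --config option (the file name and exact command line are my illustration, not copied from the report):

# Run the multi-model config in a single process (this is the setup that leaks VRAM).
python run.py --config config.json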
nvidia-smi:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.05 Driver Version: 560.35.05 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4090 D Off | 00000000:01:00.0 Off | Off |
| 37% 62C P2 286W / 425W | 23388MiB / 24564MiB | 96% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3805914 C python 23378MiB |
+-----------------------------------------------------------------------------------------+
Hi @hebangwen, thanks for pointing this out; we will look into how to fix the problem. For now, a simple workaround is to write a for loop in bash and evaluate a single model at a time, as in the sketch below.
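A rough sketch of that workaround, assuming the config above is split into two single-model config files (the names config_3b.json and config_7b.json are hypothetical placeholders):

#!/bin/bash
# Evaluate each model in its own process so the GPU memory is fully
# released when that process exits, before the next model is loaded.
for cfg in config_3b.json config_7b.json; do
    python run.py --config "$cfg"
done

Because each iteration launches a fresh Python process, the 3B model's VRAM is returned to the GPU before the 7B evaluation starts, avoiding the accumulation shown in the nvidia-smi output above.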