
Memory leak while using tensor_parallel_size>1

Open haiasd opened this issue 2 years ago • 12 comments

[screenshot attached]

haiasd avatar Aug 08 '23 02:08 haiasd

Can you provide more details on what model are you using, and how many GPUs are you using? Any more details can be helpful. Thank you!

zhuohan123 avatar Aug 15 '23 20:08 zhuohan123

I'm running StarCoder on 2 × A10. The command is as follows: `python -m vllm.entrypoints.api_server --model /model/starchat/starcoder-codewovb-wlmhead-mg2hf41 --tensor-parallel-size 2 --gpu-memory-utilization 0.90 --host 0.0.0.0 --port 8081 --max-num-batched-tokens 5120`

haiasd avatar Aug 16 '23 05:08 haiasd
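Not part of the original report, but one way to tell a genuine leak apart from vLLM's normal up-front KV-cache allocation (which `--gpu-memory-utilization 0.90` makes look like high usage from the start) is to sample GPU memory periodically while the server handles traffic: steady-state usage should plateau after warmup, while a leak keeps growing. A minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names are hypothetical:

```python
import subprocess


def parse_gpu_mem(smi_output: str) -> list[int]:
    """Parse `memory.used` values (MiB), one per GPU, from the output of
    `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def sample_gpu_mem() -> list[int]:
    """Take one memory sample per visible GPU (requires nvidia-smi)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_gpu_mem(out)
```

Logging `sample_gpu_mem()` once a minute during a long benchmark run, and comparing samples well after warmup, makes the growth (or plateau) easy to see.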

Same issue when loading Llama-2-70B on 4 GPUs.

wonderseen avatar Nov 09 '23 13:11 wonderseen

Same issue when loading Llama-2-70B on 2 GPUs.

ChristineSeven avatar Jan 18 '24 12:01 ChristineSeven

Same issue with Mixtral-8x7B-Instruct-v0.1 (non-quantized).

wangcho2k avatar Feb 01 '24 02:02 wangcho2k

We hit the same issue with Mistral-7B: TP = 4, GPU = 4 × A10, vllm = 0.2.7.

[screenshot: memory usage]

PeterWang1986 avatar Feb 02 '24 12:02 PeterWang1986

Also hitting the memory leak, even with TP = 1: GPU = 1 × A30, vllm = 0.3.3.

austingg avatar May 24 '24 03:05 austingg

Also hitting the memory leak: TP = 2, GPU = 2 × V100, vllm = 0.4.2.

yarinlaniado avatar Jun 06 '24 11:06 yarinlaniado

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] avatar Oct 31 '24 02:10 github-actions[bot]

TP = 2, GPU = 2 × V100, Llama-3.1-8B-Instruct, vllm version = 0.6.2. When I shut down the server, it prints a warning about leaked memory.

ekmekovski avatar Nov 11 '24 13:11 ekmekovski
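For context (not from the thread itself): if the shutdown warning mentioned above is Python's resource-tracker message about "leaked shared_memory objects", it usually means a `multiprocessing.shared_memory.SharedMemory` segment was closed but never unlinked before the worker processes exited. A minimal stdlib sketch of the clean pattern, unrelated to vLLM internals:

```python
from multiprocessing import shared_memory


def roundtrip(payload: bytes) -> bytes:
    """Write bytes through a shared-memory segment and clean up fully.

    Skipping unlink() is the classic cause of the resource tracker's
    'leaked shared_memory objects' warning at interpreter shutdown.
    """
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        shm.buf[: len(payload)] = payload
        return bytes(shm.buf[: len(payload)])
    finally:
        shm.close()   # drop this process's mapping
        shm.unlink()  # remove the OS-level object; skipping this leaks it
```

Whether this is the exact mechanism behind vLLM's warning depends on the version and how the tensor-parallel workers shut down, so treat this only as an illustration of where such warnings come from.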

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] avatar Feb 12 '25 01:02 github-actions[bot]

This issue still exists


yarinlaniado avatar Feb 14 '25 16:02 yarinlaniado

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions[bot] avatar May 16 '25 02:05 github-actions[bot]