Seungduk Kim

8 comments by Seungduk Kim

I verified that adding this to the DeepSpeed config file resolves the issue:

```
"gradient_clipping": "auto"
```
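A minimal sketch of applying that fix programmatically. The config keys other than `gradient_clipping` are illustrative placeholders, not taken from the actual config in question:

```python
import json

# Hypothetical DeepSpeed config; only "gradient_clipping" is the fix above.
config = {
    "train_batch_size": "auto",
    "bf16": {"enabled": "auto"},
}

# Add the setting that resolved the issue.
config["gradient_clipping"] = "auto"

print(json.dumps(config, indent=2))
```

In practice you would load the existing JSON file, set the key, and write it back rather than building the dict inline.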

Could it be a quantized model? Looking at the config file, there is a `quantization_config` section.

```
"quantization_config": {
  "_load_in_4bit": true,
  "_load_in_8bit": false,
  "bnb_4bit_compute_dtype": "bfloat16",
  "bnb_4bit_quant_type": "nf4",
  "bnb_4bit_use_double_quant": true,
  "llm_int8_enable_fp32_cpu_offload":...
```
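A small sketch of the check being described: inspect a model's `config.json` for a `quantization_config` section. The embedded JSON below is an abbreviated, illustrative example, not the full config from the issue:

```python
import json

# Abbreviated, illustrative config.json content.
raw = """
{
  "model_type": "llama",
  "quantization_config": {
    "_load_in_4bit": true,
    "_load_in_8bit": false,
    "bnb_4bit_compute_dtype": "bfloat16",
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": true
  }
}
"""

config = json.loads(raw)

# Presence of this section indicates a bitsandbytes-quantized checkpoint.
qc = config.get("quantization_config")
is_quantized = qc is not None
four_bit = bool(qc and qc.get("_load_in_4bit"))
print(is_quantized, four_bit)
```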

Hi @nartmada, I cannot access the machine anymore since it was a short-term PoC. I may get a chance to access the MI300X machine again this month. When the access is...

Hi @nartmada, Thanks for checking in. Yes, I did, and this time I did not have any issues with the bitsandbytes module. However, this time I ran into another issue with...

I tried to save the tensors when it happened, but it was challenging and I failed. One thing I can tell is that it happens when I modify the gradients...

> I followed the file change to modify the file, but when I use the code below: `python -m vllm.entrypoints.openai.api_server --model /root/DeepSeek-V2-Chat --trust-remote-code` an error occurs: >...

I made a fix with recent changes on vLLM: https://github.com/seungduk-yanolja/vllm-deepseek

Assuming you have an 8xH100 machine, to run:

```
python -m vllm.entrypoints.openai.api_server --model deepseek-ai/DeepSeek-V2-Chat -tp 8 --served-model-name deepseek --trust-remote-code...
```
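Once the server is up, it exposes an OpenAI-compatible API. A hedged sketch of building a chat request for it; the model name follows `--served-model-name` above, and the `localhost:8000` URL in the usage note assumes vLLM's default port:

```python
import json

# Request body for the OpenAI-compatible /v1/chat/completions endpoint.
payload = {
    "model": "deepseek",  # must match --served-model-name
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 64,
}
body = json.dumps(payload)
print(body)
```

You could then POST it with, e.g., `curl -X POST http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d @body.json`.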

> > I made a fix with recent changes on vLLM: https://github.com/seungduk-yanolja/vllm-deepseek
> >
> > Assuming you have an 8xH100 machine, to run:
> >
> > `python -m vllm.entrypoints.openai.api_server...`