Guangyi Zhang comments

Results: 10 comments by Guangyi Zhang

[BUG] RuntimeError: Tensors must be contiguous error while finetuning with deepspeed.

Facing the same issue. ``` Traceback (most recent call last): File "/home/nlp/zgy/VLM/src/train/train_mem.py", line 12, in train() File "/home/nlp/zgy/VLM/src/train/train.py", line 395, in train trainer.train() File "/home/nlp/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/trainer.py", line 1537, in train return...

AttributeError: 'DeepSpeedZeRoOffload' object has no attribute 'backward'

Hi, I get the same issue too. My DeepSpeed config: ```yaml { "bf16": { "enabled": "true" }, "zero_optimization": { "stage": 3, "offload_optimizer": { "device": "cpu", "pin_memory": true }, "offload_param":...

AttributeError: 'DeepSpeedZeRoOffload' object has no attribute 'backward'

@tjruwase Thank you for your reply. I added the optimizer and scheduler to my ds_config. I use accelerate for training, and now I get a new error: https://github.com/huggingface/transformers/issues/26148 This seems to be...

AttributeError: module 'whisper' has no attribute 'load_audio'

Hello, I ran into the same problem too. My setup is as follows: OS: macOS; Python: 3.10; autocut installed with the following command: `python setup.py install`. Is there a solution?

AttributeError: module 'whisper' has no attribute 'load_audio'

> Thanks for the reply. I tried pip install, and the problem was solved.

[Bug]: After updating to vllm 0.5.3.post1, "Exception in worker VllmWorkerProcess while processing method init_device: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method."

I encountered the same problem ```shell RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method ``` when using vllm version...
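For readers unfamiliar with the 'spawn' start method named in the error message, here is a minimal sketch using only Python's standard `multiprocessing` module. It is not vLLM's code and does not touch CUDA; the helper names `make_spawn_context`, `worker`, and `run_in_spawned_child` are hypothetical and exist only for illustration. The idea is that a spawned child starts from a fresh interpreter instead of inheriting the parent's state through fork, which is what the error message asks for.

```python
import multiprocessing as mp


def make_spawn_context():
    # Hypothetical helper: a context whose processes are started with
    # "spawn" (a fresh interpreter) instead of "fork".
    return mp.get_context("spawn")


def worker(queue):
    # In a real job, any CUDA initialization would happen here, inside
    # the child. This sketch only reports which start method is active.
    queue.put(mp.get_start_method())


def run_in_spawned_child(timeout=60):
    # Hypothetical helper: start one child via the spawn context and
    # return whatever it sends back.
    ctx = make_spawn_context()
    queue = ctx.Queue()
    proc = ctx.Process(target=worker, args=(queue,))
    proc.start()
    result = queue.get(timeout=timeout)  # raises queue.Empty on timeout
    proc.join()
    return result


if __name__ == "__main__":
    # The guard matters with "spawn": the child re-imports this module,
    # and unguarded code would try to start processes again.
    print(run_in_spawned_child())
```

Using a context object rather than calling `mp.set_start_method` keeps the choice local to this code, so it does not change the start method for other libraries in the same process.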

[Bug]: After updating to vllm 0.5.3.post1, "Exception in worker VllmWorkerProcess while processing method init_device: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method."

@youkaichao here is my environment information: ``` PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: CentOS Stream 8...

[Bug]: After updating to vllm 0.5.3.post1, "Exception in worker VllmWorkerProcess while processing method init_device: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method."

@youkaichao I followed the instructions [here](https://github.com/OpenBMB/MiniCPM-V/issues/448#issuecomment-2283033953) to uninstall `flash-attn`, and the code ran successfully. After trying #7411, no errors occurred either. Thank you!

I keep getting 403 Forbidden even when using a URL/Token that was just generated

same

[Question]: Why does ragflow need to use a model named "qwen2-instruct-1-0" when the Chunk Method is "Knowledge Graph"?

I ran into the same problem too. I deployed `bge-m3` and `glm4-chat` with `xinference` ![image](https://github.com/user-attachments/assets/81dc5767-f360-4666-b3c2-417b1d8e9859) but in `ragflow` the model names have all changed, with a "-1-0" suffix appended: ![image](https://github.com/user-attachments/assets/b2478f97-6507-43bf-9c3d-c00dc16fa22e) ![image](https://github.com/user-attachments/assets/f078948c-29bf-4fcc-9d0c-6014e81706a6)