Wang Binluo

Results: 11 comments by Wang Binluo

> @wangbluo Could you please help me solve this issue? Thanks

Hi, could you please tell us the model size you are using?

@lyzKF Hello, I have the same error as you. When I tried to run `colossalai run --nproc_per_node 8 --host 10.90.5.14,10.90.8.153 --master_addr 10.90.5.14 auto_parallel_with_gpt.py`, I got the error: Error: failed...
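
For multi-node launches like this, `colossalai run` connects to every host listed in `--host` over SSH, so a common cause of launch failures is missing passwordless SSH between the nodes. A minimal sketch of how one might verify that, using the hostnames from the command above (the key path is just the OpenSSH default):

```bash
# On the master node (10.90.5.14): generate a key pair if one does not exist yet.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Copy the public key to every host listed in --host, including the master itself.
ssh-copy-id 10.90.5.14
ssh-copy-id 10.90.8.153

# Each of these should print the remote hostname without prompting for a password.
ssh 10.90.5.14 hostname
ssh 10.90.8.153 hostname
```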

> Hi! Thanks for this work. I am not a part of hpcaitech, but just wondering: are you planning to keep bumping the version up? Since HF transformers version is...

Hi, Colossal-LLaMA is not intended for Qwen models, as they use different prompt formats. You can use ColossalChat for SFT, RM, and PPO, but not for PT (pre-training). If your GPU resources are limited, we recommend you...
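
For reference, the ColossalChat training scripts are launched through the same `colossalai run` launcher. A minimal sketch, assuming a recent checkout of the repository; the directory layout and script names below are illustrative and should be checked against `applications/ColossalChat/examples` in your version:

```bash
# Hypothetical layout; verify the exact script paths and arguments in your checkout.
cd ColossalAI/applications/ColossalChat/examples
ls training_scripts/                                   # SFT / RM / PPO training scripts
colossalai run --nproc_per_node 8 training_scripts/train_sft.py --help
```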

Hi, of course it's possible, as FasterMoE also uses Expert Parallelism. ![image](https://github.com/user-attachments/assets/38b9a6a1-f6b9-4456-81f9-37640853d7ae) Thank you for your interest in ColossalAI. First, you can fork the ColossalAI repository, then you can...
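
The usual GitHub contribution flow referred to here (fork, clone, branch, then open a pull request) looks roughly like this; `<your-username>` and the branch name are placeholders:

```bash
# Fork hpcaitech/ColossalAI on GitHub first, then clone your fork.
git clone https://github.com/<your-username>/ColossalAI.git
cd ColossalAI

# Keep a reference to the upstream repository so you can sync later.
git remote add upstream https://github.com/hpcaitech/ColossalAI.git

# Work on a feature branch and push it to your fork.
git checkout -b feature/my-change
git add -A && git commit -m "Describe your change"
git push -u origin feature/my-change
# Finally, open a pull request from your branch against hpcaitech/ColossalAI on GitHub.
```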

The provided log information is limited, so you should troubleshoot your issue from the following angles. First, check that there are no issues with the machine itself, for example by running nvidia-smi...
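
A sketch of the kind of node-level checks this points to, assuming a typical multi-GPU Linux node with PyTorch and NCCL:

```bash
# 1. Confirm the GPUs are visible and healthy on every node.
nvidia-smi

# 2. Confirm PyTorch can see the GPUs and report its version.
python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())"

# 3. Enable verbose NCCL logging before re-running the job to surface communication errors.
export NCCL_DEBUG=INFO
```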

Oh, I see. It seems it keeps compiling the JIT kernel ops; that compilation can take quite a while, and it looks like it has not finished yet.
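
If the JIT compilation is too slow, ColossalAI can also build its CUDA kernels ahead of time at install time instead of on first use. The exact environment variable has changed across releases (older versions documented `CUDA_EXT=1`, newer ones `BUILD_EXT=1`), so treat this as a sketch and check the installation docs for your version:

```bash
# Pre-build the CUDA/fused kernels during installation so they are not JIT-compiled at runtime.
# The variable name depends on the ColossalAI version (older releases used CUDA_EXT=1).
BUILD_EXT=1 pip install colossalai
```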

Hi, could you please provide the full error stack trace or a screenshot of the error, along with your script and environment details?
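
When collecting that information, a few standard commands cover most of it; `python -m torch.utils.collect_env` is stock PyTorch tooling, and `colossalai check -i` is the ColossalAI CLI's installation check (assuming your installed version exposes it):

```bash
# Environment report from PyTorch (OS, CUDA, driver, and library versions).
python -m torch.utils.collect_env

# Installed versions of the most relevant packages.
pip list | grep -Ei "colossalai|torch|transformers"

# ColossalAI's own installation/compatibility check (availability may vary by version).
colossalai check -i
```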

Please check your environment. The error message `/bin/bash: line 1: export: '=/usr/bin/supervisord': not a valid identifier` indicates that there was an attempt to set an environment variable without specifying a...
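
In other words, bash's `export` needs a variable name on the left-hand side of the `=`. The snippet below contrasts the failing form with a corrected one (the variable name `SUPERVISORD_BIN` is only an illustration):

```bash
# Fails: there is no variable name before '=', so bash reports "not a valid identifier".
export =/usr/bin/supervisord

# Works: give the value a name (the name here is just an example).
export SUPERVISORD_BIN=/usr/bin/supervisord
```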

Hi, the HybridParallelPlugin already supports the LoRA strategy. We would appreciate your feedback.