Tian Lan

Results: 10 issues by Tian Lan

Hi, I am looking at the PPO implementation, and I am curious about this part (actually many other implementations are using this workflow as well, so I am also curious...
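(For context, a minimal sketch of the clipped surrogate objective that PPO implementations of this kind typically compute; the function name, tensor shapes, and the `clip_eps` default below are illustrative and not taken from the codebase in question.)

```python
import torch

def ppo_clipped_loss(logprobs, old_logprobs, advantages, clip_eps=0.2):
    # Ratio between the current policy and the policy that generated the rollout.
    ratio = torch.exp(logprobs - old_logprobs)
    # Clipped surrogate objective; return the negative so it can be minimized.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```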

Hi, I am very interested in the distributed inference of Colossal AI. Since we have pre-trained NLP models from PyTorch or JAX, I wonder whether it is possible, or what should be...

My questions are mostly about stage 3. According to the doc https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/training_scripts/README.md, it says: ``` If you don't have step 1 and step 2 models. You may simply...

Hi, I am DPO-training a checkpoint of Mixtral-8x7B-Instruct from a previous supervised finetune. I mainly followed this script https://github.com/huggingface/trl/blob/main/examples/research_projects/stack_llama_2/scripts/dpo_llama2.py with 8 H100 GPUs, flash attention, and DeepSpeed ZeRO-2,...
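(For reference, a minimal sketch of the DPO objective that the linked `dpo_llama2.py` script optimizes through TRL's `DPOTrainer`; the function and argument names are illustrative, and `beta=0.1` is an assumed default rather than the value used in the run above.)

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Log-ratios of the trained policy vs. the frozen reference (SFT) model.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # DPO pushes the margin between chosen and rejected log-ratios to be positive.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()
```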

I ran into a few errors when running `train_with_warp_drive`. One is that `from scripts.run_unittests import import_class_from_path` does not seem right: it complains that `scripts` cannot be found as a module, so I removed `scripts.`,...

Hello, I looked at your bA3C code and learned a lot. Thank you so much for sharing your codebase implementing this idea. I have two questions, mainly about your code....

It looks like FSDP is a pretty awesome module for distributing the base model, but does this codebase support LoRA fine-tuning? I think usually what we would like DPO...
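(A minimal sketch of attaching LoRA adapters with PEFT before the model is handed to FSDP; the checkpoint name and `target_modules` are assumptions, and whether the FSDP wrapping in this codebase accepts a PEFT-wrapped model is exactly the open question above.)

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint name; substitute the base model actually being tuned.
base = AutoModelForCausalLM.from_pretrained("my-org/my-sft-checkpoint")

lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
# The PEFT-wrapped model would then be sharded by FSDP / prepared by the trainer.
```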

I am wondering, for multi-node FSDP, do `local_rank` and `rank` have any obvious difference here? I think I understand that `local_rank` is the rank within a node. I see in...
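(A minimal sketch of the usual convention under `torchrun`: `rank` is the global index across all nodes, while `local_rank` indexes processes within one node and is what selects the GPU; the environment-variable usage below is standard `torch.distributed`, not specific to this repo.)

```python
import os
import torch
import torch.distributed as dist

# torchrun sets these for every process it launches.
rank = int(os.environ["RANK"])              # global index across all nodes
local_rank = int(os.environ["LOCAL_RANK"])  # index within the current node
world_size = int(os.environ["WORLD_SIZE"])

torch.cuda.set_device(local_rank)          # one GPU per process on this node
dist.init_process_group(backend="nccl")    # rank/world size are read from env

if rank == 0:
    # Exactly one process in the whole job has global rank 0;
    # every node has a process with local_rank 0.
    print(f"running with world_size={world_size}")
```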

Hello, when I use Accelerate and DeepSpeed ZeRO-3 to train the model on one node with 8 GPUs, the following code smoothly saves the model checkpoint: ``` ds_state_dict = model._zero3_consolidated_16bit_state_dict()...
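(A minimal sketch of how that save is usually guarded across ranks; `_zero3_consolidated_16bit_state_dict()` is the same internal DeepSpeed method as in the snippet above and is a collective call, while the rank-0 guard and output path below are illustrative.)

```python
import torch
import torch.distributed as dist

# Every rank must enter this call so ZeRO-3 can gather the sharded
# parameters; only rank 0 ends up with the full 16-bit state dict.
ds_state_dict = model._zero3_consolidated_16bit_state_dict()

if dist.get_rank() == 0:
    torch.save(ds_state_dict, "consolidated_checkpoint.pt")  # illustrative path
dist.barrier()  # keep other ranks from racing ahead while rank 0 writes
```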
