DeepSpeedExamples

Example models using DeepSpeed

Results 323 DeepSpeedExamples issues

It's a great tool for showing how to build a ChatGPT-like model on top of foundation models. I wonder whether it could support encoder-decoder models such as Flan-T5. Could we just directly...

I want to fine-tune bloom_1.1b on a Chinese dataset by running run_chinese.sh. But where is the **ds_config.json** file used in run_chinese.sh?

In `print_latency`, `mtimes` is multiplied by 1000 to convert the time to ms, but it is already in ms. `use_cuda_events` is true by default...
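The double conversion described in this issue can be illustrated with a minimal sketch. It assumes the timings are already in milliseconds, which is the unit `torch.cuda.Event.elapsed_time` returns. The helper `mean_latency_ms`, its `already_ms` parameter, and the sample values are hypothetical; this is not the repository's `print_latency`.

```python
def mean_latency_ms(mtimes, already_ms=True):
    """Return the mean latency in milliseconds.

    Hypothetical helper: `already_ms` states whether `mtimes` holds
    milliseconds (e.g. CUDA event timings) or seconds (e.g. time.time()).
    """
    # Convert seconds -> ms only when the input is not already in ms.
    scale = 1.0 if already_ms else 1000.0
    return sum(t * scale for t in mtimes) / len(mtimes)


# Illustrative values only, treated as CUDA event timings in ms.
cuda_event_times_ms = [12.0, 14.0, 16.0]

correct = mean_latency_ms(cuda_event_times_ms, already_ms=True)
# Scaling ms values by 1000 again reproduces the kind of error reported here.
inflated = mean_latency_ms(cuda_event_times_ms, already_ms=False)

print(f"correct:  {correct:.1f} ms")
print(f"inflated: {inflated:.1f} ms")
```

Carrying the unit in the function name or in an explicit parameter makes it harder to apply the same conversion twice.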

Dear all, when we use our customized GPT model for step 3 training, we get a kernel execution error (m: 5120, n: 8, k: 1706, error: 14). However, once we...

bug
deespeed chat

When using DeepSpeed for multi-node distributed training, loading the opt-1.3b model raises the error "a leaf Variable that requires grad is being used in an in-place operation".

Hi~ While running step 2 reward model training, I got a strange result after one epoch of training: ***** Evaluating reward, Epoch 1/1 ***** chosen_last_scores (higher is better) : -9.388486862182617, acc (higher...

bug
deespeed chat

In [applications/DeepSpeed-Chat/training/step2_reward_model_finetuning/main.py](url), lines 254-258:
```
    scores += outputs["chosen_mean_scores"].mean().float()
    if step == 99:  # For faster evaluation and debugging
        break
acc = correct_predictions / total_predictions
scores = scores / (step...
```

bug
deespeed chat

Dear all, we are trying to reproduce the results. However, although we follow the training steps, our chatbot keeps repeating nonsense. We suspect that our RLHF part is...

bug
deespeed chat

I found a small typo in some documentation and code: `--only_optimizer_lora` was written instead of `--only_optimize_lora`, so I fixed all the affected files. Thank you.