
Example models using DeepSpeed

274 DeepSpeedExamples issues

This PR integrates LoRA optimization into the Stable Diffusion training example, building on the previously implemented distillation. Applying LoRA-enhanced distillation yields further improvements, including reduced inference time,...
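For context, a minimal sketch of the LoRA idea the PR refers to: a frozen linear layer augmented with a trainable low-rank update. This is illustrative only, not the PR's actual implementation; all names here are hypothetical.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update (illustrative sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as a zero update
        self.scaling = alpha / rank

    def forward(self, x):
        # Base output plus the scaled low-rank correction.
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))
```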

Fix the following warning:

```bash
venv/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
```
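The warning message itself names the fix: import the DeepSpeed helpers from `transformers.integrations` instead of the deprecated `transformers.deepspeed` module. For example:

```python
# Deprecated import path (emits the FutureWarning above):
# from transformers.deepspeed import HfDeepSpeedConfig

# Preferred in recent transformers versions:
from transformers.integrations import HfDeepSpeedConfig
```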

## Version

deepspeed: `0.13.4`
transformers: `4.38.1`
Python: `3.10`
PyTorch: `2.1.2+cu121`
CUDA: `12.1`

## Error in Example (To reproduce)

Simply run this script: https://github.com/microsoft/DeepSpeedExamples/blob/master/inference/huggingface/text-generation/inference-test.py

```bash
deepspeed --num_gpus 8 inference-test.py --model...
```

In my experiment, I found that if I use ZeRO-3 and enable the hybrid engine setting, the Actor generates repeated tokens or nothing at all during stage 3 (PPO) training. Here is...
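For reference, a sketch of the kind of configuration being described, i.e. ZeRO stage 3 combined with the hybrid engine. The keys follow the DeepSpeed hybrid-engine config schema as I understand it; all values are illustrative, not the reporter's actual setup.

```python
# Illustrative DeepSpeed config dict combining ZeRO-3 with the hybrid engine.
ds_config = {
    "train_batch_size": 8,              # illustrative value
    "zero_optimization": {"stage": 3},
    "hybrid_engine": {
        "enabled": True,
        "max_out_tokens": 512,          # illustrative value
        "inference_tp_size": 1,
        "release_inference_cache": False,
        "pin_parameters": True,
    },
}
```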

Does the dschat example support CodeLlama model fine-tuning?

**System Info:**
Memory: 500G
GPU: 8 * A100 80G

**Question:** Why does using multiple GPUs in the init of DeepSpeedRLHFEngine consume much more memory than using a single GPU?

**Reproduce:** Copy...

deepspeed chat
system
llama

The MII inference benchmark script computes throughput as [num_clients/latency](https://github.com/microsoft/DeepSpeedExamples/blob/master/benchmarks/inference/mii/src/postprocess_results.py#L73). Shouldn't this be `num_queries/latency`? Also, why use P95 latency rather than the total time it took to process all the requests,...
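To make the two formulas in question concrete, a minimal sketch; `latencies` and `num_clients` are hypothetical stand-ins for the benchmark's actual fields, not the script's real variables.

```python
import numpy as np

# Hypothetical per-request latencies (seconds) and concurrency level.
latencies = np.random.uniform(0.5, 2.0, size=1000)
num_clients = 16
num_queries = len(latencies)

# Formula the script uses today (per the issue): clients over P95 latency.
throughput_clients_p95 = num_clients / np.percentile(latencies, 95)

# Alternative the reporter proposes: completed queries over total elapsed time.
# With num_clients requests in flight at once, wall-clock time is roughly the
# summed latency divided by the concurrency.
total_time = latencies.sum() / num_clients
throughput_queries_total = num_queries / total_time
```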

Considering the advantages of [DPO (Direct Preference Optimization)](https://arxiv.org/abs/2305.18290) as being "stable, performant, and computationally lightweight, eliminating the need for fitting a reward model, sampling from the LM during fine-tuning, or performing...
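For reference, a minimal sketch of the DPO objective from the cited paper; the tensor names are illustrative and the inputs are assumed to be per-sequence summed log-probabilities.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss (Rafailov et al., 2023): inputs are summed log-probs of the
    chosen/rejected responses under the policy and a frozen reference model."""
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Maximize the margin between chosen and rejected log-ratios.
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()
```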

Hi, I tried to use the method "get_model_profile" to get the latency and FLOPs for my model. To avoid the influence of randomness, I used this method in...
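A minimal sketch of calling `get_model_profile` with warm-up iterations to reduce measurement noise, assuming a stand-in model. The function returns FLOPs, MACs, and parameter counts; latency figures appear in the printed profile.

```python
import torch
from deepspeed.profiling.flops_profiler import get_model_profile

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024))  # hypothetical stand-in model

flops, macs, params = get_model_profile(
    model,
    input_shape=(8, 1024),  # illustrative batch shape
    warm_up=5,              # discard initial runs to reduce timing noise
    print_profile=False,
    as_string=False,        # return raw numbers rather than formatted strings
)
```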

https://github.com/microsoft/DeepSpeedExamples/blob/6c31d8ddee9e57f6202aeb4ee3c86f2fbd93d4c6/applications/DeepSpeed-Chat/dschat/utils/data/data_utils.py#L210 and L211, L214, L215