DeepSpeedExamples

Example models using DeepSpeed

Results 274 DeepSpeedExamples issues

2 pytorch allocator cache flushes since last step. this happens when there is high memory pressure and is detrimental to performance. if this is happening frequently consider adjusting settings to...

Applies optimization to the SD example: - Adds an optimized_iteration flag, which determines what portion of the iterations is optimized. For instance, optimized_iteration = 0 means no optimization and...

Hi, I am from the ColossalAI team. I found that there are similarities between DeepSpeedChat and ColossalChat. We found that there might be an implementation error in our code, thus...

question
deespeed chat

Hi, do you plan on adding support for T5, UL2 models? Thanks!

An error occurred when running pipeline_parallelism: `ValueError: optimizer got an empty parameter list`

When I follow this tutorial (https://www.deepspeed.ai/tutorials/model-compression/#2-tutorial-for-zeroquant-efficient-and-affordable-post-training-quantization) and run zero_quant.sh (or quant_activation.sh and quant_weight.sh), the model size is still 418 MB, the same as bert-base. ![image](https://user-images.githubusercontent.com/49281157/222089617-33fcb1cc-9aee-419a-8a02-9c366179d653.png) Are the clean_model weights still saved as float32? Can...
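The size reported above can be sanity-checked with simple arithmetic. The sketch below assumes roughly 110 million parameters for bert-base (a commonly cited round figure, not a number taken from the issue) and ignores file overhead and any extra tensors a quantized checkpoint might carry. The helper name `model_size_mib` is hypothetical.

```python
def model_size_mib(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in MiB (2**20 bytes)."""
    return num_params * bytes_per_param / 2**20


# Assumption: about 110M parameters for bert-base (round figure).
ASSUMED_BERT_BASE_PARAMS = 110_000_000

fp32_mib = model_size_mib(ASSUMED_BERT_BASE_PARAMS, 4)  # 32-bit floats
int8_mib = model_size_mib(ASSUMED_BERT_BASE_PARAMS, 1)  # 8-bit integers

print(f"float32 weights: about {fp32_mib:.0f} MiB")
print(f"int8 weights:    about {int8_mib:.0f} MiB")
```

Under these assumptions, a file of about 418 MB is consistent with weights stored at 4 bytes per parameter, while 8-bit storage would be roughly a quarter of that.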

Hi all, thanks for the great work. I ran some experiments with DeepSpeed compression using the configs in model_compression/bert and ran into some issues: - Size of the output model when using the DeepSpeedExamples/model_compression/bert/bash_script/XTC/quant_1bit.sh config...

The port number in the CIFAR Model Compression example exceeded the allowed range, possibly a typo. This is a fix.
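For context, TCP and UDP port numbers are 16-bit values, so the largest usable port is 65535. A minimal sketch of a range check follows; the helper name `is_valid_port` and the sample values are hypothetical, since the listing does not show the actual port from the example.

```python
MAX_PORT = 65535  # largest value of a 16-bit unsigned port number


def is_valid_port(port: int) -> bool:
    """Return True if `port` is a usable TCP/UDP port (1 through 65535)."""
    return 1 <= port <= MAX_PORT


# Hypothetical sample values, not taken from the example script.
for candidate in (29500, 65535, 65536, 295000):
    status = "ok" if is_valid_port(candidate) else "out of range"
    print(f"{candidate}: {status}")
```

A check like this can catch an extra digit typed into a port setting before a launcher tries to bind to it.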

Hi, I'm trying to reproduce the GLUE scores reported in the ZeroQuant paper, but overall the accuracy on almost every task is lower than the reported results in the LKD cases. RTE especially shows a sharp degradation...

I was trying to run Megatron with ZeRO 2 config when I encountered this error ``` > finished creating GPT2 datasets ... setting training data start iteration to 0 setting...