
Example models using DeepSpeed

323 DeepSpeedExamples issues (sorted by recently updated)

https://github.com/microsoft/DeepSpeedExamples/blob/e7c8cb767acddba8ad5d2c41fe18e30de7870d30/model_compression/bert/huggingface_transformer/modeling_bert.py#L383 In the model compression example, it says the only change is on line 383, "where we output attention_scores instead of attention_prob." But this line is the same as the Hugging Face original, and...
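For context, a minimal, self-contained sketch of the distinction the issue is asking about, assuming standard Hugging Face-style scaled dot-product attention; the function name and signature are illustrative, not the example's actual code:

```python
import math
import torch
import torch.nn.functional as F

def self_attention(q, k, v, head_size, return_scores=True):
    # Standard scaled dot-product attention, as in modeling_bert.py.
    attention_scores = torch.matmul(q, k.transpose(-1, -2)) / math.sqrt(head_size)
    attention_probs = F.softmax(attention_scores, dim=-1)
    context = torch.matmul(attention_probs, v)
    # Stock Hugging Face returns the post-softmax attention_probs here;
    # the compression example is documented to return the pre-softmax
    # attention_scores instead, which attention-distillation losses
    # typically operate on.
    return context, (attention_scores if return_scores else attention_probs)
```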

How can we observe the performance of AlexNet's parallelism in the [pipeline parallelism example](https://github.com/microsoft/DeepSpeedExamples/tree/master/pipeline_parallelism), and can we use nsys to analyze the pipeline bubbles?
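A hedged sketch of one way to capture a timeline for bubble analysis with Nsight Systems; the `train.py` entry point and its flags follow the example's run script, but the exact arguments here are assumptions:

```console
$ nsys profile -t cuda,nvtx -o alexnet_pp \
    deepspeed train.py --deepspeed_config=ds_config.json -p 2 --steps=200
```

Idle gaps between kernels on each GPU's CUDA stream in the resulting timeline correspond to pipeline bubbles.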

Hi, when I use compression (the ZeroQuant method), I found that we need to change Conv1D to Linear in the GPT model. I just want to understand this step. Could you tell...
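For context, a minimal sketch of what such a conversion could look like, assuming the Hugging Face GPT-2 `Conv1D` layout, whose weight is stored as `(in_features, out_features)`, the transpose of `nn.Linear`; this is an illustration, not the compression library's actual code:

```python
import torch.nn as nn
from transformers.pytorch_utils import Conv1D  # GPT-2's "Conv1D" is really a linear layer

def conv1d_to_linear(conv: Conv1D) -> nn.Linear:
    # Conv1D stores weight as (in_features, out_features);
    # nn.Linear expects (out_features, in_features), so transpose.
    in_features, out_features = conv.weight.shape
    linear = nn.Linear(in_features, out_features)
    linear.weight.data = conv.weight.data.t().contiguous()
    linear.bias.data = conv.bias.data.clone()
    return linear
```

Quantization passes typically match on `nn.Linear` modules, which is a plausible reason the ZeroQuant path needs this rewrite.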

I am currently running some tests with ZeRO-3 Infinity and have hit some problems; I would like your help. **Machine configuration**: two nodes, each with one A100-PCIE-40GB, 126 GB RAM...
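For reference, a minimal ZeRO-3 config sketch with CPU offload of the kind such a test might start from; the batch size here is a placeholder, not the issue author's actual setting:

```json
{
  "train_batch_size": 16,
  "zero_optimization": {
    "stage": 3,
    "contiguous_gradients": true,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true }
  }
}
```

ZeRO-Infinity additionally supports `"device": "nvme"` with an `nvme_path` for offload beyond CPU RAM.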

I reran the shell script `run_squad_baseline.sh` under BingBertSquad without any modification on 8 V100 GPUs; the pretrained model is [https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin](https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin), but I didn't get the right results. The [documentation](https://www.deepspeed.ai/tutorials/bert-finetuning/) says for...

I tried to run BERT with pipeline parallelism, but I get an error:

```console
File "DeepSpeedExamples/Megatron-LM-v1.1.5-3D_parallelism/pretrain_bert.py", line 146, in <module>
    args_defaults={'tokenizer_type': 'BertWordPieceLowerCase'})
File "/DeepSpeedExamples/Megatron-LM-v1.1.5-3D_parallelism/megatron/training.py", line 81, in pretrain
    model, optimizer, lr_scheduler...
```

Using the GAN example and the following DeepSpeed config for ZeRO-3 offload, I get the following error:

```json
{
  "train_batch_size": 64,
  "zero_optimization": {
    "stage": 3,
    "contiguous_gradients": true,
    "stage3_max_live_parameters": 1e9,
...
```

Following the example in [HelloDeepSpeed](https://github.com/microsoft/DeepSpeedExamples/tree/master/HelloDeepSpeed), I still hit CUDA OOM despite moving all the way to ZeRO stage 3 with the configuration below:

```console
deepspeed train_bert_ds.py --checkpoint_dir . --num_layers...
```

I used the default settings and ran the code on 32 V100s; the data was constructed with NVIDIA's scripts. I was able to reproduce NVIDIA's results on SQuAD (F1 = 90), but...

Error encountered running DeepSpeedExamples/HelloDeepSpeed/train_bert.py:

```console
$ python train_bert.py --checkpoint_dir ./experiments
Traceback (most recent call last):
  File "train_bert.py", line 9, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'
```
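The missing module here is presumably the Hugging Face `datasets` package, and installing it is the usual fix:

```console
$ pip install datasets
```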