
Example models using DeepSpeed

323 DeepSpeedExamples issues (sorted by recently updated)

https://github.com/microsoft/DeepSpeedExamples/blob/e7c8cb767acddba8ad5d2c41fe18e30de7870d30/model_compression/bert/huggingface_transformer/modeling_bert.py#L383 In the model compression example, it says the only change is on line 383, "where we output attention_scores instead of attention_prob." But this line is the same as the Hugging Face original, and...
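For context, a minimal, self-contained sketch of the distinction the issue is asking about, assuming standard Hugging Face-style scaled dot-product attention; the function name and signature are illustrative, not the example's actual code:

```python
import math
import torch
import torch.nn.functional as F

def self_attention(q, k, v, head_size, return_scores=True):
    # Standard scaled dot-product attention, as in modeling_bert.py.
    attention_scores = torch.matmul(q, k.transpose(-1, -2)) / math.sqrt(head_size)
    attention_probs = F.softmax(attention_scores, dim=-1)
    context = torch.matmul(attention_probs, v)
    # Stock Hugging Face returns the post-softmax attention_probs here;
    # the compression example is documented to return the pre-softmax
    # attention_scores instead, which attention-distillation losses
    # typically operate on.
    return context, (attention_scores if return_scores else attention_probs)
```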

How can we observe the performance of AlexNet's parallelism in the [pipeline parallelism example](https://github.com/microsoft/DeepSpeedExamples/tree/master/pipeline_parallelism), and can we use nsys to analyze the pipeline bubbles?
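A hedged sketch of one way to capture a timeline for bubble analysis with Nsight Systems; the `train.py` entry point and its flags follow the example's run script, but the exact arguments here are assumptions:

```console
$ nsys profile -t cuda,nvtx -o alexnet_pp \
    deepspeed train.py --deepspeed_config=ds_config.json -p 2 --steps=200
```

Idle gaps between kernels on each GPU's CUDA stream in the resulting timeline correspond to pipeline bubbles.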

Hi, when I use compression (the ZeroQuant method), I found that we need to change Conv1D to Linear in the GPT model. I just want to understand this step. Could you tell...
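For context, a minimal sketch of what such a conversion could look like, assuming the Hugging Face GPT-2 `Conv1D` layout, whose weight is stored as `(in_features, out_features)`, the transpose of `nn.Linear`; this is an illustration, not the compression library's actual code:

```python
import torch.nn as nn
from transformers.pytorch_utils import Conv1D  # GPT-2's "Conv1D" is really a linear layer

def conv1d_to_linear(conv: Conv1D) -> nn.Linear:
    # Conv1D stores weight as (in_features, out_features);
    # nn.Linear expects (out_features, in_features), so transpose.
    in_features, out_features = conv.weight.shape
    linear = nn.Linear(in_features, out_features)
    linear.weight.data = conv.weight.data.t().contiguous()
    linear.bias.data = conv.bias.data.clone()
    return linear
```

Quantization passes typically match on `nn.Linear` modules, which is a plausible reason the ZeroQuant path needs this rewrite.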

I am currently running some tests with ZeRO-3 Infinity and have hit some problems; I would like your help. **Machine configuration**: two nodes, each with one A100-PCIE-40GB, 126 GB RAM...
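For reference, a minimal ZeRO-3 config sketch with CPU offload of the kind such a test might start from; the batch size here is a placeholder, not the issue author's actual setting:

```json
{
  "train_batch_size": 16,
  "zero_optimization": {
    "stage": 3,
    "contiguous_gradients": true,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true }
  }
}
```

ZeRO-Infinity additionally supports `"device": "nvme"` with an `nvme_path` for offload beyond CPU RAM.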

I reran the shell script `run_squad_baseline.sh` under BingBertSquad without any modification on 8 V100 GPUs; the pretrained model is [https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin](https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin), but I didn't get the right results. The [documentation](https://www.deepspeed.ai/tutorials/bert-finetuning/) says for...

I tried to run BERT with pipeline parallelism, but I get an error:

```console
File "DeepSpeedExamples/Megatron-LM-v1.1.5-3D_parallelism/pretrain_bert.py", line 146, in <module>
    args_defaults={'tokenizer_type': 'BertWordPieceLowerCase'})
File "/DeepSpeedExamples/Megatron-LM-v1.1.5-3D_parallelism/megatron/training.py", line 81, in pretrain
    model, optimizer, lr_scheduler...
```

Using the GAN example and the following DeepSpeed config for ZeRO-3 offload, I get the following error:

```json
{
  "train_batch_size": 64,
  "zero_optimization": {
    "stage": 3,
    "contiguous_gradients": true,
    "stage3_max_live_parameters": 1e9,
...
```

Following the example in [HelloDeepSpeed](https://github.com/microsoft/DeepSpeedExamples/tree/master/HelloDeepSpeed), I still hit CUDA OOM despite moving all the way to ZeRO stage 3 with the configuration below:

```console
deepspeed train_bert_ds.py --checkpoint_dir . --num_layers...
```

I used the default settings and ran the code on 32 V100s; the data was constructed with NVIDIA's scripts. I was able to reproduce NVIDIA's results on SQuAD (F1 = 90), but...

Error encountered running DeepSpeedExamples/HelloDeepSpeed/train_bert.py:

```console
$ python train_bert.py --checkpoint_dir ./experiments
Traceback (most recent call last):
  File "train_bert.py", line 9, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'
```
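The missing module here is presumably the Hugging Face `datasets` package, and installing it is the usual fix:

```console
$ pip install datasets
```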