DeepSpeedExamples

Example models using DeepSpeed

274 DeepSpeedExamples issues

Hi there, the tutorial https://www.deepspeed.ai/tutorials/bert-finetuning/#loading-huggingface-and-tensorflow-pretrained-models makes clear how to load HF and TF checkpoints into DeepSpeed. What if we want to load a DeepSpeed checkpoint, like one from the Bing BERT...
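One possible answer, sketched under assumptions: the engine returned by `deepspeed.initialize` exposes `load_checkpoint`, which restores the state that `save_checkpoint` wrote. The model, config path, and checkpoint directory below are hypothetical placeholders, not from the tutorial:

```python
# Hedged sketch: resuming from a DeepSpeed checkpoint instead of an HF/TF one.
# Launch with the deepspeed launcher, e.g.: deepspeed resume.py
import torch
import deepspeed

model = torch.nn.Linear(10, 10)  # stand-in for the real Bing BERT model

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config="deepspeed_config.json",  # assumed config path
)

# load_checkpoint restores module, optimizer, and lr-scheduler state;
# tag=None reads the `latest` file in the checkpoint directory.
load_path, client_state = engine.load_checkpoint(
    "/path/to/ds_checkpoints",  # hypothetical checkpoint directory
    tag=None,
)
```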

Error encountered running DeepSpeedExamples/BingBertSquad/run_squad_deepspeed.sh:

```console
$ ./run_squad_deepspeed.sh 16 ~/models/bert-base-uncased/pytorch_model.bin ~/datasets/squad_data ~/output
11/23/2021 15:42:24 - INFO - __main__ - Loading Pretrained Bert Encoder from: /home/bduser/models/bert-base-uncas/pytorch_model.bin
VOCAB SIZE: 30528
Traceback (most recent...
```

Error occurred running bing_bert/ds_train_bert_nvidia_data_bsz64k_seq128.sh:

> Detected CUDA files, patching ldflags
> Emitting ninja build file /home/bduser/.cache/torch_extensions/py38_cu114/fused_lamb/build.ninja...
> Building extension module fused_lamb...
> Allowing ninja to set a default number of workers... (overridable by setting...

Following the bing_bert tutorial, my deepspeed_config is:

```json
{
  "train_batch_size": 4096,
  "train_micro_batch_size_per_gpu": 32,
  "steps_per_print": 1000,
  "prescale_gradients": false,
  "optimizer": {
    "type": "Adam",
    "params": {
      "lr": 6e-3,
      "betas": [0.9, 0.99],
      ...
```
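For context, a config of roughly this shape can also be passed to `deepspeed.initialize` as a plain Python dict rather than a JSON file. In the sketch below only the keys quoted above come from the issue; every remaining field (the eps, weight_decay, and fp16 entries) is an illustrative assumption:

```python
# Hedged reconstruction of a bing_bert-style DeepSpeed config as a dict.
# Launch with the deepspeed launcher, e.g.: deepspeed train.py
import torch
import deepspeed

ds_config = {
    "train_batch_size": 4096,
    "train_micro_batch_size_per_gpu": 32,
    "steps_per_print": 1000,
    "prescale_gradients": False,
    "optimizer": {
        "type": "Adam",
        "params": {
            "lr": 6e-3,
            "betas": [0.9, 0.99],
            "eps": 1e-8,           # assumed
            "weight_decay": 0.01,  # assumed
        },
    },
    "fp16": {"enabled": True},     # assumed
}

model = torch.nn.Linear(10, 10)    # stand-in for the real BERT model
engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```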

Collecting the datasets needed for pretraining is a bit of work, especially when downloading from lots of different URLs behind a firewall. https://github.com/microsoft/DeepSpeedExamples/tree/25d73cf73fb3dc66faefa141b7319526555be9fc/Megatron-LM-v1.1.5-ZeRO3#datasets I see that some version of these...

I am trying DeepSpeed inference with the gpt-neo-1.3B model, using the example [here](https://www.deepspeed.ai/tutorials/inference-tutorial/#end-to-end-gpt-neo-27b-inference) for reference.

```python
# Filename: example.py
import os
import deepspeed
import datetime
import torch
from transformers...
```
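A runnable sketch of the pattern that tutorial demonstrates, adapted here to GPT-Neo 1.3B; the model id, `mp_size`, dtype, and generation settings are assumptions rather than the poster's actual script:

```python
# Hedged sketch of DeepSpeed kernel-injection inference with GPT-Neo 1.3B.
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-1.3B"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# init_inference wraps the model and swaps in DeepSpeed's fused kernels.
ds_model = deepspeed.init_inference(
    model,
    mp_size=1,                       # single GPU; >1 enables tensor parallelism
    dtype=torch.float16,
    replace_with_kernel_inject=True,
)

inputs = tokenizer("DeepSpeed is", return_tensors="pt").to("cuda")
outputs = ds_model.module.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```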

Machine translation training usually takes dynamically sized batches composed of X tokens, rather than X sentences, as input. I'm wondering why DeepSpeed requires specifying `train_batch_size` and `train_micro_batch_size_per_gpu`, both of which...
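Background for the question: DeepSpeed validates a fixed identity between its three batch-size knobs, `train_batch_size == train_micro_batch_size_per_gpu * gradient_accumulation_steps * world_size`, which is why both values are sample counts rather than token counts. A quick check with hypothetical numbers:

```python
# The invariant DeepSpeed enforces between its batch-size settings.
# Numbers below are hypothetical, chosen to match the 4096/32 config
# quoted in the bing_bert issue above.
train_micro_batch_size_per_gpu = 32
gradient_accumulation_steps = 8
world_size = 16  # total number of GPUs across all nodes

train_batch_size = (
    train_micro_batch_size_per_gpu * gradient_accumulation_steps * world_size
)
assert train_batch_size == 4096
```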

Hi guys, I have been trying to run the Bing experiment, but it seems I can't for now.

```json
"datasets": {
  "wiki_pretrain_dataset": "/data/bert/bnorick_format/128/wiki_pretrain",
  "bc_pretrain_dataset": "/data/bert/bnorick_format/128/bookcorpus_pretrain"
},
...
```

I am trying to follow the example here: https://www.deepspeed.ai/tutorials/bert-pretraining/. The section on getting the datasets says "Note: Downloading and pre-processing instructions are coming soon." I tried googling, but those datasets...

Fixes the issue reported in https://github.com/microsoft/DeepSpeed/issues/1243.