
Failed to run model compression example script for GPT2 in Google Colab

Gooogr opened this issue 2 years ago • 4 comments

Hello! I tried to compress GPT2-medium model in Google Colab based on example from the DeepSpeedExamples/model_compression/gpt2 folder.

My code, split into cells, is:

! git clone https://github.com/microsoft/DeepSpeedExamples.git

%cd /content/DeepSpeedExamples/model_compression/gpt2/
! pip install -r requirements.txt >> pip_log.txt
! pip install deepspeed >> pip_log.txt

!bash ./bash_script/run_zero_quant.sh

The only change I made to the script was to use gpt2-medium instead of gpt2-large, because the larger model didn't fit into GPU memory. Unfortunately, I got `RuntimeError: Tensors must be contiguous`.

Full log:

/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  FutureWarning,
[2022-09-06 10:24:04,020] [INFO] [comm.py:635:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
09/06/2022 10:24:04 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 0
09/06/2022 10:24:04 - INFO - torch.distributed.distributed_c10d - Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
09/06/2022 10:24:06 - WARNING - datasets.builder - Reusing dataset wikitext (/root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126)
100% 3/3 [00:00<00:00, 1065.45it/s]
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-058185f8d35b06d0.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-b7f9678a1a1f8fec.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-a20b4ea2a2819e70.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-fa4760df6c270a3f.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-d9c7aa99174f6980.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-0bd76767d49c3757.arrow
***** Running training *****
  Num examples = 2318
  Num Epochs = 0
  Instantaneous batch size per device = 4
  Gradient Accumulation steps = 1
  Total optimization steps = 0
Number of parameters: 354823168
Before converting the module COVN1D to linear, and before applying init_compression: 22.3755029232731
[2022-09-06 10:25:32,158] [WARNING] [basic_layer.py:354:enable_weight_quantization] ************ A lot of MoQ features are not supported in quantize_weight_in_forward mode, please consider to use DS-FP16 optimizer************
[... the same warning is repeated once per quantized layer; duplicate lines omitted ...]
WARNING: saving the quantized model with Linear Module instead of COV1D
[2022-09-06 10:25:32,241] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.2, git-hash=unknown, git-branch=unknown
[2022-09-06 10:25:32,243] [INFO] [comm.py:629:init_distributed] Distributed backend already initialized
09/06/2022 10:25:32 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 0
09/06/2022 10:25:32 - INFO - torch.distributed.distributed_c10d - Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 1 nodes.
Traceback (most recent call last):
  File "run_clm_no_trainer.py", line 544, in <module>
    main()
  File "run_clm_no_trainer.py", line 528, in main
    training(model, train_dataloader, eval_dataloader, args.num_train_epochs, args)
  File "run_clm_no_trainer.py", line 482, in training
    dist_init_required=True)
  File "/usr/local/lib/python3.7/dist-packages/deepspeed/__init__.py", line 134, in initialize
    config_params=config_params)
  File "/usr/local/lib/python3.7/dist-packages/deepspeed/runtime/engine.py", line 288, in __init__
    self._configure_distributed_model(model)
  File "/usr/local/lib/python3.7/dist-packages/deepspeed/runtime/engine.py", line 1084, in _configure_distributed_model
    self._broadcast_model()
  File "/usr/local/lib/python3.7/dist-packages/deepspeed/runtime/engine.py", line 997, in _broadcast_model
    group=self.data_parallel_group)
  File "/usr/local/lib/python3.7/dist-packages/deepspeed/comm/comm.py", line 126, in log_wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/deepspeed/comm/comm.py", line 231, in broadcast
    return cdb.broadcast(tensor=tensor, src=src, group=group, async_op=async_op)
  File "/usr/local/lib/python3.7/dist-packages/deepspeed/comm/torch.py", line 73, in broadcast
    async_op=async_op)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 1197, in broadcast
    work = group.broadcast([tensor], opts)
RuntimeError: Tensors must be contiguous
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 549) of binary: /usr/bin/python3
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/run.py", line 755, in run
    )(*cmd_args)
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
    failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
run_clm_no_trainer.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2022-09-06_10:25:37
  host      : cd6171fcbfc3
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 549)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Gooogr avatar Sep 06 '22 10:09 Gooogr

I encountered the same issue. Frustrating.

hudengjunai avatar Nov 08 '22 10:11 hudengjunai

I tried both gpt2-medium and gpt2 (gpt2-small); all failed with the same error. The tensor's is_contiguous is False.

hudengjunai avatar Nov 08 '22 10:11 hudengjunai

I have found the bug and solved it. Change https://github.com/microsoft/DeepSpeed/blob/521d329b975de97ec0b52395f02bb32466b8dc35/deepspeed/compression/helper.py#L275 to

new_module.weight.data = old_module.weight.data.t().contiguous()

hudengjunai avatar Nov 09 '22 03:11 hudengjunai
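For context on why the one-line fix above works: `Tensor.t()` returns a view with swapped strides rather than a copy, so the transposed weight is no longer laid out contiguously in memory, and `torch.distributed` broadcast rejects such tensors. A minimal sketch of the behavior:

```python
import torch

w = torch.randn(4, 3)
wt = w.t()                   # transpose is a strided view, not a copy
print(wt.is_contiguous())    # False: elements are no longer row-major
wt_c = wt.contiguous()       # materializes a row-major copy
print(wt_c.is_contiguous())  # True
```

Appending `.contiguous()` to the `.t()` call in `helper.py` materializes exactly such a copy, which is why the broadcast then succeeds.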

Is there a better way to address this than to search all your scripts for an operation that returns a non-contiguous tensor?

FarzanT avatar Dec 15 '22 02:12 FarzanT
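One generic workaround, rather than hunting for each offending op (untested against this exact script, and the helper name below is my own): make every parameter contiguous right before handing the model to `deepspeed.initialize`, so any non-contiguous views introduced by module conversion are materialized in one pass.

```python
import torch
import torch.nn as nn

def make_params_contiguous(model: nn.Module) -> None:
    # Replace each non-contiguous parameter tensor with a contiguous
    # copy so collective ops like broadcast do not reject it.
    for p in model.parameters():
        if not p.data.is_contiguous():
            p.data = p.data.contiguous()
```

Calling this just before `deepspeed.initialize(...)` sidesteps searching every script, at the cost of one extra copy per affected tensor.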