DeepSpeedExamples
Failed to run model compression example script for GPT2 in Google Colab
Hello! I tried to compress the GPT2-medium model in Google Colab, based on the example in the DeepSpeedExamples/model_compression/gpt2 folder.
My code, split into cells, is:
! git clone https://github.com/microsoft/DeepSpeedExamples.git
%cd /content/DeepSpeedExamples/model_compression/gpt2/
! pip install -r requirements.txt >> pip_log.txt
! pip install deepspeed >> pip_log.txt
!bash ./bash_script/run_zero_quant.sh
The only change I made to the script was to use gpt2-medium instead of gpt2-large, because the latter did not fit into GPU memory.
Unfortunately, I got RuntimeError: Tensors must be contiguous
Full log:
/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py:186: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
FutureWarning,
[2022-09-06 10:24:04,020] [INFO] [comm.py:635:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
09/06/2022 10:24:04 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:1 to store for rank: 0
09/06/2022 10:24:04 - INFO - torch.distributed.distributed_c10d - Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
09/06/2022 10:24:06 - WARNING - datasets.builder - Reusing dataset wikitext (/root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126)
100% 3/3 [00:00<00:00, 1065.45it/s]
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-058185f8d35b06d0.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-b7f9678a1a1f8fec.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-a20b4ea2a2819e70.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-fa4760df6c270a3f.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-d9c7aa99174f6980.arrow
09/06/2022 10:24:14 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/a241db52902eaf2c6aa732210bead40c090019a499ceb13bcbfa3f8ab646a126/cache-0bd76767d49c3757.arrow
***** Running training *****
Num examples = 2318
Num Epochs = 0
Instantaneous batch size per device = 4
Gradient Accumulation steps = 1
Total optimization steps = 0
Number of parameters: 354823168
Before converting the module COVN1D to linear, and before applying init_compression: 22.3755029232731
[2022-09-06 10:25:32,158] [WARNING] [basic_layer.py:354:enable_weight_quantization] ************ A lot of MoQ features are not supported in quantize_weight_in_forward mode, please consider to use DS-FP16 optimizer************
[... the same warning is repeated for every quantized layer; duplicates omitted ...]
WARNING: saving the quantized model with Linear Module instead of COV1D
[2022-09-06 10:25:32,241] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.2, git-hash=unknown, git-branch=unknown
[2022-09-06 10:25:32,243] [INFO] [comm.py:629:init_distributed] Distributed backend already initialized
09/06/2022 10:25:32 - INFO - torch.distributed.distributed_c10d - Added key: store_based_barrier_key:2 to store for rank: 0
09/06/2022 10:25:32 - INFO - torch.distributed.distributed_c10d - Rank 0: Completed store-based barrier for key:store_based_barrier_key:2 with 1 nodes.
Traceback (most recent call last):
File "run_clm_no_trainer.py", line 544, in <module>
main()
File "run_clm_no_trainer.py", line 528, in main
training(model, train_dataloader, eval_dataloader, args.num_train_epochs, args)
File "run_clm_no_trainer.py", line 482, in training
dist_init_required=True)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/__init__.py", line 134, in initialize
config_params=config_params)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/runtime/engine.py", line 288, in __init__
self._configure_distributed_model(model)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/runtime/engine.py", line 1084, in _configure_distributed_model
self._broadcast_model()
File "/usr/local/lib/python3.7/dist-packages/deepspeed/runtime/engine.py", line 997, in _broadcast_model
group=self.data_parallel_group)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/comm/comm.py", line 126, in log_wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/comm/comm.py", line 231, in broadcast
return cdb.broadcast(tensor=tensor, src=src, group=group, async_op=async_op)
File "/usr/local/lib/python3.7/dist-packages/deepspeed/comm/torch.py", line 73, in broadcast
async_op=async_op)
File "/usr/local/lib/python3.7/dist-packages/torch/distributed/distributed_c10d.py", line 1197, in broadcast
work = group.broadcast([tensor], opts)
RuntimeError: Tensors must be contiguous
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 549) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/usr/local/lib/python3.7/dist-packages/torch/distributed/run.py", line 755, in run
)(*cmd_args)
File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.7/dist-packages/torch/distributed/launcher/api.py", line 247, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
run_clm_no_trainer.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2022-09-06_10:25:37
host : cd6171fcbfc3
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 549)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
I encountered the same issue. Frustrating.
I tried both gpt2-medium and gpt2 (gpt2-small); both failed with the same error: the tensor's is_contiguous is False.
I found the bug and fixed it. In https://github.com/microsoft/DeepSpeed/blob/521d329b975de97ec0b52395f02bb32466b8dc35/deepspeed/compression/helper.py#L275, change the line to:
new_module.weight.data = old_module.weight.data.t().contiguous()
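For context (my own explanation, not from the DeepSpeed code): the Conv1D-to-Linear conversion transposes the weight, and in PyTorch `Tensor.t()` returns a non-contiguous view over the same storage, while `torch.distributed` collectives such as the broadcast in `_broadcast_model` require contiguous tensors. Calling `.contiguous()` materializes a row-major copy. A minimal sketch:

```python
import torch

# A transposed weight, as produced by the Conv1D -> Linear conversion.
w = torch.randn(4, 8)
wt = w.t()
print(wt.is_contiguous())   # False: the transpose is a view with swapped strides

# .contiguous() copies the data into a fresh row-major tensor,
# which distributed collectives like broadcast accept.
wc = wt.contiguous()
print(wc.is_contiguous())   # True
print(torch.equal(wc, wt))  # True: same values, different memory layout
```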
Is there a better way to address this than to search all your scripts for an operation that returns a non-contiguous tensor?
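One workaround (my own suggestion, not an official DeepSpeed API) is to skip the hunt entirely and make every parameter contiguous once, right before calling deepspeed.initialize; the helper name below is hypothetical:

```python
import torch
import torch.nn as nn

def make_params_contiguous(model: nn.Module) -> nn.Module:
    """Replace every non-contiguous parameter tensor with a contiguous copy,
    so collectives like the broadcast in DeepSpeed's _broadcast_model never
    see a transposed view. Call this just before deepspeed.initialize."""
    for p in model.parameters():
        if not p.data.is_contiguous():
            p.data = p.data.contiguous()
    return model

# Tiny demonstration: simulate a module whose weight became a transposed view.
m = nn.Linear(4, 8)
m.weight.data = torch.randn(4, 8).t()   # shape (8, 4), non-contiguous
print(m.weight.data.is_contiguous())    # False
make_params_contiguous(m)
print(m.weight.data.is_contiguous())    # True
```

This patches the symptom rather than the cause, but it is robust against any other operation in the conversion code that happens to return a view.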