'TrainingArguments' object has no attribute 'parallel_mode' when running mBart test
How to reproduce
python ./tests/transformers/models/mbart/test_training.py
Environment
- OS : CentOS 7.9
- Python version : 3.9
- Transformers version : 4.21.2
- Whether to use Docker:
- Misc.:
python ./tests/transformers/models/mbart/test_training.py
Reusing dataset glue (/root/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
100%|███████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 682.67it/s]
100%|██████████████████████████████████████████████████████████████████████████████| 68/68 [00:01<00:00, 52.15ba/s]
100%|████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 43.15ba/s]
100%|████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 42.94ba/s]
You are using a model of type bart to instantiate a model of type mbart. This is not supported for all configurations of models and can yield errors.
Some weights of MBartForConditionalGeneration were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['encoder.layer_norm.bias', 'decoder.layer_norm.weight', 'encoder.layer_norm.weight', 'decoder.layer_norm.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
You are using a model of type bart to instantiate a model of type mbart. This is not supported for all configurations of models and can yield errors.
Some weights of MBartForConditionalGeneration were not initialized from the model checkpoint at facebook/bart-base and are newly initialized: ['encoder.layer_norm.bias', 'decoder.layer_norm.weight', 'encoder.layer_norm.weight', 'decoder.layer_norm.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
PyTorch: setting up devices
Traceback (most recent call last):
File "./tests/transformers/models/mbart/test_training.py", line 94, in <module>
fp16=False,
File "./tests/transformers/models/mbart/test_training.py", line 44, in train
eval_dataset=dataset["validation"],
File "/opt/conda/lib/python3.7/site-packages/oslo_core-3.0.0-py3.7.egg/oslo/transformers/trainer.py", line 186, in __init__
if len(args.parallel_mode) > 0:
AttributeError: 'TrainingArguments' object has no attribute 'parallel_mode'
The problem seems to be that the parallel_mode property in training_args.py is commented out (line 989):
# @property
# def parallel_mode(self):
#     """
#     The current mode used for parallelism if multiple GPUs/TPU cores are available. One of:
#
#     - ParallelMode.NOT_PARALLEL: no parallelism (CPU or one GPU).
#     - ParallelMode.NOT_DISTRIBUTED: several GPUs in one single process (uses torch.nn.DataParallel).
#     - ParallelMode.DISTRIBUTED: several GPUs, each having its own process (uses
#       torch.nn.DistributedDataParallel).
#     - ParallelMode.TPU: several TPU cores.
#     """
#     # if is_torch_tpu_available():
#     #     return ParallelMode.TPU
#     # elif is_sagemaker_mp_enabled():
#     #     return ParallelMode.SAGEMAKER_MODEL_PARALLEL
#     # elif is_sagemaker_dp_enabled():
#     #     return ParallelMode.SAGEMAKER_DATA_PARALLEL
#     if self.local_rank != -1:
#         return ParallelMode.DISTRIBUTED
#     elif self.n_gpu > 1:
#         return ParallelMode.NOT_DISTRIBUTED
#     else:
#         return ParallelMode.NOT_PARALLEL
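A possible stopgap (untested, only a sketch) is to give the arguments object a sized parallel_mode before building the Trainer. The import path and keyword arguments below are assumptions based on the traceback, not oslo documentation, and it assumes line 186 of trainer.py is the only place parallel_mode is read. Note that even the commented-out property above would return a ParallelMode enum member, which would not satisfy len(args.parallel_mode) > 0, so trainer.py and training_args.py look out of sync.

# Untested workaround sketch, not an official oslo API. The import path and
# the keyword arguments are guesses; in practice mirror what test_training.py builds.
from oslo.transformers import TrainingArguments  # assumed import path

args = TrainingArguments(
    output_dir="output",  # assumed minimal argument
    fp16=False,
)

# oslo/transformers/trainer.py line 186 checks `len(args.parallel_mode) > 0`,
# so attach an empty, sized placeholder; the Trainer should then skip its
# parallel setup and behave like a plain single-process trainer.
args.parallel_mode = []

With a shim like this the Trainer construction in the test might proceed, but the real fix is restoring (or replacing) the property in training_args.py so that both modules agree on what parallel_mode is.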
Currently, the trainer module is not ready to use. I'll let you know when it becomes available. Thanks.