ValueError: Some specified arguments are not used by the HfArgumentParser: ['--local-rank=0']
System Info
transformers 4.7, PyTorch 2.0, Python 3.9
Run the example code from the transformers documentation:
```bash
rm -r /tmp/test-clm; CUDA_VISIBLE_DEVICES=0,1 \
python -m torch.distributed.launch --nproc_per_node 2 examples/pytorch/language-modeling/run_clm.py \
--model_name_or_path gpt2 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 \
--do_train --output_dir /tmp/test-clm --per_device_train_batch_size 4 --max_steps 200
```
Error output:
```
/nfs/v100-022/anaconda3/lib/python3.9/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects `--local-rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
  warnings.warn(
WARNING:torch.distributed.run:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
Traceback (most recent call last):
  File "/nfs/v100-022/run_clm.py", line 772, in <module>
    main()
  File "/nfs/v100-022/run_clm.py", line 406, in main
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
  File "/nfs/v100-022//anaconda3/lib/python3.9/site-packages/transformers/hf_argparser.py", line 341, in parse_args_into_dataclasses
    raise ValueError(f"Some specified arguments are not used by the HfArgumentParser: {remaining_args}")
ValueError: Some specified arguments are not used by the HfArgumentParser: ['--local-rank=0']
```
Who can help?
No response
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
1. Install the following environment: Python 3.9, PyTorch 2.1 dev, transformers 4.7.
2. Then run:
```bash
rm -r /tmp/test-clm; CUDA_VISIBLE_DEVICES=0,1 \
python -m torch.distributed.launch --nproc_per_node 2 examples/pytorch/language-modeling/run_clm.py \
--model_name_or_path gpt2 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 \
--do_train --output_dir /tmp/test-clm --per_device_train_batch_size 4 --max_steps 200
```
3. You get the error: `ValueError: Some specified arguments are not used by the HfArgumentParser: ['--local-rank=0']`
Expected behavior
The command above should launch distributed training on two GPUs and run to completion, rather than failing in `HfArgumentParser` with the `--local-rank=0` error.
Hi @bestpredicts, thanks for raising this issue.
I can confirm that I see the same error with the most recent version of transformers and PyTorch 2. I wasn't able to replicate the issue with PyTorch 1.13.1 and the same transformers version.
Following the messages in the shared error output, if I set `LOCAL_RANK` in my environment and pass in `--use-env`, I am able to run on PyTorch 2.
```bash
LOCAL_RANK=0,1 CUDA_VISIBLE_DEVICES=0,1 \
python -m torch.distributed.launch --nproc_per_node 2 --use-env examples/pytorch/language-modeling/run_clm.py \
--model_name_or_path gpt2 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 \
--do_train --output_dir /tmp/test-clm --per_device_train_batch_size 4 --max_steps 200
```
Also note that `torch.distributed.launch` is deprecated and `torchrun` is preferred in PyTorch 2.0.
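For scripts that parse the rank themselves, the deprecation warning's suggestion amounts to reading the rank from the environment rather than the command line. A minimal sketch of that pattern (this is not code from run_clm.py; the -1 default for "not distributed" is an assumption mirroring the usual Trainer convention):

```python
# Minimal sketch: read the local rank from the LOCAL_RANK environment
# variable, which torchrun (and torch.distributed.launch with --use-env)
# sets for each worker process, instead of a --local_rank CLI argument.
import os

import torch

local_rank = int(os.environ.get("LOCAL_RANK", -1))  # -1: not launched distributed
if local_rank != -1:
    # Pin this process to its assigned GPU before any collective setup.
    torch.cuda.set_device(local_rank)
```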
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Has anyone solved this problem? I got the same problem when using `torchrun` or `torch.distributed.launch`: `self.local_rank` is -1. My environment is pytorch==2.0.0 and transformers==4.30.1.
You might try migrating to torchrun? i.e.:
```bash
torchrun --nproc_per_node 2 examples/pytorch/language-modeling/run_clm.py \
--model_name_or_path gpt2 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 \
--do_train --output_dir /tmp/test-clm --per_device_train_batch_size 4 --max_steps 200
```
For reference on migrating: https://pytorch.org/docs/stable/elastic/run.html
Have you solved your problem? I ran into the same error when using DeepSpeed. The solutions provided above didn't work at all. :(
> Also note that `torch.distributed.launch` is deprecated and `torchrun` is preferred in PyTorch 2.0.
Thanks for this tip.
watching
Printing `sys.argv` gives:
`['train.py', '--local-rank=0', '--model_name_or_path', './checkpoints/vicuna-7b-v1.5', ...]`
The other arguments arrive as separate 'key', 'value' tokens, but `local_rank` is not properly parsed: in the example above, `--local-rank=0` is passed as a single token. I think this is something wrong with `torch.distributed.launch`, since it appends `--local-rank=0` to the argument list, but the appended argument cannot be parsed by `HfArgumentParser`.
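This is straightforward to reproduce in isolation. A minimal sketch (the `ExampleArguments` dataclass is hypothetical, and the behavior is as of the transformers versions discussed in this thread): `HfArgumentParser` derives its option strings from dataclass field names, so it registers the underscore form `--local_rank`, and the hyphenated `--local-rank=0` is left over in `remaining_args`:

```python
# Hypothetical reproduction: the parser registers "--local_rank" from the
# dataclass field name, so the hyphenated token appended by the PyTorch 2.0
# launcher is left unconsumed and triggers the ValueError.
from dataclasses import dataclass, field

from transformers import HfArgumentParser


@dataclass
class ExampleArguments:
    local_rank: int = field(default=-1)


parser = HfArgumentParser(ExampleArguments)

# Underscore form: parsed as expected.
(ok,) = parser.parse_args_into_dataclasses(args=["--local_rank=0"])
print(ok.local_rank)  # 0

# Hyphenated form: raises ValueError: Some specified arguments are not
# used by the HfArgumentParser: ['--local-rank=0']
parser.parse_args_into_dataclasses(args=["--local-rank=0"])
```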
So one option is to use `torchrun`, or pass `--use-env`, which hands the rank to the script via the `LOCAL_RANK` environment variable instead of the `--local_rank` argument.
A hack fix is to add this before `parse_args_into_dataclasses()`:
```python
import sys

# Rewrite the hyphenated option appended by the PyTorch 2.0 launcher into
# the underscore form that HfArgumentParser expects. Iterate over a copy:
# removing items from a list while iterating over it can skip elements.
for arg in list(sys.argv):
    if arg.startswith("--local-rank="):
        rank = arg.split("=", 1)[1]
        sys.argv.remove(arg)
        sys.argv.extend(["--local_rank", rank])
```
I have this problem:
ValueError: Some specified arguments are not used by the HfArgumentParser: ['-f', '/root/.local/share/jupyter/runtime/kernel-8d0db21b-3ec1-4b17-987c-be497d81b3c5.json']
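That `-f .../kernel-....json` argument is injected by the Jupyter kernel, not by the training script, and it reaches the parser because `parse_args_into_dataclasses()` falls back to `sys.argv`. One workaround, sketched here with hypothetical argument values, is to pass the argument list explicitly when running in a notebook:

```python
# Sketch for notebook use: pass an explicit args list so the parser never
# sees the "-f /root/.local/share/jupyter/runtime/kernel-....json" flag
# that the Jupyter kernel adds to sys.argv. Argument values are examples.
from transformers import HfArgumentParser, TrainingArguments

parser = HfArgumentParser(TrainingArguments)
(training_args,) = parser.parse_args_into_dataclasses(
    args=[
        "--output_dir", "/tmp/test-clm",
        "--per_device_train_batch_size", "4",
        "--max_steps", "200",
    ]
)
print(training_args.max_steps)  # 200
```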
> You might try migrating to torchrun? i.e.:
>
> ```bash
> torchrun --nproc_per_node 2 examples/pytorch/language-modeling/run_clm.py \
> --model_name_or_path gpt2 --dataset_name wikitext --dataset_config_name wikitext-2-raw-v1 \
> --do_train --output_dir /tmp/test-clm --per_device_train_batch_size 4 --max_steps 200
> ```
>
> For reference on migrating: https://pytorch.org/docs/stable/elastic/run.html

Thanks, it works for me.
Can it run on Colab? I can't get it to work there.
ValueError: Some specified arguments are not used by the HfArgumentParser: ['--only_optimize_lora']
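This is the same class of error: `--only_optimize_lora` is not a field on any dataclass handed to the parser, so it ends up in `remaining_args`. One fix, sketched with a hypothetical dataclass (`LoraArguments` here is illustrative, not from any library), is to declare the flag yourself so `HfArgumentParser` registers it:

```python
# Hypothetical sketch: declare the unrecognized flag as a dataclass field
# so HfArgumentParser registers a matching --only_optimize_lora option.
# For a bool field, passing the bare flag sets it to True.
from dataclasses import dataclass, field

from transformers import HfArgumentParser


@dataclass
class LoraArguments:  # hypothetical container for the custom flag
    only_optimize_lora: bool = field(
        default=False,
        metadata={"help": "Only optimize the LoRA parameters."},
    )


parser = HfArgumentParser(LoraArguments)
(lora_args,) = parser.parse_args_into_dataclasses(args=["--only_optimize_lora"])
print(lora_args.only_optimize_lora)  # True
```

Alternatively, `parse_args_into_dataclasses(return_remaining_strings=True)` returns unknown arguments instead of raising, if you'd rather collect and handle them yourself.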