autotrain-advanced
# [BUG] Duplicate flag generation: `__main__.py: error: unrecognized arguments: --mixed_precision bf16 -m autotrain.trainers.clm`
## Prerequisites
- [X] I have read the documentation.
- [X] I have checked other issues for similar problems.
## Backend

Local
## Interface Used

CLI
## CLI Command

```shell
autotrain app --host 0.0.0.0 --port 7000
```
## UI Screenshots & Parameters

No response
## Error Logs

```
__main__.py: error: unrecognized arguments: --mixed_precision bf16 -m autotrain.trainers.clm --mixed_precision bf16 -m autotrain.trainers.clm --mixed_precision fp16 -m autotrain.trainers.clm --mixed_precision fp16 -m autotrain.trainers.clm
INFO | 2024-10-19 23:01:18 | autotrain.commands:launch_command:524 - {'model': 'unsloth/Qwen2.5-Coder-7B-Instruct', 'project_name': 'autotrain-126tb-pvpyu4', 'data_path': 'skratos115/opendevin_DataDevinator', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 2048, 'model_max_length': 2048, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'lr': 1e-06, 'epochs': 1, 'batch_size': 1, 'warmup_ratio': 0.1, 'gradient_accumulation': 4, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': 'prompt', 'text_column': 'text', 'rejected_text_column': 'rejected_text', 'push_to_hub': True, 'username': 'unclemusclez', 'token': '*****', 'unsloth': True, 'distributed_backend': 'none'}
INFO | 2024-10-19 23:01:18 | autotrain.backends.local:create:25 - Training PID: 57326
INFO: 192.168.2.69:65250 - "POST /ui/create_project HTTP/1.1" 200 OK
INFO: 192.168.2.69:65250 - "GET /ui/is_model_training HTTP/1.1" 200 OK
INFO: 192.168.2.69:65250 - "GET /ui/is_model_training HTTP/1.1" 200 OK
INFO: 192.168.2.69:65250 - "GET /ui/accelerators HTTP/1.1" 200 OK
usage: __main__.py [-h] --training_config TRAINING_CONFIG
__main__.py: error: unrecognized arguments: --mixed_precision bf16 -m autotrain.trainers.clm --mixed_precision bf16 -m autotrain.trainers.clm --mixed_precision fp16 -m autotrain.trainers.clm --mixed_precision fp16 -m autotrain.trainers.clm
Traceback (most recent call last):
  File "/usr/local/open-webui/.venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/accelerate/commands/launch.py", line 1174, in launch_command
    simple_launcher(args)
  File "/usr/local/open-webui/.venv/lib/python3.12/site-packages/accelerate/commands/launch.py", line 769, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/local/open-webui/.venv/bin/python', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-126tb-pvpyu3/training_params.json', '--mixed_precision', 'bf16', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-126tb-pvpyu3/training_params.json', '--mixed_precision', 'bf16', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-126tb-pvpyu3/training_params.json', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-126tb-pvpyu4/training_params.json', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-126tb-pvpyu4/training_params.json']' returned non-zero exit status 2.
INFO: 192.168.2.69:65250 - "GET /ui/is_model_training HTTP/1.1" 200 OK
INFO | 2024-10-19 23:01:34 | autotrain.app.utils:get_running_jobs:40 - Killing PID: 57326
INFO | 2024-10-19 23:01:34 | autotrain.app.utils:kill_process_by_pid:90 - Sent SIGTERM to process with PID 57326
```
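The usage line in the log (`__main__.py [-h] --training_config TRAINING_CONFIG`) suggests the trainer's argument parser only defines `--training_config`, so any duplicated launch flags that leak into its argument list are rejected. A minimal sketch of that behavior, assuming a parser shaped like the usage line (the real parser in `autotrain.trainers.clm` may define more options):

```python
import argparse

# Minimal stand-in for the trainer-side parser implied by the usage line
# "__main__.py [-h] --training_config TRAINING_CONFIG". This is an
# illustration, not the actual autotrain.trainers.clm parser.
parser = argparse.ArgumentParser()
parser.add_argument("--training_config", required=True)

# parse_args() would exit with "unrecognized arguments" on the leaked flags;
# parse_known_args() shows which tokens the parser cannot match.
args, unknown = parser.parse_known_args(
    ["--training_config", "params.json",
     "--mixed_precision", "bf16", "-m", "autotrain.trainers.clm"]
)
print(unknown)  # the leftover flags that trigger the error in the log
```

This matches the log: everything after the first `--training_config` pair is unknown to the parser, so it aborts with exit status 2 and `accelerate` raises `CalledProcessError`.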
## Additional Information

Running locally. AutoTrain seems to duplicate the launch flags (`--mixed_precision ... -m autotrain.trainers.clm`), and another copy is appended every time training is run.
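One way this "grows by one copy per run" pattern can arise is a launch-command list that is shared across runs and extended in place instead of copied. This is a hypothetical sketch only; the function and variable names are invented and the actual cause in `autotrain.commands:launch_command` may be different:

```python
# Hypothetical illustration of flag duplication: if the base command list is a
# shared (module-level) object and each run appends to it rather than copying
# it, every run adds another "--mixed_precision ... -m ..." segment.
# Names below are invented; this is NOT autotrain's actual code.

BASE_CMD = ["accelerate", "launch"]

def build_launch_command_buggy(mixed_precision: str) -> list[str]:
    cmd = BASE_CMD  # bug: aliases the shared list instead of copying it
    cmd += ["--mixed_precision", mixed_precision, "-m", "autotrain.trainers.clm"]
    return cmd

def build_launch_command_fixed(mixed_precision: str) -> list[str]:
    cmd = list(BASE_CMD)  # fix: copy the base list before extending it
    cmd += ["--mixed_precision", mixed_precision, "-m", "autotrain.trainers.clm"]
    return cmd
```

With the buggy version, a second run produces a command containing two `--mixed_precision` segments, matching the shape of the failing command in the traceback above.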