
Training qwen1.5-moe-A2.7B-chat-gptq-int4 on Windows 10 is very slow

Open catundchat opened this issue 2 months ago • 0 comments

Describe the bug

Running the following command in PowerShell:

D:\github\ENV\qwen\Scripts\python.exe d:\github\swift\swift\cli\sft.py `
    --model_type qwen1half-moe-a2_7b-chat-int4 `
    --model_id_or_path "D:\models\Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4" `
    --sft_type lora `
    --dtype AUTO `
    --output_dir "D:\github\swift\output" `
    --train_dataset_sample -1 `
    --num_train_epochs 3 `
    --max_length 1024 `
    --check_dataset_strategy warning `
    --lora_rank 8 `
    --lora_alpha 32 `
    --lora_dropout_p 0.05 `
    --lora_target_modules ALL `
    --gradient_checkpointing true `
    --batch_size 1 `
    --weight_decay 0.1 `
    --learning_rate 2e-5 `
    --gradient_accumulation_steps 16 `
    --max_grad_norm 1.0 `
    --warmup_ratio 0.03 `
    --eval_steps 50 `
    --save_steps 50 `
    --save_total_limit 3 `
    --logging_steps 10 `
    --use_flash_attn false `
    --self_cognition_sample 1000 `
    --model_name 风语诗人 'FengPoet' `
    --model_author 'Geoffery' `
    --custom_train_dataset_path "D:\dataset\poems_processed\rhyme\train_d1.jsonl" `
    --custom_val_dataset_path "D:\dataset\poems_processed\rhyme\val_d1.jsonl"

Training turned out to be very slow: the estimate is roughly 70+ hours.
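The 70+ hour figure can be sanity-checked from the tqdm progress bar in the log below (3936 total steps at about 70 s/step):

```python
# Sanity check of the training ETA: 3936 optimizer steps at ~70.12 s/it,
# the per-step time reported by the progress bar.
total_steps = 3936
secs_per_step = 70.12
hours = total_steps * secs_per_step / 3600
print(f"{hours:.1f} hours")  # ≈ 76.7 hours, matching the 76:37:20 ETA
```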

[INFO:swift] Model file config.json is different from the latest version `master`,This is because you are using an older version or the file is updated manually.
[INFO:swift] The SftArguments will be saved in: D:\github\swift\output\qwen1half-moe-a2_7b-chat-int4\v0-20240426-150438\sft_args.json
[INFO:swift] The Seq2SeqTrainingArguments will be saved in: D:\github\swift\output\qwen1half-moe-a2_7b-chat-int4\v0-20240426-150438\training_args.json
[INFO:swift] The logging file will be saved in: D:\github\swift\output\qwen1half-moe-a2_7b-chat-int4\v0-20240426-150438\logging.jsonl
Train:   0%|                                                                                                                                           | 0/3936 [00:00<?, ?it/s]2024-04-26 15:06:06,967 - modelscope - INFO - PyTorch version 2.3.0+cu118 Found.
2024-04-26 15:06:06,967 - modelscope - INFO - Loading ast index from C:\Users\Administrator\.cache\modelscope\ast_indexer
2024-04-26 15:06:07,013 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 81f3d6fd46847ddcf779e2d1e42341be and a total number of 976 components indexed
D:\github\ENV\qwen\lib\site-packages\transformers\models\qwen2_moe\modeling_qwen2_moe.py:775: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
{'loss': 5.03665495, 'acc': 0.35384867, 'grad_norm': nan, 'learning_rate': 0.0, 'epoch': 0.0, 'global_step': 1}
Train:   0%|                                                                                                                                | 2/3936 [02:21<76:37:20, 70.12s/it]
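The UserWarning above ("Torch was not compiled with flash attention") suggests the flash kernel is unavailable in this PyTorch build, so `scaled_dot_product_attention` falls back to a slower backend. As a diagnostic sketch (assuming the standard torch 2.x `torch.backends.cuda` query APIs), one can inspect which SDPA backends the installed build enables:

```python
import torch

# Inspect which scaled_dot_product_attention backends are enabled in this
# PyTorch build. On Windows cu118 wheels the flash kernel is commonly
# missing, so SDPA falls back to the memory-efficient or math kernels.
print("CUDA available:       ", torch.cuda.is_available())
print("flash SDP enabled:    ", torch.backends.cuda.flash_sdp_enabled())
print("mem-efficient enabled:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math SDP enabled:     ", torch.backends.cuda.math_sdp_enabled())
```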

Your hardware and system info

CUDA 11.8, Windows 10, GPU: RTX A5000, torch version: 2.3.0+cu118

Additional context

When training started, a warning said bitsandbytes was not installed, so I installed the Windows build with `python -m pip install bitsandbytes --prefer-binary --extra-index-url=https://jllllll.github.io/bitsandbytes-windows-webui`. I don't understand where the problem is; looking forward to a reply, thanks.
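As a quick sanity check (a sketch, not part of the original report), one can verify that the wheel installed above is actually discoverable by Python, since that is what the missing-package warning reflects:

```python
import importlib.util

# The "bitsandbytes not installed" warning fires when the package cannot be
# found; find_spec returns None in exactly that case, without importing it.
spec = importlib.util.find_spec("bitsandbytes")
print("bitsandbytes installed:", spec is not None)
```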

catundchat · Apr 26 '24 07:04