FastChat
Issue:"Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0"
My GPU setup is 8× Tesla V100 32G; the software environment is python3.10 + cuda11.6 + torch2.0.0 + transformers4.28.0.dev0. I run the fine-tuning command:
```bash
torchrun --nproc_per_node=8 --master_port=20001 fastchat/train/train_mem.py \
    --model_name_or_path /llama-13b \
    --data_path alpaca-data-conversation.json \
    --bf16 True \
    --output_dir output \
    --num_train_epochs 3 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 16 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 1200 \
    --save_total_limit 10 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --lazy_preprocess True
```
But this error is reported:

```
ValueError: Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0
```
How can I fix this?
Change `--bf16` to False; the V100 does not support bf16.
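If you want to verify this on your own machine, PyTorch's standard API reports the compute capability and bf16 support directly (a quick diagnostic, run from the training environment):

```bash
# V100 is compute capability (7, 0); bf16 requires Ampere, i.e. (8, 0) or newer.
python -c "import torch; print(torch.cuda.get_device_capability(0)); print(torch.cuda.is_bf16_supported())"
```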
Yes, @sgsdxzy is right. Please re-open the issue if you still see it.
Can you tell me how to install flash_attn? Thank you.
> Change `--bf16` to False; the V100 does not support bf16.

Thanks! I see it.
> flash_attn

I directly use the open-source Docker environment, which already contains flash_attn. But I think `pip3 install` should be straightforward, preferably using the `-i` parameter to specify a pip source.
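For example, a typical install might look like this (the package name is `flash-attn` on PyPI; the mirror URL below is only an illustration of `-i` usage, substitute whatever index works for you):

```bash
# Install flash-attn; -i points pip at an alternative package index/mirror.
pip3 install flash-attn -i https://pypi.tuna.tsinghua.edu.cn/simple
```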
Changing `--bf16` to False did not help me; I got another error: `ValueError: --tf32 requires Ampere or a newer GPU arch, cuda>=11 and torch>=1.7`. Any suggestions?
Me too. How do I fix it?
Set `--tf32` to False as well.
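Putting the thread's fixes together, the precision flags for a V100 run would change as follows (a sketch; `--fp16 True` is the usual V100 mixed-precision alternative via the standard HuggingFace TrainingArguments, not something confirmed in this thread):

```bash
# Only the precision-related flags change; keep the rest of the original command as-is.
torchrun --nproc_per_node=8 --master_port=20001 fastchat/train/train_mem.py \
    --bf16 False \
    --tf32 False \
    --fp16 True \
    --model_name_or_path /llama-13b \
    --data_path alpaca-data-conversation.json \
    --output_dir output
    # ...plus the remaining flags from the original command
```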
I'm using an A100 and also got this error. Weird...