stanford_alpaca

ValueError: Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0

Open GUORUIWANG opened this issue 2 years ago • 10 comments

Is an Ampere GPU with CUDA 11.0 a necessary condition? How can I solve this error? Thank you.

GUORUIWANG avatar Mar 24 '23 01:03 GUORUIWANG

I have encountered the same problem too.

aceai84 avatar Mar 24 '23 06:03 aceai84

Is an Ampere GPU with CUDA 11.0 a necessary condition? How can I solve this error? Thank you.

Do you use a V100? It doesn't support bf16.
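
You can check what your setup supports with something like this (a minimal sketch assuming PyTorch is installed and CUDA is available; it is not from the Alpaca code):

import torch

# bf16 needs an Ampere-or-newer GPU (compute capability >= 8.0) and CUDA >= 11.0.
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_capability(0))  # (7, 0) on a V100, (8, 0) or higher on Ampere
print(torch.cuda.is_bf16_supported())       # False on a V100

If is_bf16_supported() returns False, use fp16 instead of bf16.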

daiyongya avatar Mar 24 '23 11:03 daiyongya

Change bf16 to fp16 for non-Ampere GPUs.

bingjie3216 avatar Mar 25 '23 18:03 bingjie3216

How?

peter-ch avatar Jun 02 '23 11:06 peter-ch

My code is below:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama-7-int4-dolly",
    num_train_epochs=3,
    per_device_train_batch_size=6 if use_flash_attention else 4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=10,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    disable_tqdm=True  # disable tqdm since with packing the reported values are incorrect
)

I have changed them as below and it works:

    bf16=False,
    tf32=False,
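
If you want to avoid hard-coding this, here is a minimal sketch that picks the precision flags from the detected hardware (only the precision-related arguments plus a few from the example above are shown; adapt the rest as needed):

import torch
from transformers import TrainingArguments

# Use bf16/tf32 only when the GPU actually supports them (Ampere or newer);
# otherwise fall back to fp16 mixed precision.
use_bf16 = torch.cuda.is_available() and torch.cuda.is_bf16_supported()

args = TrainingArguments(
    output_dir="llama-7-int4-dolly",
    learning_rate=2e-4,
    bf16=use_bf16,
    fp16=not use_bf16,  # supported on older GPUs such as the V100
    tf32=use_bf16,      # tf32 is also an Ampere-or-newer feature
)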

sarankup avatar Jul 31 '23 10:07 sarankup

I changed bf16=True to fp16=True in TrainingArguments. That seemed to solve the problem.
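
For reference, a minimal sketch of that change (only the precision-related arguments shown; everything else stays the same as in your script):

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama-7-int4-dolly",
    fp16=True,   # mixed precision that pre-Ampere GPUs (e.g. V100) support
    bf16=False,  # bf16 requires an Ampere (or newer) GPU with CUDA >= 11.0
)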

NightMachinery avatar Dec 09 '23 03:12 NightMachinery

Change bf16 to fp16 for non-Ampere GPUs.

This answer solved it for me; no need for additional arguments, as recommended by other commenters.

StatsGary avatar Jan 02 '24 14:01 StatsGary

I changed to bf16=False and it worked for me.

abhigarg avatar Feb 17 '24 12:02 abhigarg