
[BUG] Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes

Open matthewfarant opened this issue 1 year ago • 14 comments

Prerequisites

  • [X] I have read the documentation.
  • [X] I have checked other issues for similar problems.

Backend

Local

Interface Used

UI

CLI Command

No response

UI Screenshots & Parameters

No response

Error Logs

ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes

❌ ERROR | 2024-02-26 10:11:57 | autotrain.trainers.common:wrapper:92 - Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes
🚀 INFO | 2024-02-26 10:11:57 | autotrain.trainers.common:pause_space:49 - Pausing space...
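A quick way to confirm which of the required packages are actually installed in the failing environment (a minimal sketch, independent of AutoTrain itself):

# Minimal environment check: transformers' bnb quantizer raises the
# ImportError above when accelerate or bitsandbytes cannot be found.
import importlib.metadata as md

for pkg in ("accelerate", "bitsandbytes", "transformers"):
    try:
        print(f"{pkg}=={md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg} is NOT installed")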

Additional Information

No response

matthewfarant avatar Feb 26 '24 10:02 matthewfarant

please paste params used and model

abhishekkrthakur avatar Feb 26 '24 10:02 abhishekkrthakur

Hi @abhishekkrthakur , these are the details:

Task = LLM SFT
Model = mistralai/Mixtral-8x7B-Instruct-v0.1

{
  "block_size": 1024,
  "model_max_length": 2048,
  "padding": "right",
  "use_flash_attention_2": false,
  "disable_gradient_checkpointing": false,
  "logging_steps": -1,
  "evaluation_strategy": "epoch",
  "save_total_limit": 1,
  "save_strategy": "epoch",
  "auto_find_batch_size": false,
  "mixed_precision": "fp16",
  "lr": 0.00003,
  "epochs": 3,
  "batch_size": 2,
  "warmup_ratio": 0.1,
  "gradient_accumulation": 1,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0,
  "max_grad_norm": 1,
  "seed": 42,
  "chat_template": "none",
  "quantization": "int4",
  "target_modules": "all-linear",
  "merge_adapter": false,
  "peft": true,
  "lora_r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05
}
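For reference, the "quantization": "int4" setting maps roughly to a bitsandbytes 4-bit load inside transformers; a sketch of the equivalent call (not AutoTrain's exact code), which is exactly the path where the ImportError above is raised:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit load; requires both accelerate and a recent bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",  # needs accelerate; fails without a CUDA GPU
)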

matthewfarant avatar Feb 26 '24 15:02 matthewfarant

are you running it on Windows? could you please tell me how you installed autotrain?

abhishekkrthakur avatar Feb 26 '24 16:02 abhishekkrthakur

I'm running it on the AutoTrain UI in Hugging Face Spaces @abhishekkrthakur (I chose AutoTrain's Docker template when building the HF Space)

matthewfarant avatar Feb 27 '24 04:02 matthewfarant

Same error here, running on the AutoTrain UI. I removed "mixed_precision": "fp16" since the Space runs on CPU, using the google/gemma model.

parameters:

{
  "block_size": 1024,
  "model_max_length": 2048,
  "padding": "right",
  "use_flash_attention_2": false,
  "disable_gradient_checkpointing": false,
  "logging_steps": -1,
  "evaluation_strategy": "epoch",
  "save_total_limit": 1,
  "save_strategy": "epoch",
  "auto_find_batch_size": false,
  "lr": 0.00003,
  "epochs": 3,
  "batch_size": 2,
  "warmup_ratio": 0.1,
  "gradient_accumulation": 1,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0,
  "max_grad_norm": 1,
  "seed": 42,
  "chat_template": "none",
  "quantization": "int4",
  "target_modules": "all-linear",
  "merge_adapter": false,
  "peft": true,
  "lora_r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05
}

flameface avatar Feb 28 '24 04:02 flameface

you should not remove any params. if you don't want mixed precision, set it to none:

mixed_precision: "none"

abhishekkrthakur avatar Feb 29 '24 07:02 abhishekkrthakur

(screenshot)

Still same error

flameface avatar Feb 29 '24 10:02 flameface

taking a look!

abhishekkrthakur avatar Feb 29 '24 10:02 abhishekkrthakur

hello?

flameface avatar Mar 03 '24 09:03 flameface

have you tried after that? some packages were updated this week. please factory rebuild your autotrain space before trying it.

abhishekkrthakur avatar Mar 03 '24 09:03 abhishekkrthakur

have you tried after that? some packages were updated this week. please factory rebuild your autotrain space before trying it.

Still getting the error as of now.

dragonAZH avatar Mar 03 '24 19:03 dragonAZH

(screenshot)

Still same.

flameface avatar Mar 04 '24 03:03 flameface

I'm using google/gemma-7b, could you try it?

Training Data: (data.csv)

text
"human: hello \n bot: id-chat hi nice to meet you"
"human: how are you \n bot: id-chat I am fine"
"human: generate an image of a cat \n bot: id-image a cute furry cat"

Column mapping:

{"text": "text"}

flameface avatar Mar 04 '24 03:03 flameface

I get this same dependency issue, please provide a fix.

❌ ERROR | 2024-03-04 11:17:08 | autotrain.trainers.common:wrapper:91 - train has failed due to an exception: Traceback (most recent call last):
  File "/app/env/lib/python3.10/site-packages/autotrain/trainers/common.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/app/env/lib/python3.10/site-packages/autotrain/trainers/clm/main.py", line 230, in train
    model = AutoModelForCausalLM.from_pretrained(
  File "/app/env/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
    return model_class.from_pretrained(
  File "/app/env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3024, in from_pretrained
    hf_******.validate_environment(
  File "/app/env/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 62, in validate_environment
    raise ImportError(
ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes

❌ ERROR | 2024-03-04 11:17:08 | autotrain.trainers.common:wrapper:92 - Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes
🚀 INFO | 2024-03-04 11:17:08 | autotrain.trainers.common:pause_space:49 - Pausing space...

sivakmar avatar Mar 04 '24 11:03 sivakmar

@abhishekkrthakur I am receiving the same error when running it on Google Colab: (screenshot)

ghost avatar Mar 08 '24 03:03 ghost

@SyntaxPratyush Same here. @abhishekkrthakur, can you please look into this?

SrushtiAckno avatar Mar 08 '24 04:03 SrushtiAckno

I also encountered the same issue

lkk117766 avatar Mar 08 '24 07:03 lkk117766

Someone says we need to downgrade the transformers library to version 4.30 in order to fix this error.

However, GemmaTokenizer needs transformers upgraded to version 4.38...!!

lkk117766 avatar Mar 08 '24 07:03 lkk117766

taking a look again.

abhishekkrthakur avatar Mar 08 '24 07:03 abhishekkrthakur

I spun up a new autotrain space, added an A10G GPU and I am able to train mistralai/Mistral-7B-v0.1 successfully. do you have this issue with a specific GPU or a specific model?

(screenshot)

abhishekkrthakur avatar Mar 08 '24 08:03 abhishekkrthakur

@abhishekkrthakur Could you please show me a detailed tutorial on how to do this with autotrain-advanced? There are no proper explanations of how to do it. I am having specific issues finding the proper format for train.csv and the column mapping; right now I am getting Error 500: Check Logs for more Info, and the logs are empty.

ghost avatar Mar 08 '24 10:03 ghost

@SyntaxPratyush here is a train.csv for llm task that you can try with: https://github.com/huggingface/autotrain-example-datasets/blob/main/alpaca1k.csv
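To peek at that file before uploading (a quick sketch; the raw URL is inferred from the repo link above):

import pandas as pd

# Inspect the example dataset's columns and a couple of rows
url = (
    "https://raw.githubusercontent.com/huggingface/"
    "autotrain-example-datasets/main/alpaca1k.csv"
)
df = pd.read_csv(url)
print(df.columns.tolist())
print(df.head(2))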

abhishekkrthakur avatar Mar 08 '24 10:03 abhishekkrthakur

@abhishekkrthakur column mapping pls

ghost avatar Mar 08 '24 11:03 ghost

you don't need to change anything in column mapping if you use that file. also, let's not hijack this thread as it's a completely different issue. you can post your queries in the Hugging Face forums and I can help there.

abhishekkrthakur avatar Mar 08 '24 11:03 abhishekkrthakur

ok thanks

ghost avatar Mar 08 '24 11:03 ghost

(screenshot) while running: (screenshot)

ghost avatar Mar 08 '24 11:03 ghost

which gpu did you use?

abhishekkrthakur avatar Mar 08 '24 11:03 abhishekkrthakur

I have a local Radeon Pro 575 and chose the free CPU at the beginning

ghost avatar Mar 08 '24 11:03 ghost

you cannot use peft and quantization on cpu. please select an appropriate GPU, e.g. A10G.
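A minimal pre-flight check along these lines (a sketch, not AutoTrain's own code):

import torch

# bitsandbytes int4/int8 quantization needs a CUDA GPU; on CPU-only
# hardware, turn quantization off before launching training
quantization = "int4" if torch.cuda.is_available() else "none"
print(f"quantization={quantization}")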

abhishekkrthakur avatar Mar 08 '24 11:03 abhishekkrthakur

i'm closing this issue as it's deviating a lot from the title and the originally reported issue doesn't exist. the error appears because users are trying to train GPU models on a CPU machine.

abhishekkrthakur avatar Mar 08 '24 11:03 abhishekkrthakur