AttributeError: 'tuple' object has no attribute 'load_in_8bit' while trying inference
I'm using the following code to test the inference pipeline, but I'm getting this error:
File "/home/envs/qlora_env/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2209, in from_pretrained
load_in_8bit = quantization_config.load_in_8bit
AttributeError: 'tuple' object has no attribute 'load_in_8bit'
I'm getting the same error whether I set
adapter_path = '/home/qlora/output2/checkpoint-100/adapter_model'
or
adapter_path = '/home/qlora/output2/checkpoint-100/'
I have included the contents of both of these folders at the bottom. I've also applied the changes suggested in #44 and here.
from collections import defaultdict
import copy
import json
import os
from os.path import exists, join, isdir
from dataclasses import dataclass, field
import sys
from typing import Optional, Dict, Sequence
import numpy as np
from tqdm import tqdm
import logging
import bitsandbytes as bnb
import torch
import transformers
from torch.nn.utils.rnn import pad_sequence
import argparse
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    set_seed,
    Seq2SeqTrainer,
    BitsAndBytesConfig,
    LlamaTokenizerFast
)
from datasets import load_dataset
import evaluate
from peft import (
    prepare_model_for_int8_training,
    LoraConfig,
    get_peft_model,
    get_peft_model_state_dict,
    PeftModel
)
from peft.tuners.lora import LoraLayer
from transformers.trainer_utils import PREFIX_CHECKPOINT_DIR
torch.backends.cuda.matmul.allow_tf32 = True
model_path = '/home/huggyllama/llama-7b'
adapter_path = '/home/qlora/output2/checkpoint-100/adapter_model'
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4'
),
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    load_in_4bit=True,
    quantization_config=quantization_config,
    torch_dtype=torch.float16,
    device_map='auto'
)
model = PeftModel.from_pretrained(model, adapter_path)
# model = model.merge_and_unload()
print(model)
This is the output of ls -la on the output2/checkpoint-100 folder:
total 3299968
drwxr-xr-x 3 work users 4096 May 29 05:13 .
drwxr-xr-x 5 work users 4096 May 29 05:20 ..
drwxr-xr-x 2 work users 4096 May 29 05:13 adapter_model
-rw-r--r-- 1 work users 21 May 29 05:13 added_tokens.json
-rw-r--r-- 1 work users 3376758277 May 29 05:13 optimizer.pt
-rw-r--r-- 1 work users 14575 May 29 05:13 rng_state.pth
-rw-r--r-- 1 work users 627 May 29 05:13 scheduler.pt
-rw-r--r-- 1 work users 96 May 29 05:13 special_tokens_map.json
-rw-r--r-- 1 work users 1842947 May 29 05:13 tokenizer.json
-rw-r--r-- 1 work users 499723 May 29 05:13 tokenizer.model
-rw-r--r-- 1 work users 727 May 29 05:13 tokenizer_config.json
-rw-r--r-- 1 work users 1604 May 29 05:13 trainer_state.json
-rw-r--r-- 1 work users 5627 May 29 05:13 training_args.bin
This is the output of ls -la on the output2/checkpoint-100/adapter_model folder:
total 624816
drwxr-xr-x 2 work users 4096 May 29 05:13 .
drwxr-xr-x 3 work users 4096 May 29 05:13 ..
-rw-r--r-- 1 work users 450 May 29 05:13 adapter_config.json
-rw-r--r-- 1 work users 639792909 May 29 05:13 adapter_model.bin
Any suggestions as to why I'm facing this issue?
Try removing the extraneous comma from the end of the line that initializes the quantization config (the line ending in `),`). That trailing comma turns the assignment into a one-element tuple whose first element is the BitsAndBytesConfig, so transformers fails when it tries to read load_in_8bit from it. Got to love Python.
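For reference, here is a minimal corrected sketch of the relevant part of your script, using the same paths and arguments you posted, just without the trailing comma:

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4'
)  # no trailing comma here, so this stays a BitsAndBytesConfig instead of a tuple

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    load_in_4bit=True,
    quantization_config=quantization_config,
    torch_dtype=torch.float16,
    device_map='auto'
)
model = PeftModel.from_pretrained(model, adapter_path)

A quick sanity check before loading the model is print(type(quantization_config)) — if it prints tuple rather than BitsAndBytesConfig, the stray comma is still there.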