AttributeError: 'tuple' object has no attribute 'load_in_8bit' while trying inference
I'm using the following code to test the inference pipeline, but I'm getting this error:
File "/home/envs/qlora_env/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2209, in from_pretrained
load_in_8bit = quantization_config.load_in_8bit
AttributeError: 'tuple' object has no attribute 'load_in_8bit'
I'm getting the same error whether I set
adapter_path = '/home/qlora/output2/checkpoint-100/adapter_model'
or
adapter_path = '/home/qlora/output2/checkpoint-100/'
I have included the contents of both of these folders at the bottom. I've also applied the changes suggested in #44 and here.
from collections import defaultdict
import copy
import json
import os
from os.path import exists, join, isdir
from dataclasses import dataclass, field
import sys
from typing import Optional, Dict, Sequence
import numpy as np
from tqdm import tqdm
import logging
import bitsandbytes as bnb
import torch
import transformers
from torch.nn.utils.rnn import pad_sequence
import argparse
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    set_seed,
    Seq2SeqTrainer,
    BitsAndBytesConfig,
    LlamaTokenizerFast
)
from datasets import load_dataset
import evaluate
from peft import (
    prepare_model_for_int8_training,
    LoraConfig,
    get_peft_model,
    get_peft_model_state_dict,
    PeftModel
)
from peft.tuners.lora import LoraLayer
from transformers.trainer_utils import PREFIX_CHECKPOINT_DIR
torch.backends.cuda.matmul.allow_tf32 = True
model_path = '/home/huggyllama/llama-7b'
adapter_path = '/home/qlora/output2/checkpoint-100/adapter_model'
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4'
),
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    load_in_4bit=True,
    quantization_config=quantization_config,
    torch_dtype=torch.float16,
    device_map='auto'
)
model = PeftModel.from_pretrained(model, adapter_path)
# model = model.merge_and_unload()
print(model)
This is the output of ls -la on the output2/checkpoint-100 folder:
total 3299968
drwxr-xr-x 3 work users 4096 May 29 05:13 .
drwxr-xr-x 5 work users 4096 May 29 05:20 ..
drwxr-xr-x 2 work users 4096 May 29 05:13 adapter_model
-rw-r--r-- 1 work users 21 May 29 05:13 added_tokens.json
-rw-r--r-- 1 work users 3376758277 May 29 05:13 optimizer.pt
-rw-r--r-- 1 work users 14575 May 29 05:13 rng_state.pth
-rw-r--r-- 1 work users 627 May 29 05:13 scheduler.pt
-rw-r--r-- 1 work users 96 May 29 05:13 special_tokens_map.json
-rw-r--r-- 1 work users 1842947 May 29 05:13 tokenizer.json
-rw-r--r-- 1 work users 499723 May 29 05:13 tokenizer.model
-rw-r--r-- 1 work users 727 May 29 05:13 tokenizer_config.json
-rw-r--r-- 1 work users 1604 May 29 05:13 trainer_state.json
-rw-r--r-- 1 work users 5627 May 29 05:13 training_args.bin
This is the output of ls -la on the output2/checkpoint-100/adapter_model folder:
total 624816
drwxr-xr-x 2 work users 4096 May 29 05:13 .
drwxr-xr-x 3 work users 4096 May 29 05:13 ..
-rw-r--r-- 1 work users 450 May 29 05:13 adapter_config.json
-rw-r--r-- 1 work users 639792909 May 29 05:13 adapter_model.bin
Any suggestions as to why I'm facing this issue?
Try removing the extraneous comma from the end of the line that initializes the quantization config (the line ending in `),`). That trailing comma turns the assignment into a one-element tuple whose first element is the BitsAndBytesConfig, so transformers fails when it tries to read load_in_8bit from it. Got to love Python.
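For reference, here is a minimal corrected sketch of the relevant part of your script, using the same paths and arguments you posted, just without the trailing comma:

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4'
)  # no trailing comma here, so this stays a BitsAndBytesConfig instead of a tuple

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    load_in_4bit=True,
    quantization_config=quantization_config,
    torch_dtype=torch.float16,
    device_map='auto'
)
model = PeftModel.from_pretrained(model, adapter_path)

A quick sanity check before loading the model is print(type(quantization_config)) — if it prints tuple rather than BitsAndBytesConfig, the stray comma is still there.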