4-bit LLM Quantization with GPTQ: tokenizer step hangs
I'm trying to run the 4-bit LLM Quantization with GPTQ notebook with my own fine-tuned Llama 2 7B model. However, it gets stuck at the tokenization step:

tokenized_data = tokenizer("\n\n".join(data['text']), return_tensors='pt')

I already tried the tokenizer from the merged fine-tuned model as well as the tokenizer from the original Llama 2 repo, but it still hangs at this step. I'd appreciate any help or tips on how to fix this.
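
For reference, one possible cause: joining every calibration sample into a single giant string can make the slow (pure-Python) tokenizer appear to hang. A minimal sketch of a workaround, assuming the C4 calibration subset used in many GPTQ examples and a hypothetical path to the merged model, is to load the fast tokenizer explicitly, cap the number of samples, and tokenize per sample instead of one concatenated string:

from datasets import load_dataset
from transformers import AutoTokenizer

# Hypothetical path to the merged fine-tuned model
model_id = "path/to/merged-llama2-7b"

# Force the fast (Rust-based) tokenizer, which handles long inputs much better
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# C4 calibration subset commonly used in GPTQ examples (an assumption here)
data = load_dataset(
    "allenai/c4",
    data_files="en/c4-train.00001-of-01024.json.gz",
    split="train",
)
data = data.select(range(1024))  # cap the number of calibration samples

# Tokenize sample by sample instead of one huge concatenated string
examples = [tokenizer(sample["text"], return_tensors="pt") for sample in data]

If the per-sample loop completes quickly, the hang was in tokenizing the single concatenated string rather than in the tokenizer itself.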
Can you send me the exact error you're getting?