abpani
ValueError: Unable to load checkpoint from Meta-Llama-3-8B/original/consolidated.00.pth. while running --config llama3/8B_qlora_single_device
Phi-3 is not supported.
I am trying to do SFT with a context length of 4096; the same thing works perfectly with Llama. The model and cache loading is balanced across all GPUs. But...
### System Info python 3.10.10 torch 2.3.1 transformers 4.43.2 optimum 1.17.1 auto_gptq 0.7.1 bitsandbytes 0.43.2 accelerate 0.33.0 Llama3.1 8B Instruct gets loaded like this. So I can't even go more...
### System Info Hello, I am trying to load Mistral-Nemo Instruct-2407 in bnb 4-bit on 4 A10 GPUs on an EC2 instance. I upgraded all the packages. Still I face CUDA...
### System Info ```Shell - `Accelerate` version: 0.34.2 - Platform: Linux-5.15.0-1035-aws-x86_64-with-glibc2.31 - `accelerate` bash location: /home/ubuntu/abpani/FundName/myenv/bin/accelerate - Python version: 3.10.14 - Numpy version: 2.1.1 - PyTorch version (GPU?): 2.4.1+cu121 (True)...
### Your current environment The output of `python collect_env.py` ```text RuntimeWarning: Failed to read commit hash: No module named 'vllm._version' from vllm.version import __version__ as VLLM_VERSION Collecting environment information... PyTorch...
### Your current environment The output of `python collect_env.py` ```text Your output of `python collect_env.py` here ``` ### 🐛 Describe the bug t_tokenizer = AutoTokenizer.from_pretrained(fund_name_model_dir) llm = LLM(model=fund_name_model_dir, quantization="bitsandbytes", load_format="bitsandbytes",...
TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated...
AttributeError: 'Qwen3ForCausalLM' object has no attribute '' while trying to fine-tune Qwen3 models
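The empty attribute name in that last error usually means some code ended up calling `getattr` (or an attribute lookup) with an empty string, e.g. an unset config field. A minimal sketch of how the same-shaped message arises, using a hypothetical stand-in class rather than the real `Qwen3ForCausalLM`:

```python
class Qwen3Stub:
    """Hypothetical stand-in for the model object (not the real Qwen3ForCausalLM)."""


# Looking up an empty attribute name reproduces the error shape:
# AttributeError: 'Qwen3Stub' object has no attribute ''
try:
    getattr(Qwen3Stub(), "")
except AttributeError as err:
    print(err)  # 'Qwen3Stub' object has no attribute ''
```

If this is the cause, the fix is upstream: find where the attribute name is built (often a config value) and ensure it is non-empty before the lookup.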