Benjamin Bossan

Results: 819 comments of Benjamin Bossan

Sorry for the delay. Zach is currently out of office but I'm sure he'll look into it when he's back.

@muellerzr Could you check on this issue again?

It's true that we don't explicitly list huggingface_hub as a requirement, but it is an indirect requirement (e.g. it's a requirement of accelerate, which in turn is a requirement of PEFT). For your...
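As a quick illustration of such transitive dependencies, the declared direct requirements of an installed package can be inspected with `importlib.metadata` from the standard library (the helper below and its parsing are a sketch, not part of PEFT):

```python
from importlib.metadata import requires

def dist_name(requirement):
    """Extract just the distribution name from a requirement string,
    e.g. 'huggingface-hub>=0.21.0; extra == "dev"' -> 'huggingface-hub'."""
    head = requirement.split(";")[0].strip()
    for sep in (" ", ">=", "==", "<", "!", "~"):
        head = head.split(sep)[0]
    return head

def direct_requirements(package):
    """Names of a package's declared direct dependencies (empty if none)."""
    return [dist_name(r) for r in (requires(package) or [])]

# huggingface_hub would be expected to show up under accelerate,
# not necessarily under peft itself:
# print(direct_requirements("accelerate"))
```

Running `direct_requirements("peft")` vs. `direct_requirements("accelerate")` in your environment should make the indirect relationship visible.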

Yeah, it's the same issue most likely, with the fix I mentioned above. What version of `huggingface_hub` are you using? Could you try downgrading it?

For this to work with bitsandbytes, you need to implement a different class of layers specific to quantized weights; see for instance this in PEFT: https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/bnb.py However, be aware that...
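To show the general idea of such a layer class (this is a standalone NumPy sketch with hypothetical names, not the bitsandbytes/PEFT implementation, which uses fused int8 kernels): the base weight stays frozen in quantized form, and only the float LoRA matrices A and B are trained on top of it.

```python
import numpy as np

class QuantizedBaseLoRA:
    """Sketch: LoRA adapter on top of a frozen, int8-quantized base weight."""

    def __init__(self, w_fp32, r=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        # "Quantize" the frozen base weight to int8 with a per-tensor scale.
        self.scale = np.abs(w_fp32).max() / 127.0
        self.w_int8 = np.round(w_fp32 / self.scale).astype(np.int8)
        out_f, in_f = w_fp32.shape
        # Trainable low-rank adapters kept in float.
        self.A = rng.standard_normal((r, in_f)) * 0.01
        self.B = np.zeros((out_f, r))  # zero init: adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        # Dequantize on the fly for the base path (real kernels fuse this step).
        w = self.w_int8.astype(np.float32) * self.scale
        base = x @ w.T
        lora = (x @ self.A.T) @ self.B.T * self.scaling
        return base + lora
```

With B initialized to zero, the layer initially reproduces the (quantized) base output, which is the same invariant the real LoRA layers rely on.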

Indeed, using LoRA does not necessarily reduce the training time. On the one hand, there are fewer gradients to compute, which should help; on the other hand, LoRA adds extra computation, which...

> Therefore, while LoRA reduces memory consumption, it does not decrease training time. I agree; this is the main goal. However, I find that in practice, LoRA often reduces training...
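To put rough numbers on the trade-off discussed above, here is a small back-of-the-envelope comparison of trainable parameter counts (the function and the example dimensions are illustrative, not taken from any particular model):

```python
def lora_param_counts(d_in, d_out, r):
    """Trainable parameters: full fine-tuning of one linear layer vs. LoRA
    with rank r (A has shape (r, d_in), B has shape (d_out, r))."""
    full = d_in * d_out
    lora = r * (d_in + d_out)
    return full, lora

full, lora = lora_param_counts(4096, 4096, 8)
# LoRA trains a tiny fraction of the parameters, so optimizer state and
# gradient memory shrink dramatically -- but the forward pass now includes
# two extra matmuls per adapted layer, which is the added compute cost.
```

For a 4096x4096 layer at rank 8 that's 65,536 trainable parameters instead of ~16.8M, which explains the memory savings, while the extra matmuls explain why wall-clock time does not automatically drop.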

@Whadup Do you still plan on working on this?

It's not quite clear to me, but it appears like AutoAWQ will be integrated into [llm-compressor](https://github.com/vllm-project/llm-compressor): > AutoAWQ Integration: Perform low-bit weight-only quantization efficiently using AutoAWQ, now part of LLM...