Benjamin Bossan
@phemw The PR is ready from my side, if you want to give this a try, LMK what you find. Note that the memory overhead of caching is quite significant...
> it looks like peft is trying to use the config object as a dict, but it is not a dict. is there a specific version of peft that needs...
Thanks for the report. If possible, could you please try out one thing: go into the PEFT source code and change the method like so:

```python
def get_param(self):
    param = ...
```
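Since the snippet above is cut off, here is a rough idea of the kind of defensive change being discussed, as a minimal sketch rather than the actual PEFT source: the helper name `get_model_config` and its call site are assumptions. The point is simply to accept either a plain `dict` or a `transformers` config object by falling back to `to_dict()`.

```python
# Minimal sketch, not the actual PEFT code: `get_model_config` is a
# hypothetical helper showing how a config object could be normalized
# to a dict before being indexed like one.
from transformers import PretrainedConfig


def get_model_config(model) -> dict:
    config = getattr(model, "config", {})
    if isinstance(config, dict):
        return config
    if isinstance(config, PretrainedConfig):
        # transformers config objects expose to_dict(); plain dicts do not
        return config.to_dict()
    # fall back to whatever dict-like representation the object offers
    return dict(config)
```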
@qiu-xiao-0330 I tried to reproduce the issue but was unsuccessful so far. Could you please tell me what model you're trying to train?
Thanks for the info @qiu-xiao-0330. Unfortunately, that model is too big for me to train (same with 30B), so I tried a smaller variant of the model, `"yujiepan/qwen3-vl-moe-tiny-random"`. For the...
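For reference, a reproduction attempt like this usually boils down to attaching LoRA to the small checkpoint and running a couple of training steps. A minimal sketch follows, assuming the tiny model loads via `AutoModelForCausalLM`, exposes `q_proj`/`v_proj` modules, and has a top-level `vocab_size` (all assumptions; the actual repro script is not shown here):

```python
# Rough repro sketch, not the actual script used: model class, target
# modules, and training details are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model_id = "yujiepan/qwen3-vl-moe-tiny-random"
model = AutoModelForCausalLM.from_pretrained(model_id)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed module names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# a couple of dummy steps to check that forward/backward works;
# vocab_size lookup may need adjusting for nested VL configs
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
dummy_input = torch.randint(0, model.config.vocab_size, (2, 16))
for _ in range(2):
    out = model(input_ids=dummy_input, labels=dummy_input)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```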
> 1. When I used the 30B model, I found that tensor parallelization was not enabled.

We haven't checked if/how PEFT targeting layers with tensor parallelism works. Could you ideally share...
_not stale_
Thanks @NouamaneELGueddarii. Check out the quantization support for LoRA, e.g. for [bitsandbytes](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/bnb.py). Also, feel free to open an early draft PR in case you encounter any roadblocks.
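To give a feel for the pattern in that file: a quantized-backend LoRA layer keeps the quantized base layer frozen and adds the trainable low-rank update on top of its output in `forward`. The sketch below is a simplified, generic illustration, not PEFT's actual `bnb.py` code; the class name `QuantizedLoraLinear` and the use of a plain module as the base layer are assumptions.

```python
# Simplified illustration of the quantized-LoRA pattern, not PEFT's bnb.py:
# the frozen base layer (in practice a bitsandbytes Linear8bitLt/Linear4bit)
# is left untouched, and a small trainable low-rank update is added on top.
import torch
import torch.nn as nn


class QuantizedLoraLinear(nn.Module):  # hypothetical name
    def __init__(self, base_layer: nn.Module, in_features: int, out_features: int,
                 r: int = 8, lora_alpha: int = 16, lora_dropout: float = 0.0):
        super().__init__()
        self.base_layer = base_layer            # frozen (quantized) linear
        for p in self.base_layer.parameters():
            p.requires_grad = False
        self.lora_A = nn.Linear(in_features, r, bias=False)
        self.lora_B = nn.Linear(r, out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)      # start as a no-op update
        self.scaling = lora_alpha / r
        self.dropout = nn.Dropout(lora_dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        result = self.base_layer(x)             # quantized matmul, unchanged
        result = result + self.lora_B(self.lora_A(self.dropout(x))) * self.scaling
        return result
```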
Thanks for the suggestion. So from my understanding, if we wanted to use that functionality, it would involve a complete rewrite of the corresponding adapters. I think this is a...
What I could envision is to have a separate implementation with a different name. With a separate implementation, we can ensure:

- existing code not breaking
- not requiring full...