Benjamin Bossan comments

Results: 795 comments by Benjamin Bossan

Improving generalization of LoRA with wise-ft

> I like the renaming and docstring suggestion.

> I can open a PR for the same.

That would be great, thanks.

> I'm running it on a macbook with...

Improving generalization of LoRA with wise-ft

Hmm, this is super strange. I double-checked: my graphs look identical with and without sleep (tested on both CUDA and CPU). Maybe this is an MPS-specific issue? Perhaps we should...

Improving generalization of LoRA with wise-ft

Interesting. If anyone else could try on their machine, so that we can collect more data on this issue, it would be great. Anyway, for now I guess the best...

Improving generalization of LoRA with wise-ft

> do you know when will be the next release of PEFT? I'd like to present this feature during the KDD conference on 28th August

There is no concrete...

LORA adapter with new embedding tokens has size and load issue

As a general remark: Gemma is unusual in that it uses a relatively big vocabulary. Therefore, the embedding layer is particularly large. This is especially noticeable with the smaller Gemma...
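To make the size argument concrete, here is a rough back-of-the-envelope sketch of how the vocabulary size drives the embedding layer's footprint. The vocabulary size, hidden size, and 2-byte dtype below are illustrative assumptions, not values taken from this thread or from any specific checkpoint, and the helper names are hypothetical.

```python
def embedding_param_count(vocab_size: int, hidden_size: int) -> int:
    """Number of weights in an embedding matrix of shape (vocab_size, hidden_size)."""
    return vocab_size * hidden_size


def size_in_mib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate storage size in MiB, assuming a fixed number of bytes per weight."""
    return num_params * bytes_per_param / 2**20


# Assumed, illustrative values for a small model with a large vocabulary.
vocab_size = 256_000
hidden_size = 2_048

num_params = embedding_param_count(vocab_size, hidden_size)
embedding_mib = size_in_mib(num_params)  # 2 bytes per weight, e.g. a 16-bit dtype

print(f"embedding parameters: {num_params:,}")
print(f"approximate size: {embedding_mib:.1f} MiB")
```

Under these assumptions, a checkpoint that stores a full copy of the embedding matrix grows by roughly this amount, which can dwarf the LoRA weights themselves.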

LORA adapter with new embedding tokens has size and load issue

For the question whether the embedding is saved as part of the checkpoint, setting `ensure_weight_tying` makes no difference. Note that if you have `modules_to_save=["embed_tokens"]`, it is required to save the...

bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming

> The code I wrote here should enable all PEFT checkpoints that are linked to models in the VLMs list (explicit mapping dictionary and remapping is specified) to function. I'm...

bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming

Thanks for the pointer. I ran my own tests to check all directions, using transformers `v4.49.0` as the old state, which should be from before the mentioned PR, and transformers...

bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming

> I'm guessing you could take the same list (called VLMs), and apply the mapping from the base model in the same way automatically?

Yes, this is true, I...

bugfix: propage weight key_mapping to peft to fix 3.52 VLM renaming

I created a PR to address this in PEFT: https://github.com/huggingface/peft/pull/2574. I used fake mini models to mimic the old and new model architectures for testing. The tests are still failing...