Benjamin Bossan comments

Results: 1181 comments of Benjamin Bossan

Comparison of Different Fine-Tuning Techniques for Conversational AI

> My understanding is that the goal is not necessarily to find configurations that outperform the defaults, but to enrich the comparison table by testing additional hyperparameter combinations for a...

Comparison of Different Fine-Tuning Techniques for Conversational AI

Great, thanks @yuhongwang-xd. Just a note: AdaLoRA might actually be quite hard to optimize because it is a bit special with its different training phases. Maybe you can start with...

Comparison of Different Fine-Tuning Techniques for Conversational AI

@rp440 Are there any specific topics you're interested in? Maybe you can coordinate with @yuhongwang-xd on this. As a more general note, following [this blog post](https://thinkingmachines.ai/blog/lora/), I did some experiments...

AttributeError: 'LlamaForCausalLM' object has no attribute 'add_weighted_adapter'

Could you please share the code that results in this error?

AttributeError: 'LlamaForCausalLM' object has no attribute 'add_weighted_adapter'

I see, thanks. `add_weighted_adapter` is not supported on transformers models directly, only on PEFT models. Could you change the code like so and check if this works: ```python model =...

[FEAT] Integrate LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

Thank you for your proposal to add LoRETTA to PEFT. The paper looks interesting enough from a methodology point of view, so it could be a good fit. Before you...

[FEAT] Integrate LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

@mbaddar1 There is no need to forward the conversation. If you've got the okay from Yifan Yang, just go ahead and start the PR. > I will start a PR...

[FEAT] Integrate LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models

There is no specific tutorial for adding new PEFT methods. I think it's easiest if you check one of the past additions, e.g. #2678 is a recent one. As a...

DoRA slow forward inference

@phemw I was at a conference, hence the late reply. You are right that during inference, the weight norm of DoRA is needlessly recomputed each time. I created a draft...

DoRA slow forward inference

I tried `meta-llama/Llama-3.1-8B` and the difference is bigger, but still small overall (16%). It would be great if you could test on your use case, so that we can identify...
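
To illustrate the point from the DoRA comments above (the weight norm being recomputed on every forward pass during inference), here is a minimal pure-Python sketch. It is not PEFT's implementation and not the draft mentioned above: the class `DoraLinearSketch`, its `use_cache` flag, and the `norm_computations` counter are hypothetical names made up for this example. It assumes the DoRA formulation `y = (m / ||W0 + s * B A||) * (W0 + s * B A) x` with one norm per output row, and it assumes the weights are frozen at inference so the scale can be reused.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


class DoraLinearSketch:
    """Hypothetical toy DoRA layer; illustration only, not PEFT's API."""

    def __init__(self, base_weight, lora_a, lora_b, magnitude, scaling=1.0):
        self.base_weight = base_weight  # shape: out x in
        self.lora_a = lora_a            # shape: r x in
        self.lora_b = lora_b            # shape: out x r
        self.magnitude = magnitude      # one learned magnitude per output row
        self.scaling = scaling
        self._cached_scale = None
        self.norm_computations = 0      # counts how often the norm is computed

    def _merged_weight(self):
        # W0 + scaling * (B @ A)
        delta = matmul(self.lora_b, self.lora_a)
        return [
            [w + self.scaling * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.base_weight, delta)
        ]

    def _scale(self, merged, use_cache):
        # Reuse the stored scale only if caching was requested and it exists.
        if use_cache and self._cached_scale is not None:
            return self._cached_scale
        self.norm_computations += 1
        norms = [math.sqrt(sum(x * x for x in row)) for row in merged]
        scale = [m / n for m, n in zip(self.magnitude, norms)]
        if use_cache:
            self._cached_scale = scale
        return scale

    def forward(self, x, use_cache=False):
        merged = self._merged_weight()
        scale = self._scale(merged, use_cache)
        return [s * sum(w * xi for w, xi in zip(row, x)) for s, row in zip(scale, merged)]


# Usage: B is zero-initialized, so the merged weight equals the base weight.
layer = DoraLinearSketch(
    base_weight=[[3.0, 4.0], [0.0, 2.0]],
    lora_a=[[0.0, 0.0]],
    lora_b=[[0.0], [0.0]],
    magnitude=[5.0, 2.0],
)
uncached = [layer.forward([1.0, 1.0]) for _ in range(2)]                 # norm computed twice
cached = [layer.forward([1.0, 1.0], use_cache=True) for _ in range(2)]   # norm computed once
```

Both paths return the same output; only the number of norm computations differs. A cache like this is only valid while the weights stay fixed, so it would have to be invalidated whenever training resumes or the adapter changes.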