Simon Jégou comments

Results: 34 comments by Simon Jégou

Interpolation between 2 faces in dlatent space not as meaningful as it is in qlatent space

Here it is! It's quite quick and dirty, as I reload Gs every time I generate a new batch. But time does not really matter here as it converge...

Interpolation between 2 faces in dlatent space not as meaningful as it is in qlatent space

I don't have much time to work on this project, but it's great to know you made some progress! To answer a previous question, I noticed that faces recovered using gradient...

Improving initialization

@Quasimondo spotted a mistake in my code: in the `finetune_18` function, the `w_mix` argument is missing in the `get_batch` call of the training phase. So the function does nothing...

Improving initialization

Great job @rolux!! @pbaylies, as you studied this encoder question in depth, what is your main feedback? Does EfficientNet bring additional precision for initialization? What is the...

@tridao for more context, I recently published a post on the current Kaggle LLM science exam competition ([here](https://www.kaggle.com/competitions/kaggle-llm-science-exam/discussion/440620)) showing that it's possible to run a 70B model on a single...

Improving generalization of LoRA with wise-ft

Hello @BenjaminBossan, thanks for your quick answer. About DoRA, $W_{ft} = m_{dora} * (W_{base} + W_{delta})$ and $W_{wise} = (1-\alpha) * W_{base} + \alpha * W_{ft}$ so $W_{wise}...

Improving generalization of LoRA with wise-ft

I'm confident it will work reasonably well with DoRA too, since for $\alpha=0$ and $\alpha=1$ it returns the right results. However, I do not have any experimental data to prove...
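The endpoint argument above can be checked on toy numbers. This is only a minimal sketch in plain Python, using the two formulas quoted in the thread, $W_{ft} = m_{dora} * (W_{base} + W_{delta})$ and $W_{wise} = (1-\alpha) * W_{base} + \alpha * W_{ft}$. The helper names `dora_finetuned` and `wise_ft`, the per-column layout of `m_dora`, and all values are assumptions made for illustration; they are not taken from the PEFT code base.

```python
def dora_finetuned(w_base, w_delta, m_dora):
    """W_ft = m_dora * (W_base + W_delta), as written in the thread.

    Assumption: m_dora holds one magnitude per column of the weight matrix.
    """
    return [
        [m * (b + d) for b, d, m in zip(row_b, row_d, m_dora)]
        for row_b, row_d in zip(w_base, w_delta)
    ]


def wise_ft(w_base, w_ft, alpha):
    """W_wise = (1 - alpha) * W_base + alpha * W_ft, element by element."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(row_b, row_f)]
        for row_b, row_f in zip(w_base, w_ft)
    ]


# Toy 2x2 weights (hypothetical values).
w_base = [[1.0, 2.0], [3.0, 4.0]]
w_delta = [[0.5, -0.5], [0.25, 0.0]]
m_dora = [2.0, 0.5]

w_ft = dora_finetuned(w_base, w_delta, m_dora)

# alpha = 0 recovers the base weights, alpha = 1 the fine-tuned weights.
print(wise_ft(w_base, w_ft, 0.0) == w_base)
print(wise_ft(w_base, w_ft, 1.0) == w_ft)
# Intermediate alpha gives a weight-space blend of the two.
print(wise_ft(w_base, w_ft, 0.5))
```

The sketch only confirms the two endpoints; it says nothing about how well intermediate values of $\alpha$ behave with DoRA, which still needs experimental data.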

Improving generalization of LoRA with wise-ft

I might work on it, but I don't have immediate bandwidth. On Tue, Jul 23, 2024, 11:42, Benjamin Bossan ***@***.***> wrote: > Okay, too bad :-] Still I...

Improving generalization of LoRA with wise-ft

Hello, Many thanks to @ariG23498 for working on this feature and @BenjaminBossan for reviewing it. I used the code at the end of this message for some sanity checks: -...

Improving generalization of LoRA with wise-ft

@BenjaminBossan I'm running it on a macbook with `device="mps"`. The issue I get is simply:

```bash
assert torch.allclose(outputs["merged"], outputs["scale 1.0"])
AssertionError
```

When I get this error, the plot shows...
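The excerpt is cut off before it says what causes the mismatch, so the following is only a generic way to inspect such a failure, not a diagnosis. It is a plain-Python sketch of the element-wise rule that `torch.allclose` documents, $|a - b| \le atol + rtol \cdot |b|$ with defaults `rtol=1e-05` and `atol=1e-08`, plus the largest absolute difference. The helper names and sample values are hypothetical.

```python
def allclose(a, b, rtol=1e-05, atol=1e-08):
    """Same element-wise rule as torch.allclose: |a - b| <= atol + rtol * |b|."""
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


def max_abs_diff(a, b):
    """Largest element-wise absolute difference, useful to judge how far off two outputs are."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical flattened outputs that differ by about 1e-4 in one entry.
merged = [1.0, 2.0, 3.0]
scaled = [1.0, 2.0001, 3.0]

print(allclose(merged, scaled))             # fails with the default tolerances
print(allclose(merged, scaled, atol=1e-3))  # passes with a looser absolute tolerance
print(max_abs_diff(merged, scaled))
```

Printing the maximum difference before asserting shows whether two outputs disagree by numerical noise or by a real amount.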