Thomas Capelle comments

Results: 169 comments of Thomas Capelle

Compute grad norm

We should also add this parameter to the recipe YAMLs; the default should be 1.0, as in HF and axolotl.

Compute grad norm

I need to add `max_norm: 1.0` to the recipes. Do you have any trick to do this automatically?
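One possible way to add the key in bulk is a small script that walks the recipe directory and appends the default to every YAML file that lacks it. This is only a sketch: the function name `add_default_key` is hypothetical, and it assumes `max_norm` is a top-level key in each file.

```python
from pathlib import Path


def add_default_key(recipe_dir, key="max_norm", value="1.0"):
    """Append `key: value` to every *.yaml file under recipe_dir that lacks it.

    Assumes the key belongs at the top level of each file (an assumption,
    not something the recipes are known to require).
    Returns the list of files that were modified.
    """
    changed = []
    for path in sorted(Path(recipe_dir).rglob("*.yaml")):
        text = path.read_text()
        # Skip files that already define the key at the top level.
        if any(line.startswith(f"{key}:") for line in text.splitlines()):
            continue
        # Make sure the new entry starts on its own line.
        sep = "" if (not text or text.endswith("\n")) else "\n"
        path.write_text(f"{text}{sep}{key}: {value}\n")
        changed.append(path)
    return changed
```

Appending plain text instead of parsing and re-dumping the YAML keeps existing comments and key order untouched, at the cost of only handling top-level keys.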

I feel that we should have grad_clip enabled by default. The idea is to provide a good fine-tuning recipe out of the box, and grad_norm is also a good debugging tool. [This](https://wandb.ai/llm_surgery/mistral_zephyr_v2/reports/Checking-Spiking-inputs--Vmlldzo3MDY1ODMx) is...

Compute grad norm

Agree with everything above. I think we should wait and test whether `max_grad_norm` should be enabled by default in the recipes. I can change it now to `float("inf")`...
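To illustrate why an infinite threshold still gives a usable grad norm for logging, here is a plain-Python sketch of global-norm clipping. It mirrors the idea behind PyTorch's `torch.nn.utils.clip_grad_norm_`, but the helper `clip_grad_norm` below is a stand-in that works on nested lists, not the library function.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Scale `grads` in place so their global L2 norm is at most `max_norm`.

    `grads` is a list of lists of floats (one inner list per parameter).
    Returns the global norm measured before clipping, which can be logged.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # The scale never exceeds 1.0, so gradients are only ever shrunk.
    scale = min(1.0, max_norm / (total_norm + eps))
    for vec in grads:
        for i in range(len(vec)):
            vec[i] *= scale
    return total_norm


grads = [[3.0, 0.0], [0.0, 4.0]]
# With an infinite threshold the scale is 1.0: nothing is clipped,
# but the pre-clipping norm is still returned for monitoring.
norm = clip_grad_norm(grads, max_norm=float("inf"))
```

With `max_norm=1.0` the same call would rescale the gradients to a global norm of roughly 1.0 while still returning the original norm.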

[BUG] MlflowException: No Gateway server uri has been set. (while using a deployments server)

Ended up stuck here while following this example: https://mlflow.org/docs/latest/llms/prompt-engineering/index.html

regression

Yeah, that's it! Use as many outputs as there are variables to regress. If you have only one-dimensional regression, then `1` is it. My only takeaway is that most...

regression

> Thank you for the quick response, so let's say that I am hoping to use the pre-trained timesformer model for regression instead of classification, for example, using negative Pearson...

regression

Hope this clarifies my idea:

regression

Sorry, I can't help you with this. Maybe ask on the PyTorch forums?

regression

Sorry, I don't know.