autotrain-advanced

Add metric_for_best_model="loss" as default in interface and add note on default metric in model card

Open MoritzLaurer opened this issue 5 months ago • 12 comments

Feature Request

  1. I'd suggest exposing `metric_for_best_model="loss"` as a default value in the hyperparameter interface, to make it clear to users that the default selection metric is the loss and to let them change it easily.

  2. I'd suggest adding an explicit note to the automatically generated model card stating which metric was used to choose the final model that's uploaded to the Hub. This ensures that users with a less technical background, or who didn't check the logs, understand that the uploaded model may not actually be the most accurate/performant model, only the one with the lowest loss.

Motivation

I understand that loss is a good default given the many different possible tasks and models. At the same time, there are many tasks (like classification) where loss is not the right metric for choosing the model. I'm afraid that many users will not make the effort of looking into the logs to see that autotrain may actually have trained a better model on the relevant metrics. The same applies to the model cards: explicitly stating which metric was used to select the model ensures that people are aware that autotrain may have produced other checkpoints with better scores on metrics other than loss.
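The concern above, that selecting by lowest eval loss can discard a checkpoint with better task metrics, can be illustrated with a small self-contained sketch (hypothetical per-epoch numbers, pure Python; `select_best` mimics best-checkpoint selection, it is not the autotrain API):

```python
# Hypothetical eval results where best-by-loss != best-by-f1_macro.
checkpoints = [
    {"epoch": 1, "eval_loss": 0.42, "f1_macro": 0.71},
    {"epoch": 2, "eval_loss": 0.35, "f1_macro": 0.78},  # lowest loss
    {"epoch": 3, "eval_loss": 0.38, "f1_macro": 0.83},  # highest f1_macro
]

def select_best(results, metric, greater_is_better):
    """Pick the checkpoint that wins on `metric` in the given direction."""
    key = lambda r: r[metric]
    return max(results, key=key) if greater_is_better else min(results, key=key)

by_loss = select_best(checkpoints, "eval_loss", greater_is_better=False)
by_f1 = select_best(checkpoints, "f1_macro", greater_is_better=True)
print(by_loss["epoch"])  # 2
print(by_f1["epoch"])    # 3
```

With these numbers, selecting by loss uploads the epoch-2 model even though the epoch-3 model scores noticeably higher on f1_macro, which is exactly what a model-card note about the selection metric would surface.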

Additional Context

No response

MoritzLaurer avatar Feb 09 '24 11:02 MoritzLaurer

FYI: I just did another training run and manually specified metric_for_best_model="f1_macro", but for some reason it still selected the model with the lowest loss. I'm not sure why. Here are the training parameters I entered in the UI:

```json
{
  "lr": 2e-5,
  "epochs": 10,
  "max_seq_length": 256,
  "metric_for_best_model": "f1_macro",
  "batch_size": 16,
  "warmup_ratio": 0.1,
  "gradient_accumulation": 1,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0,
  "max_grad_norm": 1,
  "seed": 42,
  "logging_steps": -1,
  "auto_find_batch_size": false,
  "mixed_precision": "fp16",
  "save_total_limit": 2,
  "save_strategy": "epoch",
  "evaluation_strategy": "epoch"
}
```
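For reference, in the Hugging Face `Trainer` that autotrain builds on, the selection metric only takes effect when `load_best_model_at_end=True`, and `greater_is_better` must match the metric's direction (`Trainer` compares `metric_for_best_model` against the `eval_`-prefixed key in the metrics dict). A sketch of what the equivalent `TrainingArguments` could look like follows; whether autotrain forwards these exact fields to the backend is an assumption, which is the likely cause of the behavior reported above:

```python
from transformers import TrainingArguments

# Sketch only: assumes the UI parameters map onto TrainingArguments unchanged.
args = TrainingArguments(
    output_dir="out",
    learning_rate=2e-5,
    num_train_epochs=10,
    per_device_train_batch_size=16,
    evaluation_strategy="epoch",       # evaluate each epoch so metrics exist
    save_strategy="epoch",             # must match evaluation_strategy
    load_best_model_at_end=True,       # required for best-checkpoint selection
    metric_for_best_model="f1_macro",  # matched against "eval_f1_macro"
    greater_is_better=True,            # f1_macro: higher is better
)
```

If the backend never receives `metric_for_best_model` (or leaves `load_best_model_at_end` unset), `Trainer` falls back to its defaults, which would explain the lowest-loss model being selected despite the UI setting.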

MoritzLaurer avatar Feb 09 '24 15:02 MoritzLaurer

Params not available in the backend cannot be used. I can work on adding this next week :)

abhishekkrthakur avatar Feb 09 '24 15:02 abhishekkrthakur

This issue is stale because it has been open for 15 days with no activity.

github-actions[bot] avatar Mar 01 '24 15:03 github-actions[bot]

Hi @abhishekkrthakur, is there any update on this?

geegee4iee avatar Mar 05 '24 12:03 geegee4iee

Hopefully in the next release.

abhishekkrthakur avatar Mar 05 '24 12:03 abhishekkrthakur

This issue is stale because it has been open for 15 days with no activity.

github-actions[bot] avatar Mar 25 '24 15:03 github-actions[bot]

This issue was closed because it has been inactive for 2 days since being marked as stale.

github-actions[bot] avatar Apr 04 '24 15:04 github-actions[bot]

Reopening this, but no time pressure / immediate need on my side @abhishekkrthakur

MoritzLaurer avatar Apr 04 '24 17:04 MoritzLaurer

This issue is stale because it has been open for 15 days with no activity.

github-actions[bot] avatar Apr 26 '24 15:04 github-actions[bot]

open

abhishekkrthakur avatar Apr 26 '24 15:04 abhishekkrthakur