autotrain-advanced

Add metric_for_best_model="loss" as default in interface and add note on default metric in model card

Open MoritzLaurer opened this issue 5 months ago • 12 comments

Feature Request

  1. I'd suggest exposing `metric_for_best_model="loss"` as a default value in the hyperparameter interface, to make it clear to users that the default selection metric is the loss and to let them change it easily.

  2. I'd suggest adding an explicit note to the automatically generated model card stating which metric was used to choose the final model that's uploaded to the Hub. This ensures that users with a less technical background, or who didn't check the logs, understand that the uploaded model may not actually be the most accurate/performant model, only the one with the lowest loss.

Motivation

I understand that loss is a good default given the many different possible tasks and models. At the same time, there are many tasks (like classification) where loss is not the right metric for choosing the model. I'm afraid that many users will not make the effort of looking into the logs to see that autotrain may actually have trained a better model on the relevant metrics. The same applies to the model cards: explicitly stating which metric was used to select the model ensures that people are aware that autotrain may have produced other checkpoints with better scores on metrics other than loss.
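The concern above, that selecting by lowest eval loss can discard a checkpoint with better task metrics, can be illustrated with a small self-contained sketch (hypothetical per-epoch numbers, pure Python; `select_best` mimics best-checkpoint selection, it is not the autotrain API):

```python
# Hypothetical eval results where best-by-loss != best-by-f1_macro.
checkpoints = [
    {"epoch": 1, "eval_loss": 0.42, "f1_macro": 0.71},
    {"epoch": 2, "eval_loss": 0.35, "f1_macro": 0.78},  # lowest loss
    {"epoch": 3, "eval_loss": 0.38, "f1_macro": 0.83},  # highest f1_macro
]

def select_best(results, metric, greater_is_better):
    """Pick the checkpoint that wins on `metric` in the given direction."""
    key = lambda r: r[metric]
    return max(results, key=key) if greater_is_better else min(results, key=key)

by_loss = select_best(checkpoints, "eval_loss", greater_is_better=False)
by_f1 = select_best(checkpoints, "f1_macro", greater_is_better=True)
print(by_loss["epoch"])  # 2
print(by_f1["epoch"])    # 3
```

With these numbers, selecting by loss uploads the epoch-2 model even though the epoch-3 model scores noticeably higher on f1_macro, which is exactly what a model-card note about the selection metric would surface.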

Additional Context

No response

MoritzLaurer avatar Feb 09 '24 11:02 MoritzLaurer

FYI: I just did another training run and manually specified metric_for_best_model="f1_macro", but for some reason it still selected the model with the lowest loss. I'm not sure why. Here are the training parameters I entered in the UI:

```json
{
  "lr": 2e-5,
  "epochs": 10,
  "max_seq_length": 256,
  "metric_for_best_model": "f1_macro",
  "batch_size": 16,
  "warmup_ratio": 0.1,
  "gradient_accumulation": 1,
  "optimizer": "adamw_torch",
  "scheduler": "linear",
  "weight_decay": 0,
  "max_grad_norm": 1,
  "seed": 42,
  "logging_steps": -1,
  "auto_find_batch_size": false,
  "mixed_precision": "fp16",
  "save_total_limit": 2,
  "save_strategy": "epoch",
  "evaluation_strategy": "epoch"
}
```
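For reference, in the Hugging Face `Trainer` that autotrain builds on, the selection metric only takes effect when `load_best_model_at_end=True`, and `greater_is_better` must match the metric's direction (`Trainer` compares `metric_for_best_model` against the `eval_`-prefixed key in the metrics dict). A sketch of what the equivalent `TrainingArguments` could look like follows; whether autotrain forwards these exact fields to the backend is an assumption, which is the likely cause of the behavior reported above:

```python
from transformers import TrainingArguments

# Sketch only: assumes the UI parameters map onto TrainingArguments unchanged.
args = TrainingArguments(
    output_dir="out",
    learning_rate=2e-5,
    num_train_epochs=10,
    per_device_train_batch_size=16,
    evaluation_strategy="epoch",       # evaluate each epoch so metrics exist
    save_strategy="epoch",             # must match evaluation_strategy
    load_best_model_at_end=True,       # required for best-checkpoint selection
    metric_for_best_model="f1_macro",  # matched against "eval_f1_macro"
    greater_is_better=True,            # f1_macro: higher is better
)
```

If the backend never receives `metric_for_best_model` (or leaves `load_best_model_at_end` unset), `Trainer` falls back to its defaults, which would explain the lowest-loss model being selected despite the UI setting.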

MoritzLaurer avatar Feb 09 '24 15:02 MoritzLaurer

Params not available in the backend cannot be used. I can work on adding this next week :)

abhishekkrthakur avatar Feb 09 '24 15:02 abhishekkrthakur

This issue is stale because it has been open for 15 days with no activity.

github-actions[bot] avatar Mar 01 '24 15:03 github-actions[bot]

Hi @abhishekkrthakur, is there any update on this?

geegee4iee avatar Mar 05 '24 12:03 geegee4iee

Hopefully in the next release.

abhishekkrthakur avatar Mar 05 '24 12:03 abhishekkrthakur

This issue is stale because it has been open for 15 days with no activity.

github-actions[bot] avatar Mar 25 '24 15:03 github-actions[bot]

This issue was closed because it has been inactive for 2 days since being marked as stale.

github-actions[bot] avatar Apr 04 '24 15:04 github-actions[bot]

Reopening this, but no time pressure / immediate need on my side @abhishekkrthakur

MoritzLaurer avatar Apr 04 '24 17:04 MoritzLaurer

This issue is stale because it has been open for 15 days with no activity.

github-actions[bot] avatar Apr 26 '24 15:04 github-actions[bot]

open

abhishekkrthakur avatar Apr 26 '24 15:04 abhishekkrthakur