
Use ViDoRe benchmark to monitor performance during training

Open QuentinJGMace opened this issue 10 months ago • 3 comments

Adds code to monitor real retrieval metrics on datasets (e.g. the ViDoRe benchmark) during training.

This feature is deactivated by default and is designed for power users.

To use it, add the following to your training config:

vidore_eval_frequency: 200 # frequency of the benchmark eval
eval_dataset_format: "qa" # format of the benchmark datasets (qa or beir)

An example can be found at scripts/configs/qwen2/train_colqwen2_model_eval_vidore.yaml
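
For readers who want to see how these two keys might be consumed, here is a minimal Python sketch. It is not the code from this PR: the `VidoreEvalSettings` class and `should_run_benchmark` helper are hypothetical names, and it assumes the frequency is counted in training steps, which the config comment does not state explicitly.

```python
from dataclasses import dataclass

# The two formats named in the config comment above.
ALLOWED_FORMATS = {"qa", "beir"}


@dataclass
class VidoreEvalSettings:
    """Hypothetical container for the two config keys (not the PR's class)."""

    vidore_eval_frequency: int = 200
    eval_dataset_format: str = "qa"

    def __post_init__(self) -> None:
        # Reject values the benchmark eval could not work with.
        if self.vidore_eval_frequency <= 0:
            raise ValueError("vidore_eval_frequency must be a positive integer")
        if self.eval_dataset_format not in ALLOWED_FORMATS:
            raise ValueError(
                f"eval_dataset_format must be one of {sorted(ALLOWED_FORMATS)}, "
                f"got {self.eval_dataset_format!r}"
            )


def should_run_benchmark(step: int, settings: VidoreEvalSettings) -> bool:
    """Assumption: the frequency is measured in training steps."""
    return step > 0 and step % settings.vidore_eval_frequency == 0
```

With the default settings, this gate would fire at steps 200, 400, and so on, and would skip step 0.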

QuentinJGMace avatar Feb 14 '25 15:02 QuentinJGMace

Recap from our conversation 👋🏼

Let's:

  • remove the legacy evaluation code
  • add optional training arg run_vidore_evaluator: if False, do not add the custom callback
  • add optional training args for vidore_eval_dataset_name and vidore_eval_collection_name (if both are provided, raise an error)
  • add optional training arg to control how often the eval will run (e.g. once every 5 eval steps).
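
A rough sketch of what the proposed arguments could look like, to make the list above concrete. Everything here is illustrative: the flag is spelled `run_vidore_evaluator`, `vidore_eval_every_n_evals` is a made-up name for the "how often" arg, and `EvalGate` is a hypothetical stand-in for the custom callback's bookkeeping rather than an actual Trainer callback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VidoreCallbackArgs:
    """Hypothetical training args mirroring the list above."""

    run_vidore_evaluator: bool = False  # if False, the callback is not added
    vidore_eval_dataset_name: Optional[str] = None
    vidore_eval_collection_name: Optional[str] = None
    vidore_eval_every_n_evals: int = 5  # made-up name: run once every N evals

    def __post_init__(self) -> None:
        # The dataset name and collection name are mutually exclusive.
        if self.vidore_eval_dataset_name and self.vidore_eval_collection_name:
            raise ValueError(
                "Provide either vidore_eval_dataset_name or "
                "vidore_eval_collection_name, not both."
            )
        if self.vidore_eval_every_n_evals < 1:
            raise ValueError("vidore_eval_every_n_evals must be >= 1")


class EvalGate:
    """Counts regular eval steps and says when the benchmark eval should run."""

    def __init__(self, args: VidoreCallbackArgs) -> None:
        self.args = args
        self._eval_count = 0

    def on_evaluate(self) -> bool:
        # With the evaluator disabled, the benchmark never runs.
        if not self.args.run_vidore_evaluator:
            return False
        self._eval_count += 1
        return self._eval_count % self.args.vidore_eval_every_n_evals == 0
```

In a real integration, the check in `on_evaluate` would presumably live inside the callback's evaluation hook, and the callback would only be registered when `run_vidore_evaluator` is true.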

tonywu71 avatar Feb 17 '25 13:02 tonywu71

@QuentinJGMace vidore-benchmark v5.0.0 has been released. Don't forget to bump this dep in pyproject.toml 😉

tonywu71 avatar Feb 19 '25 10:02 tonywu71

@QuentinJGMace @tonywu71 any updates?

ManuelFay avatar Apr 02 '25 09:04 ManuelFay