colpali
Use the ViDoRe benchmark to monitor performance during training
Code to monitor real retrieval metrics on datasets (e.g. the ViDoRe benchmark) during training.
This feature is disabled by default and is designed for power users.
To use it, simply add the following to your training config:
vidore_eval_frequency: 200  # frequency of the benchmark eval
eval_dataset_format: "qa"   # format of the benchmark datasets ("qa" or "beir")
An example can be found at scripts/configs/qwen2/train_colqwen2_model_eval_vidore.yaml.
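For reference, here is a minimal sketch of how such a periodic eval could be wired in as a Hugging Face `TrainerCallback`; the class name, constructor argument, and placeholder body below are illustrative assumptions, not the actual colpali implementation:

```python
from transformers import TrainerCallback


class VidoreEvalCallback(TrainerCallback):
    """Illustrative callback that runs a ViDoRe-style eval every N training steps.

    `vidore_eval_frequency` mirrors the config option above; the actual metric
    computation is left as a placeholder since it depends on vidore-benchmark.
    """

    def __init__(self, vidore_eval_frequency: int = 200):
        self.vidore_eval_frequency = vidore_eval_frequency

    def on_step_end(self, args, state, control, **kwargs):
        # Only trigger the (potentially expensive) benchmark eval periodically.
        if state.global_step > 0 and state.global_step % self.vidore_eval_frequency == 0:
            # Placeholder: run the ViDoRe evaluator on kwargs["model"] here
            # and log the retrieval metrics with your preferred logger.
            pass
        return control
```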
Recap from our conversation 👋🏼
Let's:
- remove the legacy evaluation code
- add optional training arg `run_vidore_evaluator`: if False, do not add the custom callback
- add optional training args for `vidore_eval_dataset_name` and `vidore_eval_collection_name` (if both are fed, raise an error)
- add optional training arg to control how often the eval will run (e.g. once every 5 eval steps) — see the sketch after this list
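A rough sketch of what those optional training args and the mutual-exclusion check could look like; the names follow the list above, but this is an assumption, not the merged implementation:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VidoreEvalArgs:
    """Illustrative bundle of the optional ViDoRe-eval training args discussed above."""

    run_vidore_evaluator: bool = False  # if False, do not add the custom callback
    vidore_eval_dataset_name: Optional[str] = None
    vidore_eval_collection_name: Optional[str] = None
    # Run the benchmark eval once every N eval steps (e.g. 5).
    vidore_eval_every_n_eval_steps: int = 5

    def __post_init__(self):
        # The dataset name and the collection name are mutually exclusive.
        if self.vidore_eval_dataset_name and self.vidore_eval_collection_name:
            raise ValueError(
                "Provide either vidore_eval_dataset_name or vidore_eval_collection_name, not both."
            )
```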
@QuentinJGMace vidore-benchmark v5.0.0 has been released, don't forget to bump this dep in pyproject.toml 😉
@QuentinJGMace @tonywu71 any updates?