data-juicer
Evalscope evaluator & MedEval evaluator for dj-sandbox
Add EvalscopeEvaluator and MedEvaluator to EvaluateModelHook in DJ-Sandbox. Implement the following features:
- Enable LLM evaluation through evalscope capabilities
- One-stop solution to launch MedEval workflow and generate corresponding radar charts
[WIP] Optimize file structure and evaluator class architecture
Please merge the latest main branch.
There may be a pre-commit inconsistency affecting the following three files. When I run pre-commit locally, `import wandb` is forced below `import yaml`, but in CI `import wandb` is placed above `import yaml`. I have already run `pre-commit clean` and `pre-commit install`.
data_juicer/core/sandbox/pipelines.py
tools/evaluator/recorder/wandb_writer.py
tools/hpo/execute_hpo_wandb.py
For now, I have manually adjusted the import order in these files to match the CI pre-commit result.
This is probably because you ran something that uses wandb locally, which left a `wandb` directory in the root of Data-Juicer. When you run pre-commit locally, it treats `wandb` as a local module and therefore sorts it after the third-party imports. A clean Data-Juicer checkout has no such directory, so pre-commit in CI treats `wandb` as a regular third-party library and places it earlier.
The best way to keep local and CI results aligned is to delete the local `wandb` directory before running pre-commit.
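
To illustrate the behaviour described above, here is a minimal sketch of the heuristic, not the actual import sorter's code. It assumes the sorter decides that a module is local when a directory with the same name exists in the repository root; `classify_import` is a hypothetical helper written only for this example.

```python
import tempfile
from pathlib import Path


def classify_import(module, repo_root):
    """Toy heuristic (assumption): a same-named top-level directory
    makes a module look local instead of third-party."""
    return "local" if (Path(repo_root) / module).is_dir() else "third-party"


with tempfile.TemporaryDirectory() as repo:
    # Clean checkout, as in CI: no wandb directory in the repo root.
    clean = classify_import("wandb", repo)
    # After a local wandb run, a wandb/ directory exists in the repo root.
    (Path(repo) / "wandb").mkdir()
    dirty = classify_import("wandb", repo)

print(clean, dirty)  # third-party local
```

Under this assumption, the same `import wandb` line ends up in a different import group depending only on whether the stray directory exists, which matches the difference between the local and CI results.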