lkpy
Write new Experiment or ExperimentAnalysis code module
Right now RecListAnalysis is good but limited: it only computes per-user metrics.
It would help standardization of evaluation procedures if we had a more coherent "analyze" (and maybe "run") tool for experiments. The first version, of course, would just be for analysis.
- Specify experiment axes instead of inferring them?
- Support global metrics
- Specify list lengths as analysis parameter
- Support metrics with additional data (novelty, etc.)
- Clean up metric interface design
- Support analysis (sig tests, CIs, distributions, etc.)
- Support results in DuckDB?
This ticket is probably its own epic.
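To make a few of the bullets concrete (explicit axes, global metrics, list length as an analysis parameter), here is a rough sketch of what such an interface could look like. Nothing here is existing lkpy API: `ExperimentAnalysis`, `hit_rate`, `item_coverage`, and the row layout are all hypothetical names chosen for illustration, and it uses only the standard library rather than data frames to stay self-contained.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean


def hit_rate(recs, truth):
    """Per-user metric (hypothetical): 1.0 if any recommended item is relevant."""
    return 1.0 if any(item in truth for item in recs) else 0.0


def item_coverage(lists, truth):
    """Global metric (hypothetical): distinct items recommended across all users."""
    return float(len({item for recs in lists.values() for item in recs}))


@dataclass
class ExperimentAnalysis:
    """Hypothetical analysis object; not part of lkpy."""

    axes: tuple  # experiment axes, stated explicitly instead of inferred
    n: int       # list length, applied before any metric is computed
    user_metrics: dict = field(default_factory=dict)    # name -> f(recs, truth)
    global_metrics: dict = field(default_factory=dict)  # name -> f(lists, truth)

    def compute(self, recs, truth):
        """recs: dict rows with the axis keys plus 'user' and 'items'.
        truth: dict mapping user -> set of relevant items."""
        # Group the truncated lists by their position on the experiment axes.
        runs = defaultdict(dict)
        for row in recs:
            key = tuple(row[axis] for axis in self.axes)
            runs[key][row["user"]] = row["items"][: self.n]

        results = {}
        for key, lists in runs.items():
            out = {}
            # Per-user metrics are averaged over users within the run.
            for name, fn in self.user_metrics.items():
                out[name] = mean(
                    fn(items, truth.get(user, set())) for user, items in lists.items()
                )
            # Global metrics see every list in the run at once.
            for name, fn in self.global_metrics.items():
                out[name] = fn(lists, truth)
            results[key] = out
        return results


# Made-up toy data, for illustration only.
recs = [
    {"algo": "pop", "user": 1, "items": ["a", "b", "c"]},
    {"algo": "pop", "user": 2, "items": ["a", "b", "d"]},
    {"algo": "knn", "user": 1, "items": ["c", "a", "e"]},
    {"algo": "knn", "user": 2, "items": ["f", "d", "a"]},
]
truth = {1: {"c"}, 2: {"d"}}

analysis = ExperimentAnalysis(
    axes=("algo",),
    n=2,
    user_metrics={"hit": hit_rate},
    global_metrics={"coverage": item_coverage},
)
results = analysis.compute(recs, truth)
```

The point of the sketch is the shape of the interface: axes and list length are parameters of the analysis, and per-user and global metrics are registered separately so the global ones can see a whole run. Metrics needing additional data, significance tests, and a DuckDB backend are left out.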