# Benchmark existing techniques using evaluation harness
### Context on benchmark work
- Goal 1: give users practical guidance on which techniques to try on their own dataset and use case.
- Goal 2: show that there is no "silver bullet" solution; which technique works best depends on the dataset and the use case, but Haystack can support them all.
- Goal 3: showcase Haystack's advanced evaluation/experimentation API (the most advanced compared to competitors); a rough sketch of such a flow follows this list.
- This is not a research paper, so it should not be too "academic": it is not restricted in which metrics or datasets it uses, and it is not meant to be peer-reviewed or submitted to an academic conference.
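
As a rough illustration of goal 3, here is a minimal sketch of the kind of evaluation flow the study would exercise, using Haystack 2.x's built-in evaluators. The data is placeholder, and the choice of `SASEvaluator` plus `EvaluationRunResult` is an assumption about which building blocks the study would use; the evaluation harness referenced in the title would presumably wrap a flow like this.

```python
# Sketch only: placeholder data, and the evaluator/reporting choices here
# are assumptions about what the benchmark would use.
from haystack.components.evaluators import SASEvaluator
from haystack.evaluation import EvaluationRunResult

questions = ["What is the capital of France?"]
ground_truth = ["Paris is the capital of France."]
predicted = ["The capital of France is Paris."]  # e.g. output of a RAG pipeline

# Semantic Answer Similarity: embedding-based comparison of predicted
# answers against ground-truth answers.
sas = SASEvaluator(model="sentence-transformers/paraphrase-multilingual-mpnet-base-v2")
sas.warm_up()
sas_scores = sas.run(ground_truth_answers=ground_truth, predicted_answers=predicted)

# Collect per-question inputs and metric outputs into a single report.
report = EvaluationRunResult(
    run_name="baseline_rag",
    inputs={
        "question": questions,
        "ground_truth_answer": ground_truth,
        "predicted_answer": predicted,
    },
    results={"sas": sas_scores},
)
print(report.score_report())  # aggregate score per metric
print(report.to_pandas())     # per-question breakdown
```

Running this per technique and comparing the resulting reports is one plausible shape for the benchmark loop; the harness would repeat it across datasets and pipeline variants.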
### Tasks
- [x] Create a new repository that will have all the code for the benchmark study
- [ ] https://github.com/deepset-ai/haystack/issues/7628
- [ ] https://github.com/deepset-ai/haystack/issues/7629