tantivy
Investigate the bottleneck in the search benchmark using toplev
Toplev looks like a promising tool for identifying bottlenecks.
Let's use it and compare the results to our current champion, PISA. We want a separate report for intersection, union, and phrase queries.
Can you elaborate on what you expect from this issue?
- What is toplev? A web search didn't turn up anything obvious.
- What is the bottleneck to be investigated?
- What is PISA? I see it mentioned in the changelog. Is that the same thing?
@fulmicoton can you add a link for Toplev?
@scampi: the PISA project is here: https://github.com/pisa-engine/pisa
As for the bottlenecks, they can be IO, CPU, or both.
Take union term queries as an example: you will see in the benchmark that PISA is way faster: https://tantivy-search.github.io/bench/
I believe this comes from the fact that their WAND algorithm uses a better data structure to exclude documents that will not make it to the top 10. This is one example of a hypothesis that still needs to be confirmed.
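To make the pruning idea concrete (skipping documents that cannot make it to the top 10), here is a minimal sketch. It is not PISA's or tantivy's actual code, and it leaves out the pivot selection and block-level skipping that real WAND implementations rely on. `Candidate`, `top_k_pruned`, and the integer scores are hypothetical names and simplifications chosen for illustration. The only assumption is that each query term has a known maximum score contribution.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical candidate: a doc id plus (term index, score contribution)
/// pairs for the query terms it contains. Integer scores keep ordering simple.
struct Candidate {
    doc: u32,
    term_scores: Vec<(usize, u32)>,
}

/// Returns the top-k (score, doc) pairs, best first, and the number of
/// documents that were skipped without being fully scored.
/// `max_scores[t]` is an upper bound on what term `t` can contribute.
fn top_k_pruned(
    cands: &[Candidate],
    max_scores: &[u32],
    k: usize,
) -> (Vec<(u32, u32)>, usize) {
    // Min-heap on (score, doc): the root is the weakest of the current top-k.
    let mut heap: BinaryHeap<Reverse<(u32, u32)>> = BinaryHeap::new();
    let mut skipped = 0;

    for c in cands {
        // Cheap upper bound: sum of per-term maxima for the matching terms.
        let upper: u32 = c.term_scores.iter().map(|&(t, _)| max_scores[t]).sum();

        if heap.len() == k {
            let threshold = heap.peek().map(|r| (r.0).0).unwrap_or(0);
            if upper <= threshold {
                // Cannot beat the weakest top-k entry: skip the real scoring.
                skipped += 1;
                continue;
            }
        }

        // Full scoring, only for documents that survived the bound check.
        let score: u32 = c.term_scores.iter().map(|&(_, s)| s).sum();
        heap.push(Reverse((score, c.doc)));
        if heap.len() > k {
            heap.pop(); // drop the weakest entry to keep exactly k
        }
    }

    let mut top: Vec<(u32, u32)> = heap.into_iter().map(|Reverse(p)| p).collect();
    top.sort_by(|a, b| b.cmp(a));
    (top, skipped)
}
```

The tighter the upper bounds and the faster the threshold rises, the more documents get skipped. A profile of union queries should show whether time goes into scoring documents that a better bound would have excluded.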
@scampi for toplev, see the manual here: https://github.com/andikleen/pmu-tools/wiki/toplev-manual