raft [WIP] Refactor raft-ann-bench: single CLI entrypoint, importable Python module, config validation, and testing

No breaking changes were made to the existing module raft-ann-bench. The proposal is for migrating functionalities of raft-ann-bench to the refactored module.

Key features of the refactoring:

Refactor raft-ann-bench.run and make it easier to write tests against. Currently the main() and run_build_and_search() functions in python/raft-ann-bench/src/raft-ann-bench/run/__main__.py are rather difficult to test, as they include quite a bit of parameter parsing and config data processing logic. The refactor would break up the functions into smaller functions.
Use Pydantic to model the 4 types of config files. This makes config data parsing more robust, as Pydantic performs validation, and it helps remove quite a bit of data parsing code. The 4 types of config files modeled: datasets.yaml as DatasetConfig, algos.yaml as AlgoLibConfig, algo benchmark config (e.g. raft_cagra.yaml) as AlgoConfig, and lastly the benchmark executable's config (e.g. the deprecated deep-image-96-inner.json) as BenchConfig.
Single CLI entrypoint. Use Typer to organize the CLI commands in a single entrypoint. This would be helpful for incrementally migrating other functionalities, such as raft-ann-bench.plot. Furthermore, documentation for the CLI can be auto-generated with typer raft_ann_bench.cli utils docs --name raft-ann-bench --output raft_ann_benchmarks_cli.md.
Decouple the ANN executable's parameters from the Python module. Preserve the naming of the ANN executable's parameters (i.e. cpp/bench/ann/src/common/benchmark.hpp), and design the CLI to merely passthrough parameters to the ANN executable (e.g. raft-ann-bench run build).
Unit tests for the new module.
Migration tests for ensuring that the new CLI and the old CLI produce the same results.
Make the Python module importable. The current module name raft-ann-bench cannot be easily imported in Python; replacing it with raft_ann_bench will work.

Closes #2153 Closes #2151 Closes #2150 Closes #2149

Feb 13 '24 21:02 fy

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Feb 13 '24 21:02 copy-pr-bot[bot]

/ok to test

Feb 14 '24 00:02 cjnolet

Benchmarks are being migrated over to cuVS so I'm closing this PR for now to reduce some noise. Please feel free to reopen if needed.

Sep 06 '24 18:09 cjnolet