[FEA] CAGRA-Q to quantize before graph build
Currently, CAGRA-Q quantizes the dataset after building the graph. This improves search performance, but build performance could also improve if the distance computations in NN Descent used lookup tables, which would require quantizing the dataset before the graph is built. Furthermore, as it stands, if a user picks IVF-PQ as the graph build algorithm together with CAGRA-Q, the dataset appears to be quantized twice. I assume some parts of the product-quantized vectors could be reused.
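To make the lookup-table idea concrete, here is a minimal pure-Python sketch of PQ distance computation via a table. It is not the CAGRA-Q or NN Descent code: the function names, the toy codebooks, and the asymmetric form (one raw vector against PQ codes) are assumptions chosen for illustration.

```python
# Toy sketch of lookup-table (LUT) distances over PQ codes.
# All names and sizes here are hypothetical, not taken from cuVS.

def sq_dist(a, b):
    """Squared L2 distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vec, codebooks):
    """One code per subspace: index of the nearest codeword."""
    sub_dim = len(vec) // len(codebooks)
    codes = []
    for s, book in enumerate(codebooks):
        sub = vec[s * sub_dim:(s + 1) * sub_dim]
        codes.append(min(range(len(book)), key=lambda k: sq_dist(sub, book[k])))
    return codes

def build_lut(query, codebooks):
    """lut[s][k] = squared distance from query subvector s to codeword k."""
    sub_dim = len(query) // len(codebooks)
    return [
        [sq_dist(query[s * sub_dim:(s + 1) * sub_dim], cw) for cw in book]
        for s, book in enumerate(codebooks)
    ]

def lut_distance(lut, codes):
    """Approximate distance: one table lookup per subspace, then a sum."""
    return sum(lut[s][c] for s, c in enumerate(codes))

# Two subspaces of dimension 2, two codewords each (assumed, pre-trained).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # subspace 0
    [[0.0, 1.0], [1.0, 0.0]],  # subspace 1
]
dataset = [[0.1, 0.0, 0.9, 0.1], [0.9, 1.0, 0.0, 1.1]]
codes = [encode(v, codebooks) for v in dataset]

query = [1.0, 1.0, 0.0, 1.0]
lut = build_lut(query, codebooks)
approx = [lut_distance(lut, c) for c in codes]
print(codes, approx)
```

Once the table is built, each distance costs one lookup per subspace instead of a full-dimension computation, which is where the potential build-time saving would come from.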
Indeed, when we use the IVF-PQ build method for the KNN graph, we run one PQ quantization for graph building and another one to compress the data for CAGRA search. There are a few reasons why we do that:
- Graph building has stronger compression. It has separate codebooks for each subspace or cluster.
- CAGRA-Q uses only a single codebook so that it fits in shared memory.
- The memory layout of the compressed dataset is different.
Because of this, I think it is justified to have separate product quantization steps for the build algorithm and the CAGRA-Q encoding. Before product quantization, we run vector quantization: we cluster the dataset and assign each vector to a cluster center (only the difference between the cluster center and the database vector is product quantized). The vector quantization step could use the same parameters in both cases, so we could reuse the clustering in these two stages.
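A rough sketch of that reuse, in plain Python: the coarse clustering and residuals are computed once, then encoded twice with different codebook layouts. This is an illustration under stated assumptions, not the cuVS implementation; the function names, the cluster centers, and all codebook values are made up, and codebook training is skipped.

```python
# Sketch: share the vector quantization (clustering) step between two PQ stages.
# Every name and value below is hypothetical.

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_clusters(dataset, centers):
    """Vector quantization: label each vector with its nearest center."""
    return [min(range(len(centers)), key=lambda c: sq_dist(v, centers[c]))
            for v in dataset]

def residuals(dataset, centers, labels):
    """Difference between each vector and its cluster center."""
    return [[x - c for x, c in zip(v, centers[l])]
            for v, l in zip(dataset, labels)]

def pq_encode(res, codebooks):
    """Encode each residual subvector with that subspace's codebook."""
    sub_dim = len(res[0]) // len(codebooks)
    out = []
    for r in res:
        codes = []
        for s, book in enumerate(codebooks):
            sub = r[s * sub_dim:(s + 1) * sub_dim]
            codes.append(min(range(len(book)),
                             key=lambda k: sq_dist(sub, book[k])))
        out.append(codes)
    return out

centers = [[0.0, 0.0, 0.0, 0.0], [10.0, 10.0, 10.0, 10.0]]  # assumed trained
dataset = [[1.0, 0.0, 0.0, 1.0], [10.0, 11.0, 9.0, 10.0]]

# Shared step: done once, reused by both stages.
labels = assign_clusters(dataset, centers)
res = residuals(dataset, centers, labels)

# Stage 1 (graph build): a separate codebook per subspace.
build_books = [
    [[1.0, 0.0], [0.0, 1.0]],   # subspace 0
    [[0.0, 1.0], [-1.0, 0.0]],  # subspace 1
]
build_codes = pq_encode(res, build_books)

# Stage 2 (search compression): one codebook shared by all subspaces.
search_book = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
search_codes = pq_encode(res, [search_book, search_book])
print(labels, build_codes, search_codes)
```

Only `labels` and `res` are shared here; the two sets of codes differ because the codebook layouts differ, which matches the reasons listed above for keeping the PQ steps separate.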