Tejas Shah
To make this easy to reproduce, the Cohere 1M dataset was used to benchmark the results in the table below.

| number of index segments | force merge time (minutes) | force merge segments -- |...
@jmazanec15 The first row is 2.17, but run on the main branch, since the code path is the same. To mimic the 2.16 code path, [this change](https://github.com/opensearch-project/k-NN/commit/589969bf95c8ebe1c78dca977e600a4bc2122ab3#diff-9dad8c6bb50a281c98d7cee29de3491e73a561f530d5e0a029d7deb860cf1698R146) was made while running the bench...
> @shatejas but isnt 2.17 time same as 2.16 - so can we not repro it with the setup?

Not exactly the same; the number of segments being merged is ~15%...
Thanks for the deep dive @navneet1v

> How does the change of IOContext to RANDOM impacts Lucene Engine?

Based on the code deep dive, IOContext is used to advise the...
> What solution we should move towards now for fixing this regression?

In an attempt to fix the regression, a few solutions were explored.

## 1. Preloading .vec and .vex files...
The Lucene changes are complete with https://github.com/apache/lucene/pull/13985. Lucene 10.1 currently includes these changes. To integrate the Lucene changes into the k-NN plugin, OpenSearch core needs to be updated to use...
https://github.com/opensearch-project/k-NN/pull/2132
https://github.com/opensearch-project/k-NN/pull/2140

- Is this related to a new feature introduced in 2.17? Both are performance related.

2132 - Customers will not get concurrent segment search if the setting is auto...
@neuenfeldttj Thanks for the RFC!

> 2) Multi-profiler design

The multi-profiler design seems to give more flexibility at the cost of more heavy lifting. One of the things to consider...
CC: @jimczi, @benwtrent. This is similar to [#12551](https://github.com/apache/lucene/pull/12551), except that efSearch is static here.
For HNSW, efSearch is a core parameter at search time. This is convenient for users, who then do not need their own logic to strip off the top k values on their...
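
As a rough illustration of the client-side trimming that a separate efSearch parameter would spare users, here is a minimal Python sketch. It assumes hits arrive as `(doc_id, score)` pairs where a higher score is better. The function name `trim_to_top_k` and the sample data are hypothetical and are not part of Lucene or the k-NN plugin.

```python
import heapq
from typing import List, Tuple

Hit = Tuple[str, float]  # (doc_id, score); higher score is assumed to be better


def trim_to_top_k(candidates: List[Hit], k: int) -> List[Hit]:
    """Keep the k highest-scoring hits from a larger candidate list.

    This is the kind of logic a caller would need if the only way to widen
    the search were to request more results than actually wanted.
    """
    if k <= 0:
        return []
    # nlargest returns the hits sorted from best to worst score.
    return heapq.nlargest(k, candidates, key=lambda hit: hit[1])


# Hypothetical results of a search that explored 5 candidates.
ef_search_hits = [("a", 0.91), ("b", 0.42), ("c", 0.77), ("d", 0.88), ("e", 0.10)]
top_3 = trim_to_top_k(ef_search_hits, k=3)
```

With efSearch exposed as its own search-time parameter, the caller can request k results directly and skip this step.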