Jay Deng
Without looking too deeply into the logic of the 2 PRs (for purposes of trying to keep the conversation generic), why can't the "overlapping" diff part of the 2 PRs...
> That's a fair point, and it's one of the reasons we broke the 57-file [large PR](https://github.com/opensearch-project/OpenSearch/pull/13315) into individual APIs. But at the same time, even small PRs can take...
Another major change to explore: https://github.com/apache/lucene/pull/13337
Thanks @andrross! The user flow on how to set up and configure the specific repository as well as which specific settings they can use to do so will be finalized...
@andrross I've published low level design details in https://github.com/opensearch-project/k-NN/issues/2465, feel free to take a look if you're curious!
> This helps as different remote store implementations partition based on these prefix patterns. Ref: https://github.com/opensearch-project/OpenSearch/issues/15146. I think the traffic pattern here will be similar where it will upload multiple parallel...
Thanks @navneet1v, how about something like `index.knn.remote_build.size_threshold` instead? This seems in line with the existing (and functionally similar) `index.knn.advanced.approximate_threshold`. As for the download part, I've mentioned a few potential improvements in...
The first step here is to build an initial POC to validate the following: 1. Validate that we can wire up the native index writer with the repository service 2....
With an updated POC that does not load vectors on `InputStream.skip`, I am seeing just 7019 ms to upload the 2.9 GB for the 1M-vector dataset. I will publish full details and...
@navneet1v It's much better: the "bad" version took 37789 ms on the same dataset. 7k ms for 2.9 GB is approximately 400 MB/s for upload (which includes all of the...
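For anyone who wants to check the arithmetic behind those upload numbers, here is a minimal sketch. It assumes both runs uploaded the same 2.9 GB and uses decimal units (1 GB = 1000 MB); the function name `throughput_mb_per_s` is made up for illustration and is not part of the POC.

```python
def throughput_mb_per_s(size_gb: float, elapsed_ms: float, mb_per_gb: int = 1000) -> float:
    """Return average throughput in MB/s for size_gb transferred in elapsed_ms.

    Assumption: decimal units (1 GB = 1000 MB) unless mb_per_gb is overridden.
    """
    elapsed_s = elapsed_ms / 1000.0
    return size_gb * mb_per_gb / elapsed_s


# Timings quoted in the comments above; the 2.9 GB size is assumed for both runs.
UPLOAD_SIZE_GB = 2.9
FAST_MS = 7019    # updated POC that does not load vectors on skip
SLOW_MS = 37789   # earlier "bad" version

fast = throughput_mb_per_s(UPLOAD_SIZE_GB, FAST_MS)
slow = throughput_mb_per_s(UPLOAD_SIZE_GB, SLOW_MS)
speedup = SLOW_MS / FAST_MS

print(f"updated POC: {fast:.0f} MB/s")
print(f"earlier POC: {slow:.0f} MB/s")
print(f"speedup:     {speedup:.1f}x")
```

Under those assumptions the updated POC lands a little above 400 MB/s, which matches the rough figure quoted above, and the earlier version is roughly five times slower.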