
[Feature]: Support HNSW SQ

Open xiaofan-luan opened this issue 1 year ago • 22 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Is your feature request related to a problem? Please describe.

SQ8 and PQ are widely used in ANN search. If you want to understand more about quantization, Faiss is probably one of the best code bases to explore.

HNSW is the fastest index in the open source world, so why not make it work together with SQ and PQ to accelerate it further?

Let me know if anyone is interested, and we can offer more help on it.

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

xiaofan-luan avatar Apr 04 '23 21:04 xiaofan-luan

Hi @xiaofan-luan, I am interested in contributing and would like to know how to help! Let me know how to get started.

noble-8 avatar Apr 08 '23 19:04 noble-8

Cool, man! I think @liliu-z could offer you some help.

xiaofan-luan avatar Apr 08 '23 23:04 xiaofan-luan

Do you have any experience with C++, and any idea about the HNSW algorithm yet?

xiaofan-luan avatar Apr 08 '23 23:04 xiaofan-luan

Hi, I took a C++ course in college, and I have primarily worked as a Java developer for about 3 years, so I feel that I can onboard quickly and that I am comfortable working in C++. I am really new to this algorithm but am getting up to speed. I am going through this documentation: https://www.pinecone.io/learn/hnsw/ Feel free to point me to other resources. I hope this is not a dealbreaker!

noble-8 avatar Apr 08 '23 23:04 noble-8

  1. You can start by reading the HNSW code in Knowhere: https://github.com/milvus-io/knowhere/tree/c05c8767f43eaa855c13654804d0bea9cc42c7de/src/index/hnsw
  2. Once you understand HNSW, the next step is to understand how PQ and SQ work. The Faiss documentation will give you the general idea, and Milvus already includes all the Faiss code you can reuse: https://github.com/facebookresearch/faiss/wiki
  3. Add index parameters so that Milvus supports PQ and SQ, which will be a trivial task.
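
As a starting point for step 2, scalar quantization (SQ8) can be sketched in a few lines. This is a minimal illustration, assuming per-dimension min/max training and uniform 8-bit codes; the function names `train_sq8`, `encode_sq8`, and `decode_sq8` are hypothetical and are not the Faiss or Knowhere API.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def train_sq8(vectors: Sequence[Vector]) -> Tuple[List[float], List[float]]:
    """Learn the per-dimension minimum and value range from training vectors."""
    dim = len(vectors[0])
    vmin = [min(v[d] for v in vectors) for d in range(dim)]
    vmax = [max(v[d] for v in vectors) for d in range(dim)]
    # Fall back to a range of 1.0 for constant dimensions to avoid dividing by zero.
    vdiff = [(hi - lo) or 1.0 for lo, hi in zip(vmin, vmax)]
    return vmin, vdiff


def encode_sq8(v: Vector, vmin: Sequence[float], vdiff: Sequence[float]) -> bytes:
    """Map each float to one byte (0..255), clamping values outside the trained range."""
    codes = []
    for x, lo, diff in zip(v, vmin, vdiff):
        t = min(max((x - lo) / diff, 0.0), 1.0)
        codes.append(int(round(t * 255)))
    return bytes(codes)


def decode_sq8(code: bytes, vmin: Sequence[float], vdiff: Sequence[float]) -> List[float]:
    """Reconstruct an approximate float vector from its 8-bit codes."""
    return [lo + (c / 255.0) * diff for c, lo, diff in zip(code, vmin, vdiff)]
```

Each dimension is stored in one byte instead of a 4-byte float, at the cost of a reconstruction error of at most one quantization step per dimension.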

xiaofan-luan avatar Apr 09 '23 17:04 xiaofan-luan

Hi @noble-8, this is on our roadmap, so please feel free to make a PR for https://github.com/milvus-io/knowhere. I suggest we start with SQ8, which is easier to implement. You are also more than welcome to open another issue in Knowhere for more detailed discussion.

liliu-z avatar Apr 12 '23 02:04 liliu-z

/assign @liliu-z

liliu-z avatar Apr 12 '23 02:04 liliu-z

Sounds good. Will do!

noble-8 avatar Apr 12 '23 02:04 noble-8

@noble-8 any progress on it?

xiaofan-luan avatar Oct 03 '23 04:10 xiaofan-luan

I could not make any progress. I shall try again; however, feel free to reassign this if I do not make a commit.

noble-8 avatar Oct 05 '23 02:10 noble-8

> I could not make any progress. I shall try again; however, feel free to reassign this if I do not make a commit.

Sure, and thanks for the interest anyway! I would also like to help if you are interested.

xiaofan-luan avatar Oct 06 '23 02:10 xiaofan-luan

I just saw that https://github.com/milvus-io/knowhere is now archived. Is this issue still open?

Or should the PR be addressed to https://github.com/zilliztech/Knowhere from now on?

To summarize, to make sure that I understand this correctly:

  • (1) Knowhere already implements HNSW, which runs on top of high-dimensional vectors.
  • (2) The PR needs to implement SQ8 using Faiss to quantize high-dimensional vectors into compressed forms.
  • (3) Somehow, make HNSW able to work with the quantized/compressed vectors from (2).

Am I correct?
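
Point (3) mostly comes down to the distance function used during graph traversal: the query stays in float form, while stored vectors are decoded from their codes on the fly. The sketch below shows that idea under simplifying assumptions: it uses a single-layer greedy walk rather than the full multi-layer HNSW search, and all names (`sq8_l2_sqr`, `greedy_search`, the `graph` and `codes` dictionaries) are hypothetical.

```python
from typing import Dict, List, Sequence


def sq8_l2_sqr(query: Sequence[float], code: bytes,
               vmin: Sequence[float], vdiff: Sequence[float]) -> float:
    """Squared L2 distance between a float query and an SQ8-encoded vector."""
    total = 0.0
    for q, c, lo, diff in zip(query, code, vmin, vdiff):
        x = lo + (c / 255.0) * diff  # decode one dimension on the fly
        total += (q - x) ** 2
    return total


def greedy_search(query: Sequence[float], entry: int,
                  graph: Dict[int, List[int]], codes: Dict[int, bytes],
                  vmin: Sequence[float], vdiff: Sequence[float]) -> int:
    """Move to the closest neighbour until no neighbour is closer than the current node."""
    current = entry
    best = sq8_l2_sqr(query, codes[current], vmin, vdiff)
    while True:
        candidates = [(sq8_l2_sqr(query, codes[nb], vmin, vdiff), nb)
                      for nb in graph[current]]
        if not candidates:
            return current
        dist, nb = min(candidates)
        if dist >= best:
            return current
        current, best = nb, dist
```

The graph structure itself is unchanged; only the vector storage and the distance computation differ from the float version.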

LaPetiteSouris avatar Oct 06 '23 11:10 LaPetiteSouris

> https://github.com/milvus-io/knowhere

It has been archived and moved to https://github.com/zilliztech/knowhere. Sorry for the confusion.

You are correct, my man. We want to add quantization support for the HNSW index and integrate it with Milvus.

xiaofan-luan avatar Oct 06 '23 12:10 xiaofan-luan

Thanks. How urgently do you folks need this? My C++ is rusty 😭 so it may take a while (I have Copilot, so that helps 😭).

But I love this challenge.

LaPetiteSouris avatar Oct 06 '23 13:10 LaPetiteSouris

@xiaofan-luan ~~if you folks have patience to spare, then assign this to me~~

Edit: I tried to hack around and it seems that it's a bit too much for me to take this time. I'll pick another good first issue to ramp up.

LaPetiteSouris avatar Oct 06 '23 19:10 LaPetiteSouris

@xiaofan-luan I think the issue is not easy for beginners, it needs lots of knowledge 😅

jiaoew1991 avatar Oct 12 '23 20:10 jiaoew1991

> @xiaofan-luan I think the issue is not easy for beginners, it needs lots of knowledge 😅

Agreed, you might be correct.

SQ alone might be OK, though?

xiaofan-luan avatar Oct 13 '23 00:10 xiaofan-luan

But true, it does require a full understanding of Milvus.

xiaofan-luan avatar Oct 13 '23 00:10 xiaofan-luan

Removing the good first issue label.

xiaofan-luan avatar Oct 13 '23 00:10 xiaofan-luan

I found that HNSW-SQ8 is already available on Zilliz Cloud. Is that true?

zaobao avatar Nov 03 '23 06:11 zaobao

Zilliz Cloud doesn't use HNSW. We have an internal index named Cardinal.

xiaofan-luan avatar Nov 03 '23 06:11 xiaofan-luan

Can Milvus use an HNSW PQ index now? @xiaofan-luan

Monster880 avatar May 16 '24 06:05 Monster880

/assign @liliu-z

xiaofan-luan avatar May 18 '24 13:05 xiaofan-luan

@liliu-z do we have a plan to support HNSW PQ and SQ indexes?

xiaofan-luan avatar May 18 '24 13:05 xiaofan-luan

@xiaofan-luan Can you provide some guidance on where modifications are needed to support an HNSW PQ index?

Monster880 avatar May 30 '24 03:05 Monster880

NP, I think @liliu-z can help with that.

xiaofan-luan avatar May 30 '24 06:05 xiaofan-luan

@liliu-z @xiaofan-luan emmm.... where is liliu-z

Monster880 avatar May 30 '24 07:05 Monster880

> @liliu-z @xiaofan-luan emmm.... where is liliu-z

Sure, there are two ways to support HNSW + quantization:

  1. Make it a new index type for Milvus.
  2. Treat it as HNSW with a special config.

We are adopting the first way, so the work includes:

  1. Add quantization support on the algorithm side and expose it as a new index type. The code should be in Knowhere.
  2. Let Milvus know about this new index type.

Here is an example PR for the first step. It supports SQ8 for HNSW on the Knowhere side.
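
To illustrate the first approach, here is a toy registry in which each quantized variant gets its own index type name. This is only a sketch of the design choice, not Knowhere's or Milvus's actual registration code; `IndexSpec`, `register`, `create_index`, and the name `HNSW_SQ8` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class IndexSpec:
    graph: str
    quantizer: Optional[str]  # None means raw float vectors
    params: Dict[str, int] = field(default_factory=dict)


# Maps an index type name to a factory; a new index type is one more entry.
INDEX_REGISTRY: Dict[str, Callable[[Dict[str, int]], IndexSpec]] = {}


def register(name: str):
    """Decorator that records a factory under the given index type name."""
    def wrap(factory):
        INDEX_REGISTRY[name] = factory
        return factory
    return wrap


@register("HNSW")
def _hnsw(params: Dict[str, int]) -> IndexSpec:
    return IndexSpec("hnsw", None, dict(params))


@register("HNSW_SQ8")  # hypothetical name for the quantized variant
def _hnsw_sq8(params: Dict[str, int]) -> IndexSpec:
    return IndexSpec("hnsw", "sq8", dict(params))


def create_index(index_type: str, params: Dict[str, int]) -> IndexSpec:
    """Look up the factory by name, as a caller of the library would."""
    try:
        factory = INDEX_REGISTRY[index_type]
    except KeyError:
        raise ValueError(f"unknown index type: {index_type}") from None
    return factory(params)
```

With this layout, the calling side only needs to learn the new name, while the second approach would instead add a quantizer option to the existing `HNSW` entry.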

liliu-z avatar May 30 '24 07:05 liliu-z

@liliu-z it seems HNSW_PQ does not use Faiss, but uses hnswlib after quantization...

Monster880 avatar May 30 '24 08:05 Monster880

We now prefer hnswlib over Faiss for HNSW, so we need to backport the PQ and SQ features.

xiaofan-luan avatar May 30 '24 09:05 xiaofan-luan