
Add Marqo vector search in vector DBs

Open sky-2002 opened this issue 1 year ago • 9 comments

Author: Aakash Thatte

Description: Adds Marqo tensor search as an option in vector DBs (alongside Chroma, FAISS, and Weaviate, which already exist). Note: To test this new integration, I have relied on the TastiTest class and its test method.

Open to any suggestions and looking forward to reviews.

sky-2002 avatar Dec 11 '23 14:12 sky-2002

Have you tested the performance of Marqo? Here are the results for other vector databases: https://docs.google.com/spreadsheets/d/13ZJCBb2HEdWAjbnXgy55X3hSZHpFKIlsKL9D2jdHvWU/edit?usp=sharing

ttt-77 avatar Dec 11 '23 16:12 ttt-77

Okay, performance testing will take me some time, but I would love to do it. I will experiment with the values given in the sheet; thanks for sharing it. Note that I can only test on my Intel i5.

sky-2002 avatar Dec 11 '23 19:12 sky-2002

For the purposes of merging, I don't think we need to do performance tests, just correctness tests. It would still be great to have the numbers so users can decide what to use!

ddkang avatar Dec 11 '23 20:12 ddkang

Okay, how do I perform correctness tests? Can you give me any pointers on that?

sky-2002 avatar Dec 12 '23 04:12 sky-2002

https://github.com/ddkang/aidb/blob/3e4f43a34cbe3143b53d96358a8880c35344b136/tests/tasti_test/tasti_test.py#L137-L152

You need to add a test here and vary the parameters the way we did in the Google Sheet. You don't need to run every parameter value if the running time is proportional to the parameter.

ttt-77 avatar Dec 12 '23 04:12 ttt-77

Okay, I will add it, but I don't see any code that tracks time. Did you do it manually (for example, noting the time down after every run)? Also, it would be great if you could explain that sheet to me; I am a little confused looking at it. I am in the aidb Slack, so you can reach me anytime.

sky-2002 avatar Dec 12 '23 09:12 sky-2002

You can use time.time()
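As a minimal sketch of that suggestion, the snippet below wraps a call with `time.time()` and reports the elapsed wall-clock seconds. The helper `time_call` and the stand-in workload `dummy_search` are hypothetical names used only for illustration; they are not part of aidb or Marqo, and a real measurement would call the vector DB query in place of `dummy_search`.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds) using time.time()."""
    start = time.time()
    result = fn(*args, **kwargs)
    elapsed = time.time() - start
    return result, elapsed


def dummy_search(n):
    # Hypothetical stand-in for a vector DB query; replace with the real call.
    return sum(i * i for i in range(n))


result, elapsed = time_call(dummy_search, 1000)
print(f"dummy_search took {elapsed:.6f} s")
```

For short runs, `time.perf_counter()` is usually a better fit than `time.time()` because it is monotonic and has higher resolution; averaging over several runs also helps smooth out noise.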

ttt-77 avatar Dec 12 '23 09:12 ttt-77

How are things going? Have the results from Marqo been validated for accuracy by comparing them with those obtained from FAISS? @sky-2002
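One possible way to do such a comparison, sketched under assumptions: treat the FAISS top-k neighbor IDs as the reference and measure how many of them the other backend also returns (recall@k). The function names and the ID lists below are made up for illustration; this is not the project's actual test, and it assumes both backends return neighbor IDs in a comparable ID space.

```python
def recall_at_k(reference_ids, candidate_ids, k):
    """Fraction of the reference top-k IDs that also appear in the candidate top-k."""
    ref = set(reference_ids[:k])
    cand = set(candidate_ids[:k])
    if not ref:
        return 1.0  # nothing to recover, so treat as a perfect match
    return len(ref & cand) / len(ref)


def mean_recall_at_k(reference, candidate, k):
    """Average recall@k over a batch of queries (one ID list per query)."""
    if len(reference) != len(candidate):
        raise ValueError("both backends must answer the same number of queries")
    scores = [recall_at_k(r, c, k) for r, c in zip(reference, candidate)]
    return sum(scores) / len(scores)


# Made-up neighbor IDs for two queries; not real FAISS or Marqo output.
faiss_results = [[1, 2, 3], [4, 5, 6]]
marqo_results = [[1, 3, 2], [4, 5, 9]]

print(mean_recall_at_k(faiss_results, marqo_results, k=3))
```

Comparing as sets ignores ordering within the top-k, which is often reasonable because approximate indexes can return the same neighbors in a slightly different order.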

ttt-77 avatar Jan 10 '24 06:01 ttt-77

No, I have been a little busy. I will get back to this in a few days.

sky-2002 avatar Jan 10 '24 14:01 sky-2002