🐛 BUG: Vectorize index returning incorrect Cosine Similarity scores outside expected range

Open thipperz opened this issue 1 year ago • 1 comments

Which Cloudflare product(s) does this pertain to?

Workers Runtime

What version(s) of the tool(s) are you using?

3.72.0

What version of Node are you using?

v20.11.0

What operating system and version are you using?

WSL

Describe the Bug

Observed behavior

After creating a cosine similarity index with 1536 dimensions, the initial queries return expected values. However, after inserting additional vectors (20,000 in my case), the queries start to return scores outside the expected range, with some values as high as 28.360535. This is breaking search functionality in my app.

Expected behavior

I expected scores to fall within the cosine similarity range of -1 to 1.
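For readers who want to check scores on the client side, here is a minimal sketch of why the range is -1 to 1 and how an out-of-range score could be flagged. Both `cosineSimilarity` and `isValidCosineScore` are hypothetical helper names, not part of the Vectorize API, and the sketch makes no claim about what caused the bug.

```typescript
// Hypothetical helpers for illustration; not part of the Vectorize API.

// Cosine similarity is the dot product divided by the product of the
// vector magnitudes, so it always lies in [-1, 1] for non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i]!;
    const y = b[i]!;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Flags scores outside [-1, 1], with a small tolerance (an assumed value)
// for floating-point rounding.
function isValidCosineScore(score: number, epsilon: number = 1e-6): boolean {
  return Number.isFinite(score) && score >= -1 - epsilon && score <= 1 + epsilon;
}
```

With these helpers, the score of 28.360535 reported above fails the check: `isValidCosineScore(28.360535)` returns `false`.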

Please provide a link to a minimal reproduction

No response

Please provide any relevant error logs

No response

thipperz avatar Aug 21 '24 21:08 thipperz

Hi! I managed to reproduce this; it is indeed a bug. We are going to fix it. Thank you!

netgusto avatar Aug 23 '24 16:08 netgusto

A fix for this has been rolled out.

netgusto avatar Sep 10 '24 15:09 netgusto

Thank you!

thipperz avatar Sep 10 '24 20:09 thipperz