
When using the new text-embedding-3 models the scores are a lot lower

Open ReneReiterer opened this issue 1 year ago • 3 comments

Hey, when I use the new text-embedding-3 models both to create the embeddings and to query, I get much lower scores for the same query.

With ada-2, a query could get a result score of 0.8; with text-embedding-3 the score drops below 0.5, yet roughly the same content is returned. Is there a reason for this?

ReneReiterer avatar Jun 25 '24 12:06 ReneReiterer

That's a function of the embedding model and nothing I have control over. It implies that the new models are generating a more diverse range of embeddings... Can you share some examples (query + the text it's being compared to)?
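To make the point above concrete, here is a minimal sketch of the score computation, assuming the reported score is a plain cosine similarity between the query embedding and the item embedding. The function name and the toy 2-D vectors are hypothetical and only for illustration; they are not taken from vectra's source.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// The result depends only on the angle between the vectors, so a model
// that spreads its embeddings over wider angles yields lower scores
// for the same pair of texts.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (hypothetical): the same "query" against a nearby and a
// more distant direction.
const query = [1, 0];
console.log(cosineSimilarity(query, [0.9, 0.1])); // small angle, score near 1
console.log(cosineSimilarity(query, [0.5, 0.8])); // wider angle, lower score
```

Nothing in this computation is tied to a particular model, which is why the absolute numbers change when the embedding model changes.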

Stevenic avatar Jun 30 '24 23:06 Stevenic

Here is an example based on the sample from the vectra README:

with "text-embedding-ada-002":

Querying green...
[0.9027890493383421] blue
[0.8750171543194056] red
[0.8316836924030466] apple

Querying banana...
[0.9025824326098169] apple
[0.8489727589250824] oranges
[0.840552337334082] blue

with "text-embedding-3-small":

Querying green...
[0.5587630540517711] blue
[0.4586459570036867] red
[0.3330212746409029] oranges

Querying banana...
[0.463723740085403] apple
[0.36792568686955635] oranges
[0.3011467689281706] blue

with "text-embedding-3-large":

Querying green...
[0.5854194924173858] red
[0.5425629350657741] blue
[0.3589804053636035] oranges

Querying banana...
[0.4618476040380141] apple
[0.39727599664880175] oranges
[0.37006686089236474] blue
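One way to read these numbers is to compare the ranking and the gap between hits rather than the absolute scores. The sketch below does that for the "banana" query, using the scores from the outputs above rounded to four decimals. The `Hit` type and the helpers `ranking` and `spread` are hypothetical names introduced for this illustration, not part of vectra.

```typescript
type Hit = { score: number; text: string };

// Top-3 results for "banana", copied from the outputs above (rounded).
const banana: Record<string, Hit[]> = {
  "text-embedding-ada-002": [
    { score: 0.9026, text: "apple" },
    { score: 0.849, text: "oranges" },
    { score: 0.8406, text: "blue" },
  ],
  "text-embedding-3-small": [
    { score: 0.4637, text: "apple" },
    { score: 0.3679, text: "oranges" },
    { score: 0.3011, text: "blue" },
  ],
  "text-embedding-3-large": [
    { score: 0.4618, text: "apple" },
    { score: 0.3973, text: "oranges" },
    { score: 0.3701, text: "blue" },
  ],
};

// Texts in descending score order: the part that is comparable across models.
function ranking(hits: Hit[]): string[] {
  return [...hits].sort((x, y) => y.score - x.score).map((h) => h.text);
}

// Gap between the best and the worst hit: how strongly a model separates results.
function spread(hits: Hit[]): number {
  const scores = hits.map((h) => h.score);
  return Math.max(...scores) - Math.min(...scores);
}

for (const [model, hits] of Object.entries(banana)) {
  console.log(`${model}: ${ranking(hits).join(" > ")} (spread ${spread(hits).toFixed(4)})`);
}
```

For this query all three models produce the same order, while the text-embedding-3 models leave a wider gap between the first and third hit than ada-002 does. Any fixed score cutoff (for example 0.8) would therefore likely need to be re-tuned per model.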

ReneReiterer avatar Jul 01 '24 06:07 ReneReiterer

It's just the nature of these embedding models... they're probabilistic

Stevenic avatar May 07 '25 23:05 Stevenic