openai-cookbook

How come I get two different results when calling the embeddings API in parallel using the `examples/api_request_parallel_processor.py` script?

Open luofuli opened this issue 2 years ago • 1 comments

I've noticed that running the `examples/api_request_parallel_processor.py` script twice with the same input file produces non-identical results. Roughly half of the text embeddings differ slightly in their decimal places, with per-dimension differences ranging from 1e-8 to 1e-4. What could be causing these discrepancies? Could it be a bug in the script, the parallel request sending, or inherent randomness in the embedding model, such as dropout?

PS: The embedding model is text-embedding-ada-002.

Mar 15 '23 10:03 luofuli

Additionally, I would like to confirm the following questions: For a specified model (e.g., text-embedding-ada-002):

Does the embedding for the same text retrieved through the API change over time?
Does the meaning of each dimension in the embedding change between retrievals at different times?

Mar 15 '23 11:03 luofuli

Inherent randomness sounds like the most likely culprit. Those sound like small differences.

Nothing is intended to change over time.

Mar 17 '23 00:03 ted-at-openai
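
To check whether differences of this size matter in practice, one option is to compare the two runs directly. The following is a minimal sketch, not part of the cookbook script: the function names (`max_abs_diff`, `cosine_similarity`, `compare_runs`), the `atol` tolerance, and the toy vectors are all illustrative assumptions. It matches embeddings by input text rather than by position, on the assumption that parallel requests may be saved in a different order on each run.

```python
import math


def max_abs_diff(a, b):
    """Largest per-dimension absolute difference between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compare_runs(run_a, run_b, atol=1e-3):
    """Compare two {text: embedding} dicts on the texts they share.

    atol is an arbitrary tolerance chosen for illustration only.
    """
    report = {}
    for text in run_a.keys() & run_b.keys():
        a, b = run_a[text], run_b[text]
        diff = max_abs_diff(a, b)
        report[text] = {
            "max_abs_diff": diff,
            "cosine": cosine_similarity(a, b),
            "within_tol": diff <= atol,
        }
    return report


# Toy vectors standing in for real embeddings (made-up values).
run_1 = {"hello": [0.1, 0.2, 0.3], "world": [0.5, -0.5, 0.1]}
run_2 = {"world": [0.5, -0.5, 0.1], "hello": [0.1001, 0.2, 0.29999]}

report = compare_runs(run_1, run_2)
for text, stats in sorted(report.items()):
    print(text, stats)
```

If the cosine similarity between the two runs stays very close to 1 for every text, the differences are unlikely to affect similarity search or clustering built on these embeddings.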