langchain
Azure OpenAI Embedding langchain.embeddings.openai.embed_with_retry won't provide any embeddings after retries.
I have the following code:
docsearch = Chroma.from_documents(texts, embeddings, persist_directory=persist_directory)
and get the following error:
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Requests to the Embeddings_Create Operation under Azure OpenAI API version 2022-12-01 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 3 seconds. Please contact Azure support service if you would like to further increase the default rate limit.
My texts list has fewer than 100 items, and as far as I know Azure has a 400 requests/min limit, so I should not be hitting any rate limit at all. Can someone explain what is happening that results in this error?
After these retries by LangChain, it looks like the embeddings are lost and never stored in the Chroma DB. Could someone please give me a hint about what I'm doing wrong?
using langchain==0.0.125
Many thanks
+1
Any suggestion would be much appreciated!
I set max_retries = 10. I am still getting "Retrying langchain.embeddings.openai.embed_with_retry" messages, but I was able to complete the index creation.
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", chunk_size=1, max_retries=10)
Any solution to fix this issue? +1
As far as I know, the Azure OpenAI embedding service is different from OpenAI's official embedding API. It doesn't let us use Chroma.from_documents the same way; instead we need to go through the Azure OpenAI embedding API to do it.
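For what it's worth, later comments in this thread suggest Chroma.from_documents does work with Azure once the embeddings object is configured for an Azure deployment (with chunk_size=1 on the older API versions). A hedged sketch; the deployment name, endpoint, and key below are placeholders you must replace with your own:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# All resource-specific values here are placeholders.
embeddings = OpenAIEmbeddings(
    deployment="your-embedding-deployment",   # Azure deployment name, not the model name
    model="text-embedding-ada-002",
    openai_api_type="azure",
    openai_api_base="https://your-resource.openai.azure.com/",
    openai_api_version="2023-05-15",
    openai_api_key="your-api-key",
    chunk_size=1,  # older Azure API versions accept only one input per request
)

docsearch = Chroma.from_documents(texts, embeddings, persist_directory=persist_directory)
```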
I tried something like this:
embeddings = OpenAIEmbeddings()
vector_store = FAISS.from_texts(texts=["example1", "example2"], embedding=embeddings)
and
vector_store = Chroma.from_texts(texts=["example1", "example2"], embedding=embeddings)
got:
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..
I'm passing a list with just 2 items, and it still gives me a RateLimitError.
I tried two versions of LangChain, 0.0.162 and 0.0.188, and both fail with the same error.
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised APIError: The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 25115737c4fe3e6d4deef4961066ba2e in your email.) {
"error": {
"message": "The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 25115737c4fe3e6d4deef4961066ba2e in your email.)",
"type": "server_error",
"param": null,
"code": null
}
}
500 {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID 25115737c4fe3e6d4deef4961066ba2e in your email.)', 'type': 'server_error', 'param': None, 'code': None}} {'Date': 'Fri, 16 Jun 2023 01:43:24 GMT', 'Content-Type': 'application/json', 'Content-Length': '366', 'Connection': 'keep-alive', 'access-control-allow-origin': '*', 'openai-organization': 'provectus-algae-pem6gx', 'openai-processing-ms': '5602', 'openai-version': '2020-10-01', 'strict-transport-security': 'max-age=15724800; includeSubDomains', 'x-ratelimit-limit-requests': '3000', 'x-ratelimit-remaining-requests': '2999', 'x-ratelimit-reset-requests': '20ms', 'x-request-id': '25115737c4fe3e6d4deef4961066ba2e', 'CF-Cache-Status': 'DYNAMIC', 'Server': 'cloudflare', 'CF-RAY': '7d7f5c6ebacfa83e-SYD', 'alt-svc': 'h3=":443"; ma=86400'}.
Killing me, I've sent through a single request (on a paid plan) and am being rate limited on embeddings.
After a bit of digging I suspect two causes:
- If you were using free credits and they ran out and you moved to a pay-as-you-go plan with OpenAI, you may need to create a new API key.
- You are hitting the requests-per-minute rate limit. I found a notebook from OpenAI explaining ways to work around it. I haven't tested it yet but will report back if I make any headway: How_to_handle_rate_limits.ipynb
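The core idea in that notebook is exponential backoff with jitter around the embedding call. A minimal sketch; embed_fn here is a placeholder for whatever function actually hits the API:

```python
import random
import time

def embed_with_backoff(embed_fn, texts, max_retries=6, base_delay=1.0):
    """Call embed_fn(texts), retrying with exponential backoff plus jitter
    when it raises (the approach from OpenAI's rate-limit cookbook).
    In real code you would catch only rate-limit errors, not Exception."""
    for attempt in range(max_retries):
        try:
            return embed_fn(texts)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)
```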
I'll try to implement a fix that limits the rate of requests made per minute (as if the langchain community doesn't already have one somewhere).
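A client-side requests-per-minute throttle could look roughly like this. A naive sketch, not something LangChain ships; the trailing 60-second window is an assumption, and the clock/sleep hooks are only there to make it testable:

```python
import time

class MinuteRateLimiter:
    """Block until fewer than max_per_minute calls have happened in the
    trailing 60-second window. Call wait() before each API request."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.max_per_minute = max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.calls = []  # timestamps of recent calls

    def wait(self):
        now = self.clock()
        # Drop timestamps older than the window.
        self.calls = [t for t in self.calls if now - t < 60.0]
        if len(self.calls) >= self.max_per_minute:
            # Sleep until the oldest call ages out of the window.
            self.sleep(60.0 - (now - self.calls[0]))
            self.calls = self.calls[1:]
        self.calls.append(self.clock())
```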
Getting this for FAISS.from_documents(data, embeddings):
Traceback (most recent call last):
File "/app/scheduler/4_generate_embeddings.py", line 52, in <module>
vectors = FAISS.from_documents(data, embeddings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/vectorstores/base.py", line 332, in from_documents
return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 517, in from_texts
embeddings = embedding.embed_documents(texts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 452, in embed_documents
return self._get_len_safe_embeddings(texts, engine=self.deployment)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 302, in _get_len_safe_embeddings
response = embed_with_retry(
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 97, in embed_with_retry
return _embed_with_retry(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
return self(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 325, in iter
raise retry_exc.reraise()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 158, in reraise
raise self.last_attempt.result()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/embeddings/openai.py", line 95, in _embed_with_retry
return embeddings.client.create(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/api_resources/embedding.py", line 33, in create
response = super().create(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 230, in request
resp, got_stream = self._interpret_response(result, stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 624, in _interpret_response
self._interpret_response_line(
File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 687, in _interpret_response_line
raise self.handle_error_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/api_requestor.py", line 337, in handle_error_response
raise error.APIError(
openai.error.APIError: Invalid response object from API: '{ "statusCode": 500, "message": "Internal server error", "activityId": "......." }' (HTTP response code was 500)
Getting the same error using Azure OpenAI with openai.api_version = "2023-05-15"
Creating my embeddings:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(
    chunk_size=1,
    openai_api_version=openai.api_version,
    openai_api_key=openai.api_key,
    openai_api_type=openai.api_type,
    openai_api_base=openai.api_base,
    deployment="ChatGPTEmbeddings",
    model="text-embedding-ada-002",
)
Creating vector store index:
index = VectorstoreIndexCreator(
embedding = embeddings,
vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])
Receiving this error in a loop; the cell ran for 1 minute and 51 seconds.
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: Requests to the Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. Operation under Azure OpenAI API version 2023-05-15 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 1 second. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit..
OpenAI's developer experience is pretty frustrating.
There are two possible solutions:
- You can request a quota increase; this has been confirmed by MS support.
- Starting from July, Azure OpenAI supports embedding with a chunk size of 16. You can find detailed usage information at the following reference: https://m.bilibili.com/video/BV1oP411r7g6 (unfortunately, the relevant part of the video at 1:30 is in Chinese only).
I also hit this problem, but when I retried later the bug disappeared. SOS
I'm having this issue today, but not yesterday. 2023-08-08 14:56:18 INFO error_code=429 error_message='Requests to the Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms. Operation under Azure OpenAI API version 2023-05-15 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 2 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.' error_param=None error_type=None message='OpenAI API error received' stream_error=False
2023-08-08 14:56:18 WARNING Retrying langchain.embeddings.openai.embed_with_retry.
I also hit this problem, but when I retried later the bug disappeared. SOS

Can you please share the source code you used?
I also hit this problem, but when I retried later the bug disappeared. SOS

Can you please share the source code you used?
My error report is slightly different from the title. I think it was a network problem triggered through the langchain library; I cannot reproduce it with my previous code.
The error looked like this:
Retrying langchain.embeddings.openai.embed_with_retry.
The bug hasn't happened again and doesn't affect me. Thank you~
The OpenAI API rate limit is a big problem. But OpenAI embeddings are not the best anyway, so it can make sense to just use a free one (see https://huggingface.co/spaces/mteb/leaderboard).
Define the following values in the code 👍:
openai.api_type = "azure"
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_KEY"] = "your api key"
os.environ["OPENAI_API_BASE"] = "put yours"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview"
llm = AzureOpenAI(
    api_key="your api key",
    api_base="put yours",
    api_version="2023-03-15-preview",
    deployment_name="name of the deployment",
)
llm_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", chunk_size=1)
This will definitely work with Chroma and FAISS DBs.
Has anyone solved the issue?
@elorberb Define the following values in the code 👍:
openai.api_type = "azure"
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["OPENAI_API_KEY"] = "your api key"
os.environ["OPENAI_API_BASE"] = "put yours"
os.environ["OPENAI_API_VERSION"] = "2023-03-15-preview"
llm = AzureOpenAI(
    api_key="your api key",
    api_base="put yours",
    api_version="2023-03-15-preview",
    deployment_name="name of the deployment",
)
llm_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", chunk_size=1)
To improve embedding performance, you can set chunk_size to 16, but first you need to update the API version to "2023-07-01-preview":
os.environ["OPENAI_API_VERSION"] = "2023-07-01-preview"
and the deployment name should not be forgotten:
embeddings = OpenAIEmbeddings(
deployment="your deployment name",
model="text-embedding-ada-002",
chunk_size=16
)
What does 16 actually do here?
It's the batch size: up to 16 texts are sent to the embeddings endpoint in a single request instead of one at a time.
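More precisely, chunk_size batches the input texts so each API request carries at most that many inputs. Conceptually something like this (a sketch, not LangChain's exact implementation):

```python
def batched(texts, chunk_size=16):
    """Split texts into consecutive batches of at most chunk_size items.
    Each batch then becomes one request to the embeddings endpoint, so a
    larger chunk_size means fewer requests against the per-minute limit."""
    return [texts[i:i + chunk_size] for i in range(0, len(texts), chunk_size)]
```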