azure-search-openai-demo
APIError Invalid response object from API: '{ "statusCode": 401, "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired." }' (HTTP response code was 401))>
This issue is for a: (mark with an x)
- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Minimal steps to reproduce
After ingesting a large number of documents (about 100) with prepdocs, I receive this error. It occurs after being rate limited: the run usually recovers from the first round of rate limiting, but after a second round I run into this error. I have removed max retries and adjusted the timeout in the AzureDeveloperCliCredential call, but that does not seem to help.
Any log messages given by the failure
APIError Invalid response object from API: '{ "statusCode": 401, "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired." }' (HTTP response code was 401))>
Expected/desired behavior
The program should recover after being rate limited and not lose its credentials.
OS and Version?
Windows 11
azd version?
run `azd version` and copy paste here.
azd version 1.1.0 (commit ea9cb12575734ee6a5f99c4d415c1a51d6f32d3e)
Versions
Using latest commit from repo.
Mention any other details that might be useful
Any guidance would be appreciated!
I am experiencing this same issue
My colleagues and I are experiencing a similar issue.
This is basically the same issue as https://github.com/Azure-Samples/azure-search-openai-demo/issues/431 (please see that issue for some suggestions).
Same for me. I am trying a large PDF with 100+ pages.
We don't have an elegant solution yet, but I think a workaround is to figure out when your token is expiring, and put code in to regenerate the token, such as is done here:
https://github.com/Azure-Samples/azure-search-openai-demo/blob/52abf79fdf545fc29c5c9159b88a6ff9010f4ed2/app/backend/app.py#L128
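For anyone trying this workaround, here is a minimal sketch of the "check the expiry, then regenerate" idea. It is not the code at the linked commit. `Token`, `RefreshingToken`, `fetch`, and `margin_seconds` are hypothetical names; `Token` only mirrors the `token` / `expires_on` fields of `azure.core.credentials.AccessToken`, and `fetch` stands in for whatever call you use to obtain a new token from your credential.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    """Hypothetical stand-in with the same fields as azure.core.credentials.AccessToken."""
    token: str
    expires_on: int  # Unix timestamp in seconds


class RefreshingToken:
    """Caches a token and fetches a new one shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Token],
        margin_seconds: int = 300,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch            # assumption: wraps your credential's token call
        self._margin = margin_seconds  # refresh when less than this much lifetime is left
        self._clock = clock            # injectable so the logic can be tested
        self._current: Optional[Token] = None

    def get(self) -> str:
        """Return a token string that is valid for at least margin_seconds."""
        remaining = None if self._current is None else self._current.expires_on - self._clock()
        if remaining is None or remaining < self._margin:
            self._current = self._fetch()
        return self._current.token
```

The idea would be to call `get()` right before each request (for example, before each embeddings batch) rather than once at startup, so a long ingestion run never sends a token that is about to expire.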
Thanks for the idea! I will try to implement this on my end.
A fix has been merged for this error. The fix refreshes the token every 5 minutes. Please re-open if you still encounter it during prepdocs.py.
Note that you will encounter it in production if you have more users than your TPM allows. We default the TPM to 30 in main.bicep, but you likely want to increase to the max if deploying for production.
Hi there, I am still getting this issue. I am trying to upload about 20,000 documents (each less than a page), and after 15-16 minutes or so this error pops up. This is with the demo version from December 15th, 2023.
It errors out after about an hour for me, on PDFs with several pages each, and then it times out. So I'm having to re-run prepdocs.py to capture the entire library I need to upload (1600+ docs total).
Can you share the traceback you see? I wonder if it's a different token timing out.
Absolutely, here's what I get below. It seems to happen hourly. I ran 'azd auth login' before re-running prepdocs.py to make sure I'm authenticated.
Traceback (most recent call last):
File "C:\AI_Project\FlexLibraryV2\scripts\prepdocs.py", line 256, in <module>
loop.run_until_complete(main(file_strategy, azd_credential, args))
File "C:\Users\steve41320sa\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\prepdocs.py", line 131, in main
await strategy.run(search_info)
File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\filestrategy.py", line 63, in run
await search_manager.update_content(sections)
File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\searchmanager.py", line 150, in update_content
embeddings = await self.embeddings.create_embeddings(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\embeddings.py", line 116, in create_embeddings
return await self.create_embedding_batch(texts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\embeddings.py", line 87, in create_embedding_batch
async for attempt in AsyncRetrying(
File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\tenacity\_asyncio.py", line 71, in __anext__
do = self.iter(retry_state=self._retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\tenacity\__init__.py", line 314, in iter
return fut.result()
^^^^^^^^^^^^
File "C:\Users\steve41320sa\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "C:\Users\steve41320sa\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result
raise self._exception
File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\embeddings.py", line 94, in create_embedding_batch
emb_response = await client.embeddings.create(model=self.open_ai_model_name, input=batch.texts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\resources\embeddings.py", line 198, in create
return await self._post(
^^^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\_base_client.py", line 1542, in post
return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\_base_client.py", line 1316, in request
return await self._request(
^^^^^^^^^^^^^^^^^^^^
File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\_base_client.py", line 1368, in _request
raise self._make_status_error_from_response(err.response) from None
openai.AuthenticationError: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.'}
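In the traceback above, the 401 surfaces through the retry loop around `client.embeddings.create` rather than being retried. One possible mitigation, sketched below under assumptions, is to treat an authentication failure as a signal to refresh credentials once and try again. `AuthExpiredError`, `call_with_reauth`, `call`, and `refresh` are hypothetical names for illustration; in real code the exception would be whatever your client raises on a 401 (here, `openai.AuthenticationError`), and `refresh` would rebuild the token or client.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class AuthExpiredError(Exception):
    """Hypothetical stand-in for a 401 error such as openai.AuthenticationError."""


async def call_with_reauth(
    call: Callable[[], Awaitable[T]],
    refresh: Callable[[], Awaitable[None]],
    max_refreshes: int = 1,
) -> T:
    """Run call(); on an auth failure, refresh credentials and try again."""
    for attempt in range(max_refreshes + 1):
        try:
            return await call()
        except AuthExpiredError:
            if attempt == max_refreshes:
                raise  # refreshing did not help, so surface the 401
            await refresh()
    raise AssertionError("unreachable")
```

Capping the number of refreshes keeps a genuinely invalid credential (wrong audience, missing role assignment) from looping forever; only an expired token should be fixed by a refresh.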
Hello @stbere, were you able to resolve the error? I'm getting a similar error while using AzureOpenAI Embeddings to create a vector store in CosmosDB.
`vectorstore_page = AzureCosmosDBVectorSearch.from_documents(loaded_doc, azure_embeddings, collection=collection_page, index_name=INDEX_NAME)`
`AuthenticationError: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com/), or have expired.'}`
@pamelafox here's the traceback I'm getting. The PDF I am trying to index is 7000 pages.
Traceback (most recent call last):
File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 89, in run
await self._poll()
File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 118, in _poll
await self.update_status()
File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 141, in update_status
_raise_if_bad_http_status_and_method(self._pipeline_response.http_response)
File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/base_polling.py", line 156, in _raise_if_bad_http_status_and_method
raise BadStatus("Invalid return status {!r} for {!r} operation".format(code, response.request.method))
azure.core.polling.base_polling.BadStatus: Invalid return status 401 for 'GET' operation
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/xxx/Code/azure-search-openai-demo/./app/backend/prepdocs.py", line 494, in <module>
loop.run_until_complete(main(ingestion_strategy, setup_index=not args.remove and not args.removeall))
File "/opt/homebrew/Cellar/[email protected]/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/Users/xxx/Code/azure-search-openai-demo/./app/backend/prepdocs.py", line 225, in main
await strategy.run()
File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/filestrategy.py", line 84, in run
sections = await parse_file(file, self.file_processors, self.category, self.image_embeddings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/filestrategy.py", line 26, in parse_file
pages = [page async for page in processor.parser.parse(content=file.content)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/filestrategy.py", line 26, in <listcomp>
pages = [page async for page in processor.parser.parse(content=file.content)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/pdfparser.py", line 57, in parse
form_recognizer_results = await poller.result()
^^^^^^^^^^^^^^^^^^^^^
File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/_async_poller.py", line 179, in result
await self.wait()
File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/_async_poller.py", line 191, in wait
await self._polling_method.run()
File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 93, in run
raise HttpResponseError(response=self._pipeline_response.http_response, error=err) from err
azure.core.exceptions.HttpResponseError: (None) Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.
Code: None
Message: Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.