azure-search-openai-demo

APIError Invalid response object from API: '{ "statusCode": 401, "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired." }' (HTTP response code was 401))>

Open chip-davis opened this issue 2 years ago • 26 comments

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

After ingesting a large number of documents (around 100) with prepdocs, I receive this error. It occurs after being rate limited. The script usually recovers from the first round of rate limiting, but after a second round I run into this error. I have removed max retries and adjusted the timeout in the AzureDeveloperCliCredential call, but that does not seem to help.

Any log messages given by the failure

APIError Invalid response object from API: '{ "statusCode": 401, "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired." }' (HTTP response code was 401))>

Expected/desired behavior

The program should recover after being rate limited and not lose its credentials.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Windows 11

azd version?

run azd version and copy paste here.

azd version 1.1.0 (commit ea9cb12575734ee6a5f99c4d415c1a51d6f32d3e)

Versions

Using latest commit from repo.

Mention any other details that might be useful

Any guidance would be appreciated!

chip-davis avatar Jul 25 '23 18:07 chip-davis

I am experiencing this same issue.

davidwboyd avatar Jul 26 '23 20:07 davidwboyd

My colleagues and I are experiencing a similar issue.

Pked01 avatar Jul 27 '23 13:07 Pked01

This is basically the same issue as https://github.com/Azure-Samples/azure-search-openai-demo/issues/431. Please see that issue for some suggestions.

pamelafox avatar Jul 28 '23 13:07 pamelafox

Same for me. I tried a large PDF with 100+ pages.

yak18t avatar Aug 02 '23 12:08 yak18t

We don't have an elegant solution yet, but I think a workaround is to figure out when your token is expiring, and put code in to regenerate the token, such as is done here:

https://github.com/Azure-Samples/azure-search-openai-demo/blob/52abf79fdf545fc29c5c9159b88a6ff9010f4ed2/app/backend/app.py#L128
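
A minimal sketch of that workaround, not the code at the link above: cache the token and fetch a new one shortly before it expires. `RefreshingToken`, `fetch`, and `margin_seconds` are hypothetical names chosen for illustration. The only assumption about the credential is that it returns an object with `token` and `expires_on` fields, which is the shape azure-identity credentials return from `get_token`.

```python
import time
from typing import Callable, NamedTuple, Optional


class AccessToken(NamedTuple):
    """Same shape as the token object azure-identity credentials return."""
    token: str
    expires_on: int  # Unix timestamp, in seconds


class RefreshingToken:
    """Hypothetical helper: caches a token and refetches it shortly before expiry."""

    def __init__(
        self,
        fetch: Callable[[], AccessToken],
        margin_seconds: int = 300,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._margin = margin_seconds
        self._clock = clock
        self._cached: Optional[AccessToken] = None

    def get(self) -> str:
        # Refetch when there is no token yet, or it expires within the margin.
        if self._cached is None or self._cached.expires_on - self._clock() < self._margin:
            self._cached = self._fetch()
        return self._cached.token
```

With a real credential, `fetch` could be something like `lambda: credential.get_token("https://cognitiveservices.azure.com/.default")`, and `get()` would be called before each OpenAI request so that a long ingestion run does not keep using an expired token.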

pamelafox avatar Aug 02 '23 20:08 pamelafox

> We don't have an elegant solution yet, but I think a workaround is to figure out when your token is expiring, and put code in to regenerate the token, such as is done here:
>
> https://github.com/Azure-Samples/azure-search-openai-demo/blob/52abf79fdf545fc29c5c9159b88a6ff9010f4ed2/app/backend/app.py#L128

Thanks for the idea! I will try to implement this on my end.

chip-davis avatar Aug 03 '23 01:08 chip-davis

A fix has been merged for this error. The fix refreshes the token every 5 minutes. Please re-open if you still encounter it during prepdocs.py.

Note that you will encounter it in production if you have more users than your TPM (tokens per minute) quota allows. We default the TPM to 30 in main.bicep, but you will likely want to increase it to the maximum if deploying for production.

pamelafox avatar Aug 24 '23 13:08 pamelafox

Hi there, I am still getting this issue. I am trying to upload about 20,000 documents (each less than a page), and after 15-16 minutes or so this error pops up. This is with the demo version from December 15th, 2023.

rubikron avatar Dec 16 '23 19:12 rubikron

It errors out after about an hour for me: PDFs with several pages, then it timed out! So I'm having to re-run prepdocs.py to capture the entire library I need to upload (1600+ docs total).

stbere avatar Jan 24 '24 20:01 stbere

Can you share the traceback you see? I wonder if it's a different token timing out.

pamelafox avatar Jan 24 '24 23:01 pamelafox

> Can you share the traceback you see? I wonder if it's a different token timing out.

Absolutely, here's what I get below. It seems to have happened hourly. I ran 'azd auth login' before re-running prepdocs.py to make sure I'm authenticated.

Traceback (most recent call last):
  File "C:\AI_Project\FlexLibraryV2\scripts\prepdocs.py", line 256, in <module>
    loop.run_until_complete(main(file_strategy, azd_credential, args))
  File "C:\Users\steve41320sa\AppData\Local\Programs\Python\Python311\Lib\asyncio\base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\prepdocs.py", line 131, in main
    await strategy.run(search_info)
  File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\filestrategy.py", line 63, in run
    await search_manager.update_content(sections)
  File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\searchmanager.py", line 150, in update_content
    embeddings = await self.embeddings.create_embeddings(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\embeddings.py", line 116, in create_embeddings
    return await self.create_embedding_batch(texts)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\embeddings.py", line 87, in create_embedding_batch
    async for attempt in AsyncRetrying(
  File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\tenacity\_asyncio.py", line 71, in __anext__
    do = self.iter(retry_state=self._retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\tenacity\__init__.py", line 314, in iter
    return fut.result()
           ^^^^^^^^^^^^
  File "C:\Users\steve41320sa\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\steve41320sa\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "C:\AI_Project\FlexLibraryV2\scripts\prepdocslib\embeddings.py", line 94, in create_embedding_batch
    emb_response = await client.embeddings.create(model=self.open_ai_model_name, input=batch.texts)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\resources\embeddings.py", line 198, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\_base_client.py", line 1542, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\_base_client.py", line 1316, in request
    return await self._request(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\AI_Project\FlexLibraryV2\scripts\.venv\Lib\site-packages\openai\_base_client.py", line 1368, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.AuthenticationError: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.'}
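
In this traceback the 401 is raised by `client.embeddings.create` inside the retry loop and propagates all the way out of prepdocs.py. One possible way to handle that, sketched here under assumptions and not taken from the repository, is to treat an authentication failure as a signal to refresh the token and try the call once more. `AuthError` is a stand-in for `openai.AuthenticationError`, and `call_with_reauth`, `call`, and `refresh` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class AuthError(Exception):
    """Stand-in for openai.AuthenticationError (the 401 in the traceback above)."""


async def call_with_reauth(
    call: Callable[[], Awaitable[T]],
    refresh: Callable[[], Awaitable[None]],
    max_refreshes: int = 1,
) -> T:
    """Run `call`; on an auth failure, refresh the credentials and try again."""
    for _ in range(max_refreshes):
        try:
            return await call()
        except AuthError:
            # The token was rejected: fetch a new one before the next attempt.
            await refresh()
    # Final attempt: a failure here propagates to the caller.
    return await call()
```

Capping the number of refreshes keeps a genuinely invalid credential (wrong audience, missing role assignment) from looping forever, so only an expired token is recovered from.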

stbere avatar Jan 25 '24 16:01 stbere

Hello @stbere, were you able to resolve the error? I'm getting a similar error when using Azure OpenAI embeddings to create a vector store in Cosmos DB.

`vectorstore_page = AzureCosmosDBVectorSearch.from_documents(loaded_doc, azure_embeddings, collection=collection_page, index_name=INDEX_NAME)`

`AuthenticationError: Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com/), or have expired.'}`

Rishabh7121999 avatar Feb 27 '24 06:02 Rishabh7121999

@pamelafox here's the traceback I'm getting. The PDF I am trying to index is 7000 pages.

Traceback (most recent call last):
  File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 89, in run
    await self._poll()
  File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 118, in _poll
    await self.update_status()
  File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 141, in update_status
    _raise_if_bad_http_status_and_method(self._pipeline_response.http_response)
  File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/base_polling.py", line 156, in _raise_if_bad_http_status_and_method
    raise BadStatus("Invalid return status {!r} for {!r} operation".format(code, response.request.method))
azure.core.polling.base_polling.BadStatus: Invalid return status 401 for 'GET' operation

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/xxx/Code/azure-search-openai-demo/./app/backend/prepdocs.py", line 494, in <module>
    loop.run_until_complete(main(ingestion_strategy, setup_index=not args.remove and not args.removeall))
  File "/opt/homebrew/Cellar/[email protected]/3.11.9/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/xxx/Code/azure-search-openai-demo/./app/backend/prepdocs.py", line 225, in main
    await strategy.run()
  File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/filestrategy.py", line 84, in run
    sections = await parse_file(file, self.file_processors, self.category, self.image_embeddings)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/filestrategy.py", line 26, in parse_file
    pages = [page async for page in processor.parser.parse(content=file.content)]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/filestrategy.py", line 26, in <listcomp>
    pages = [page async for page in processor.parser.parse(content=file.content)]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/Code/azure-search-openai-demo/app/backend/prepdocslib/pdfparser.py", line 57, in parse
    form_recognizer_results = await poller.result()
                              ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/_async_poller.py", line 179, in result
    await self.wait()
  File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/_async_poller.py", line 191, in wait
    await self._polling_method.run()
  File "/Users/xxx/Code/azure-search-openai-demo/.venv/lib/python3.11/site-packages/azure/core/polling/async_base_polling.py", line 93, in run
    raise HttpResponseError(response=self._pipeline_response.http_response, error=err) from err
azure.core.exceptions.HttpResponseError: (None) Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.
Code: None
Message: Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired.

mikedizon avatar May 11 '24 14:05 mikedizon