llama_index
[Bug]: `Azure AI Search` key error when using `AzureOpenAI` (not `openAI`)
Bug Description
`Azure AI Search` raises a key error when using `AzureOpenAI` (plain `OpenAI` works well). This is the case where Azure AI Search sends the text to AzureOpenAI for embedding, which is the typical setup when hosting the solution on Azure.
Version
latest version
Steps to Reproduce
Run the notebook with AzureOpenAI key https://docs.llamaindex.ai/en/stable/examples/vector_stores/CognitiveSearchIndexDemo.html
Relevant Logs/Tracebacks
WARNING:llama_index.llms.openai_utils:Retrying llama_index.embeddings.openai.get_embeddings in 0.252567122539999 seconds as it raised AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: e06832d6********************a91a. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}.
WARNING:llama_index.llms.openai_utils:Retrying llama_index.embeddings.openai.get_embeddings in 0.01923334413891431 seconds as it raised AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: e06832d6********************a91a. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}.
WARNING:llama_index.llms.openai_utils:Retrying llama_index.embeddings.openai.get_embeddings in 2.6401476568198516 seconds as it raised AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: e06832d6********************a91a. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}.
WARNING:llama_index.llms.openai_utils:Retrying llama_index.embeddings.openai.get_embeddings in 2.13111231183456 seconds as it raised AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: e06832d6********************a91a. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}.
WARNING:llama_index.llms.openai_utils:Retrying llama_index.embeddings.openai.get_embeddings in 0.002929235961280696 seconds as it raised AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: e06832d6********************a91a. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}.
---------------------------------------------------------------------------
AuthenticationError Traceback (most recent call last)
<ipython-input-7-973c7c481efc> in <cell line: 17>()
15 storage_context = StorageContext.from_defaults(vector_store=vector_store)
16 service_context = ServiceContext.from_defaults(embed_model=embed_model)
---> 17 index = VectorStoreIndex.from_documents(
18 documents, storage_context=storage_context, service_context=service_context
19 )
/usr/local/lib/python3.10/dist-packages/llama_index/indices/base.py in from_documents(cls, documents, storage_context, service_context, show_progress, **kwargs)
110 )
111
--> 112 return cls(
113 nodes=nodes,
114 storage_context=storage_context,
/usr/local/lib/python3.10/dist-packages/llama_index/indices/vector_store/base.py in __init__(self, nodes, objects, index_struct, service_context, storage_context, use_async, store_nodes_override, insert_batch_size, show_progress, **kwargs)
51 self._store_nodes_override = store_nodes_override
52 self._insert_batch_size = insert_batch_size
---> 53 super().__init__(
54 nodes=nodes,
55 index_struct=index_struct,
/usr/local/lib/python3.10/dist-packages/llama_index/indices/base.py in __init__(self, nodes, objects, index_struct, storage_context, service_context, show_progress, **kwargs)
73 if index_struct is None:
74 nodes = nodes or []
---> 75 index_struct = self.build_index_from_nodes(
76 nodes + objects # type: ignore
77 )
/usr/local/lib/python3.10/dist-packages/llama_index/indices/vector_store/base.py in build_index_from_nodes(self, nodes, **insert_kwargs)
272 )
273
--> 274 return self._build_index_from_nodes(nodes, **insert_kwargs)
275
276 def _insert(self, nodes: Sequence[BaseNode], **insert_kwargs: Any) -> None:
/usr/local/lib/python3.10/dist-packages/llama_index/indices/vector_store/base.py in _build_index_from_nodes(self, nodes, **insert_kwargs)
244 run_async_tasks(tasks)
245 else:
--> 246 self._add_nodes_to_index(
247 index_struct,
248 nodes,
/usr/local/lib/python3.10/dist-packages/llama_index/indices/vector_store/base.py in _add_nodes_to_index(self, index_struct, nodes, show_progress, **insert_kwargs)
197
198 for nodes_batch in iter_batch(nodes, self._insert_batch_size):
--> 199 nodes_batch = self._get_node_with_embedding(nodes_batch, show_progress)
200 new_ids = self._vector_store.add(nodes_batch, **insert_kwargs)
201
/usr/local/lib/python3.10/dist-packages/llama_index/indices/vector_store/base.py in _get_node_with_embedding(self, nodes, show_progress)
105
106 """
--> 107 id_to_embed_map = embed_nodes(
108 nodes, self._service_context.embed_model, show_progress=show_progress
109 )
/usr/local/lib/python3.10/dist-packages/llama_index/indices/utils.py in embed_nodes(nodes, embed_model, show_progress)
135 id_to_embed_map[node.node_id] = node.embedding
136
--> 137 new_embeddings = embed_model.get_text_embedding_batch(
138 texts_to_embed, show_progress=show_progress
139 )
/usr/local/lib/python3.10/dist-packages/llama_index/core/embeddings/base.py in get_text_embedding_batch(self, texts, show_progress, **kwargs)
254 payload={EventPayload.SERIALIZED: self.to_dict()},
255 ) as event:
--> 256 embeddings = self._get_text_embeddings(cur_batch)
257 result_embeddings.extend(embeddings)
258 event.on_end(
/usr/local/lib/python3.10/dist-packages/llama_index/embeddings/openai.py in _get_text_embeddings(self, texts)
384 """
385 client = self._get_client()
--> 386 return get_embeddings(
387 client,
388 texts,
/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in wrapped_f(*args, **kw)
287 @functools.wraps(f)
288 def wrapped_f(*args: t.Any, **kw: t.Any) -> t.Any:
--> 289 return self(f, *args, **kw)
290
291 def retry_with(*args: t.Any, **kwargs: t.Any) -> WrappedFn:
/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in __call__(self, fn, *args, **kwargs)
377 retry_state = RetryCallState(retry_object=self, fn=fn, args=args, kwargs=kwargs)
378 while True:
--> 379 do = self.iter(retry_state=retry_state)
380 if isinstance(do, DoAttempt):
381 try:
/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in iter(self, retry_state)
323 retry_exc = self.retry_error_cls(fut)
324 if self.reraise:
--> 325 raise retry_exc.reraise()
326 raise retry_exc from fut.exception()
327
/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in reraise(self)
156 def reraise(self) -> t.NoReturn:
157 if self.last_attempt.failed:
--> 158 raise self.last_attempt.result()
159 raise self
160
/usr/lib/python3.10/concurrent/futures/_base.py in result(self, timeout)
449 raise CancelledError()
450 elif self._state == FINISHED:
--> 451 return self.__get_result()
452
453 self._condition.wait(timeout)
/usr/lib/python3.10/concurrent/futures/_base.py in __get_result(self)
401 if self._exception:
402 try:
--> 403 raise self._exception
404 finally:
405 # Break a reference cycle with the exception in self._exception
/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in __call__(self, fn, *args, **kwargs)
380 if isinstance(do, DoAttempt):
381 try:
--> 382 result = fn(*args, **kwargs)
383 except BaseException: # noqa: B902
384 retry_state.set_exception(sys.exc_info()) # type: ignore[arg-type]
/usr/local/lib/python3.10/dist-packages/llama_index/embeddings/openai.py in get_embeddings(client, list_of_text, engine, **kwargs)
160 list_of_text = [text.replace("\n", " ") for text in list_of_text]
161
--> 162 data = client.embeddings.create(input=list_of_text, model=engine, **kwargs).data
163 return [d.embedding for d in data]
164
/usr/local/lib/python3.10/dist-packages/openai/resources/embeddings.py in create(self, input, model, encoding_format, user, extra_headers, extra_query, extra_body, timeout)
106 return obj
107
--> 108 return self._post(
109 "/embeddings",
110 body=maybe_transform(params, embedding_create_params.EmbeddingCreateParams),
/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in post(self, path, cast_to, body, options, files, stream, stream_cls)
1177 method="post", url=path, json_data=body, files=to_httpx_files(files), **options
1178 )
-> 1179 return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
1180
1181 def patch(
/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in request(self, cast_to, options, remaining_retries, stream, stream_cls)
866 stream_cls: type[_StreamT] | None = None,
867 ) -> ResponseT | _StreamT:
--> 868 return self._request(
869 cast_to=cast_to,
870 options=options,
/usr/local/lib/python3.10/dist-packages/openai/_base_client.py in _request(self, cast_to, options, remaining_retries, stream, stream_cls)
957
958 log.debug("Re-raising status error")
--> 959 raise self._make_status_error_from_response(err.response) from None
960
961 return self._process_response(
AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: e06832d6********************a91a. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
🤖
Hello @trungtly! I'm Dosu, a friendly bot here to help you with your LlamaIndex issues while we wait for a human maintainer. I'm here to assist with bug solving, answer queries, and guide you on your journey as a contributor. I'm currently looking into your issue and I'll get back to you with a thorough response as soon as possible. Stay tuned!
You probably need to set your embed_model as `AzureOpenAIEmbedding(api_key = **)` and pass it to the service context.
That is what I did but it still raises the same error. Below is the full code for replication.
# set up Azure Cognitive Search
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential
import openai
# Index name to use
index_name = "test-index"
# Use index client to demonstrate creating an index
index_client = SearchIndexClient(
endpoint=settings["AZUREAI_SEARCH_ENDPOINT"],
credential=AzureKeyCredential(settings["AZUREAI_SEARCH_KEY"]),
)
# Use search client to demonstrate using an existing index
search_client = SearchClient(
endpoint=settings["AZUREAI_SEARCH_ENDPOINT"],
index_name=index_name,
credential=AzureKeyCredential(settings["AZUREAI_SEARCH_KEY"]),
)
# create index
from azure.search.documents import SearchClient
from llama_index.vector_stores import CognitiveSearchVectorStore
from llama_index.vector_stores.cogsearch import (
IndexManagement,
MetadataIndexFieldType,
CognitiveSearchVectorStore,
)
# Example of a complex mapping: metadata field 'theme' is mapped to a differently named index field 'topic', with its type explicitly set
metadata_fields = {
"author": "author",
}
# A simplified metadata specification is available if all metadata and index fields are similarly named
# metadata_fields = {"author", "theme", "director"}
vector_store = CognitiveSearchVectorStore(
search_or_index_client=index_client,
index_name=index_name,
filterable_metadata_field_keys=metadata_fields,
index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
id_field_key="id",
chunk_field_key="content",
embedding_field_key="embedding",
metadata_string_field_key="li_jsonMetadata",
doc_id_field_key="li_doc_id",
)
# define embedding function
from llama_index.embeddings import AzureOpenAIEmbedding
from llama_index import (
SimpleDirectoryReader,
StorageContext,
ServiceContext,
VectorStoreIndex,
)
openai.api_key = "sk-..." # OpenAI key will work
#openai.api_key = settings["OPENAI_API_KEY_EMBEDDING"] # Azure OpenAI key will NOT work
openai.api_type = "azure"
embed_model = AzureOpenAIEmbedding(
model="text-embedding-ada-002",
deployment_name="text-embedding-ada-002",
api_key=settings["OPENAI_API_KEY_EMBEDDING"],
azure_endpoint=settings["OPENAI_API_BASE_EMBEDDING"],
api_version=settings["OPENAI_API_VERSION"],
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
service_context = ServiceContext.from_defaults(embed_model=embed_model)
# load documents
documents = SimpleDirectoryReader("./test_data/").load_data()
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context, service_context=service_context)
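One diagnostic step that may help narrow this down: the 401 message echoes a masked key (`e06832d6********************a91a`). Comparing its visible prefix and suffix against each key you configured tells you which credential the client actually sent — i.e., whether the failing request carried the Azure key or fell back to something else. This small stdlib helper is not part of llama_index; it is just a sketch for inspecting the error message:

```python
def matches_masked_key(masked: str, candidate: str) -> bool:
    """Return True if the masked key shown in a 401 error (e.g.
    'e06832d6********************a91a') could be the candidate key:
    its visible prefix and suffix must both agree with the candidate."""
    visible = [part for part in masked.split("*") if part]
    if not visible:
        return False
    prefix, suffix = visible[0], visible[-1]
    return candidate.startswith(prefix) and candidate.endswith(suffix)

# Example: check which configured key the failing client actually sent.
masked = "e06832d6********************a91a"
print(matches_masked_key(masked, "e06832d6" + "x" * 20 + "a91a"))  # True
print(matches_masked_key(masked, "sk-" + "x" * 29))                # False
```

If the masked key matches your Azure key, the credential is reaching the service but being rejected (wrong endpoint or api_type); if it matches neither, the client is picking up a different key than the one you passed to the constructor.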
🤖
Based on the error message you're receiving, it seems like the issue is related to the API key you're using for AzureOpenAI. The error message 'Incorrect API key provided: e06832d6********************a91a' indicates that the key you're using is not being accepted by AzureOpenAI.
The `AuthenticationError` with error code 401 usually indicates an issue with the API key or endpoint. In the context of `AzureOpenAIEmbedding`, the error could be due to the following reasons:

- The `api_key` is not correctly set. This key is used for authenticating your requests to Azure AI. Make sure you have the correct key and it's properly passed to the `AzureOpenAIEmbedding` constructor.
- The `azure_endpoint` is not correctly set. This is the endpoint of your Azure OpenAI service. It should look like `https://YOUR_RESOURCE_NAME.openai.azure.com/`. Make sure you have the correct endpoint and it's properly passed to the `AzureOpenAIEmbedding` constructor.
- The `api_version` is not correctly set. This is the version of the Azure OpenAI API you are using. Make sure you have the correct version and it's properly passed to the `AzureOpenAIEmbedding` constructor.

The `validate_env` method in the `AzureOpenAIEmbedding` class checks for these values and raises a `ValueError` if they are not set. Make sure you have these values correctly set in your environment or passed to the `AzureOpenAIEmbedding` constructor.
Here is the relevant code:
def validate_env(cls, values: Dict[str, Any]) -> Dict[str, Any]:
"""Validate necessary credentials are set."""
if (
values["api_base"] == "https://api.openai.com/v1"
and values["azure_endpoint"] is None
):
raise ValueError(
"You must set OPENAI_API_BASE to your Azure endpoint. "
"It should look like https://YOUR_RESOURCE_NAME.openai.azure.com/"
)
if values["api_version"] is None:
raise ValueError("You must set OPENAI_API_VERSION for Azure OpenAI.")
return values
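The same checks can be run up front on a plain settings dict before constructing the embedding model. This is a stdlib-only sketch mirroring the validator above, not llama_index code; the dict keys are the ones `validate_env` reads, and the endpoint and version values are placeholders:

```python
def check_azure_settings(values: dict) -> dict:
    """Pre-flight check mirroring AzureOpenAIEmbedding.validate_env:
    reject the default OpenAI base URL when no Azure endpoint is set,
    and require an explicit API version."""
    if (values.get("api_base") == "https://api.openai.com/v1"
            and values.get("azure_endpoint") is None):
        raise ValueError(
            "You must set OPENAI_API_BASE to your Azure endpoint. "
            "It should look like https://YOUR_RESOURCE_NAME.openai.azure.com/"
        )
    if values.get("api_version") is None:
        raise ValueError("You must set OPENAI_API_VERSION for Azure OpenAI.")
    return values

# A settings dict that would pass the validator (placeholder values):
ok = check_azure_settings({
    "api_base": "https://my-resource.openai.azure.com/",
    "azure_endpoint": "https://my-resource.openai.azure.com/",
    "api_version": "2023-05-15",
})
```

Running this against the exact values you pass to the constructor confirms the credentials clear validation before any network call is made.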
If you have checked all these and are still getting the error, it might be an issue with the Azure service itself or the network connection. You can find this code in the llama_index/embeddings/azure_openai.py file.
As for your second question, the LlamaIndex codebase does not seem to include any specific instructions or methods for handling API key errors when using AzureOpenAI. However, the `validate_env` method mentioned above is designed to check the necessary credentials and raise an error if they are not set correctly. This can help in identifying and resolving issues related to the API key and other credentials.
Sources
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.