ValidationError with OpenAIEmbeddings in Ragas
A ValidationError occurs when trying to use the evaluate function from Ragas with the OpenAIEmbeddings model. The error message suggests using AzureOpenAIEmbeddings when using Azure, but I am using the standard OpenAI API.
To Reproduce
Steps to reproduce the behavior:
- Use the following code from the Ragas documentation: Metrics - Answer Correctness
- Run the code, and observe the error.
from datasets import Dataset
from ragas.metrics import answer_correctness
from ragas import evaluate
data_samples = {
'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset, metrics=[answer_correctness])
score.to_pandas()
Error Message
181 def embedding_factory(
182 model: str = "text-embedding-ada-002", run_config: t.Optional[RunConfig] = None
183 ) -> BaseRagasEmbeddings:
--> 184 openai_embeddings = OpenAIEmbeddings(model=model)
185 if run_config is not None:
...
343 object_setattr(__pydantic_self__, '__dict__', values)
ValidationError: 1 validation error for OpenAIEmbeddings
__root__
If you are using Azure, please use the `AzureOpenAIEmbeddings` class. (type=value_error)
Expected behavior
The code should execute without errors, generating the evaluation scores and converting them to a pandas DataFrame.
Suggested Workaround
The issue seems to originate in LangChain rather than in Ragas itself. The workaround suggested for it is to set the API type explicitly when constructing the embeddings:
embeddings = OpenAIEmbeddings(openai_api_type="openai")
Environment:
- OS: MacOS
- Python version: 3.11.9
- Ragas version: 0.1.10
- langchain_openai version: 0.1.7
- openai version: 1.36.1
Hey, @SushmitaSingh96! I'm here to help you with bugs, questions, and more while we wait for a human maintainer. Let's tackle this together!
To resolve the ValidationError when using the evaluate function from Ragas with the OpenAIEmbeddings model, ensure that you are passing the correct instances of the embeddings and LLM models. Since you are using the standard OpenAI API and not Azure, you should wrap your OpenAI embeddings and LLM instances with the appropriate wrappers provided by Ragas.
Here is an example of how to do this:
# The OpenAI classes under langchain.embeddings / langchain.llms are deprecated;
# langchain_openai is the maintained package for them.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.metrics import answer_correctness
from ragas import evaluate
from datasets import Dataset
# Initialize OpenAI embeddings and LLM
openai_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
# text-davinci-003 has been retired by OpenAI, so a chat model is used instead
openai_llm = ChatOpenAI(model="gpt-3.5-turbo")
# Wrap them with Ragas wrappers
wrapped_embeddings = LangchainEmbeddingsWrapper(openai_embeddings)
wrapped_llm = LangchainLLMWrapper(openai_llm)
# Prepare your data
data_samples = {
'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
'ground_truth': ['The first superbowl was held on January 15, 1967', 'The New England Patriots have won the Super Bowl a record six times']
}
dataset = Dataset.from_dict(data_samples)
# Perform evaluation
score = evaluate(dataset, metrics=[answer_correctness], llm=wrapped_llm, embeddings=wrapped_embeddings)
print(score.to_pandas())
This code ensures that the OpenAI embeddings and LLM are correctly wrapped and passed to the evaluate function, which should resolve the ValidationError [1][2].
Thanks for reporting this, @SushmitaSingh96. The fix you suggested should help, but I'm not sure why I can't reproduce the error on my side.
Have you set OPENAI_API_TYPE in your environment?
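A quick way to answer that question is to inspect the variable from the same Python environment that runs the evaluation. The snippet below is only a minimal sketch: the helper name `check_openai_api_type` is made up for illustration, and the set of values treated as "Azure" is an assumption about what LangChain checks for, not something confirmed in this thread.

```python
import os

# Assumption: these are the values that make LangChain treat the client as Azure.
AZURE_API_TYPES = {"azure", "azure_ad", "azuread"}


def check_openai_api_type(environ=os.environ):
    """Return a short diagnosis of the OPENAI_API_TYPE environment variable."""
    value = environ.get("OPENAI_API_TYPE")
    if value is None:
        return "OPENAI_API_TYPE is not set"
    # Case-insensitive comparison is a choice of this helper, not of LangChain.
    if value.strip().lower() in AZURE_API_TYPES:
        return f"OPENAI_API_TYPE={value!r} looks like an Azure setting"
    return f"OPENAI_API_TYPE={value!r}"


print(check_openai_api_type())
```

If the variable turns out to hold an Azure value (for example from a shell profile or a `.env` file), unsetting it or passing `openai_api_type="openai"` explicitly, as in the workaround above, would be the things to try first.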
I encountered the same problem and tried the same code from dosubot. Even after running the exact code produced by it, I keep getting this error. Is this because of some discrepancy from my side?
pydantic_core._pydantic_core.ValidationError: 1 validation error for OpenAIEmbeddings
openai_api_key
  Extra inputs are not permitted [type=extra_forbidden, input_value='no-key', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/extra_forbidden
Hey @Nandhini25S, that is strange, but the error seems to come from this line:
openai_embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
i.e. when initializing the embedding model?
@jjmachan Yes, it looks like the error does come from that line. I've gone through the documentation and tried upgrading all the packages to their latest versions. Still no luck T_T
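Since the second traceback points at pydantic 2.10 while the original report used older package versions, it may help to post the exact installed versions alongside the error. The sketch below uses only the standard library; the helper name `installed_versions` and the choice of packages to list are illustrative assumptions, not a list anyone in this thread asked for.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as they would appear in `pip list`.
for name, version in installed_versions(
    ["ragas", "langchain-openai", "langchain", "openai", "pydantic"]
).items():
    print(f"{name}: {version or 'not installed'}")
```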