
No persona found with name : Documentation example doesn't work with minimal changes

Open codechanger0 opened this issue 7 months ago • 0 comments

[x] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
The code given in the Single Hop Query Testset documentation does not work.

Ragas version: 0.2.5
Python version: 3.12.9

Ollama LLM used: "smollm2:1.7b-instruct-fp16"; Ollama embeddings use the same model.

Code to Reproduce
Take the code from the Single Hop Query Testset documentation and make the following changes.

Since I am using an Ollama LLM, the initialization of generator_llm and generator_embeddings is changed to:

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_ollama.embeddings import OllamaEmbeddings
from langchain_ollama.llms import OllamaLLM

model_name="smollm2:1.7b-instruct-fp16"
embedding_fn = OllamaEmbeddings(model=model_name)
llm = OllamaLLM(
    model=model_name, temperature=0
) 
generator_llm = LangchainLLMWrapper(llm)
generator_embeddings = LangchainEmbeddingsWrapper(embedding_fn)

Then comes the main change: rename the personas as follows.

persona_first_time_flier = Persona(
    name="First time flight taker",                             #  <---- This line is changed
    role_description="Is flying for the first time and may feel anxious. Needs clear guidance on flight procedures, safety protocols, and what to expect throughout the journey.",
)

persona_frequent_flier = Persona(
    name="Frequently takes the flights",                        #  <---- This line is changed
    role_description="Travels regularly and values efficiency and comfort. Interested in loyalty programs, express services, and a seamless travel experience.",
)

persona_angry_business_flier = Persona(
    name="Exclusively tavels in Business Class",                #  <---- This line is changed
    role_description="Demands top-tier service and is easily irritated by any delays or issues. Expects immediate resolutions and is quick to express frustration if standards are not met.",
)

Error trace

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[16], line 1
----> 1 testset = generator.generate(testset_size=10, query_distribution=query_distibution)
      2 testset.to_pandas()

File c:\Users\codechanger0\Documents\Projects\swipesmart\.venv\Lib\site-packages\ragas\testset\synthesizers\generate.py:413, in TestsetGenerator.generate(self, testset_size, query_distribution, num_personas, run_config, batch_size, callbacks, token_usage_parser, with_debugging_logs, raise_exceptions)
    411 except Exception as e:
    412     scenario_generation_rm.on_chain_error(e)
--> 413     raise e
    414 else:
    415     scenario_generation_rm.on_chain_end(
    416         outputs={"scenario_sample_list": scenario_sample_list}
    417     )

File c:\Users\codechanger0\Documents\Projects\swipesmart\.venv\Lib\site-packages\ragas\testset\synthesizers\generate.py:410, in TestsetGenerator.generate(self, testset_size, query_distribution, num_personas, run_config, batch_size, callbacks, token_usage_parser, with_debugging_logs, raise_exceptions)
    401     exec.submit(
    402         scenario.generate_scenarios,
    403         n=splits[i],
   (...)    406         callbacks=scenario_generation_grp,
    407     )
    409 try:
--> 410     scenario_sample_list: t.List[t.List[BaseScenario]] = exec.results()
    411 except Exception as e:
    412     scenario_generation_rm.on_chain_error(e)
...
     57     if persona.name == key:
     58         return persona
---> 59 raise KeyError(f"No persona found with name '{key}'")

KeyError: "No persona found with name 'Exclusively travels in Business Class'"

Expected behavior
The code should work. Essentially, we should be able to give the personas any name; the LLM should not "correct" the names or search for unknown personas.

Additional context
On investigating the issue, the root cause was found to be in pydantic_prompt.py, in generate_multiple(). At this call,

resp = await llm.generate(
            prompt_value,
            n=n,
            temperature=temperature,
            stop=stop,
            callbacks=prompt_cb,
        )

the LLM returns persona names that differ from what was given in the prompt_value variable. The persona name supplied was "Exclusively tavels in Business Class", but this LLM call returns "Exclusively travels in Business Class". Note the misspelled word "tavels" in the code that creates the Persona object. Because of this mismatch, KeyError: "No persona found with name 'Exclusively travels in Business Class'" is thrown.
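The lookup at the bottom of the traceback is an exact string match, so any rewording by the LLM fails it. A minimal self-contained sketch of the failure mode (the Persona dataclass and get_persona helper here are simplified stand-ins for the ragas internals, not the actual library code):

```python
from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    role_description: str

def get_persona(personas: list, key: str) -> Persona:
    # Exact string comparison, mirroring the loop shown in the traceback.
    for persona in personas:
        if persona.name == key:
            return persona
    raise KeyError(f"No persona found with name '{key}'")

personas = [Persona("Exclusively tavels in Business Class", "...")]

# The LLM silently fixes the misspelling when echoing the name back,
# so the exact-match lookup no longer finds the original persona.
try:
    get_persona(personas, "Exclusively travels in Business Class")
except KeyError as exc:
    print(exc)
```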

Note: This issue is not limited to misspellings. For example, if a persona name of "First Credit Card Applier" is given, the LLM generates the persona name "First Time Credit Card Applicant".
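Until this is handled upstream, one possible client-side mitigation is to fuzzy-match the name the LLM returns against the known persona names. This is only a sketch using Python's standard difflib, not a ragas API, and the 0.8 cutoff is an arbitrary choice:

```python
import difflib

def resolve_persona_name(returned_name: str, known_names: list, cutoff: float = 0.8) -> str:
    """Map an LLM-returned persona name to the closest known persona name.

    Tolerates small rewordings such as 'tavels' -> 'travels'. The cutoff
    is a guess; tune it so distinct personas are not conflated.
    """
    matches = difflib.get_close_matches(returned_name, known_names, n=1, cutoff=cutoff)
    if not matches:
        raise KeyError(f"No persona found with name '{returned_name}'")
    return matches[0]

known = [
    "First time flight taker",
    "Frequently takes the flights",
    "Exclusively tavels in Business Class",
]

# The LLM's "corrected" spelling still resolves to the original persona name.
print(resolve_persona_name("Exclusively travels in Business Class", known))
```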

codechanger0 · May 18 '25 08:05