Metric to account for entities
Note: This is just an idea, open to discussions.
Problem: For use cases (financial, legal, healthcare, etc.) where the data contains names of people, places, organizations and so on, these entities are often important: they need to be present in the context for the model to understand and use them, or they may need to appear in the answer. There should be a metric that accounts for the inclusion of these entities in the context or the answer.
Proposed solution: I am thinking of an entity_recall metric. It can be defined in two ways:
- One is to calculate the fraction of the question/query entities that also appear in the context.
- The other is to calculate the fraction of the context entities that also appear in the answer.
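To make the overlap computation concrete, here is a minimal sketch (the function name and signature are hypothetical, not an existing ragas API). For the first variant the reference set would be the query entities and the candidate set the context entities; for the second, the reference is the context entities and the candidate is the answer entities.

```python
# Hypothetical sketch, not an existing ragas API: entity recall as set overlap.
def entity_recall(reference_entities: set[str], candidate_entities: set[str]) -> float:
    """Fraction of reference entities that also appear among the candidate entities."""
    if not reference_entities:
        return 0.0
    # Case-insensitive exact match; real NER output would need fuzzier matching.
    reference = {e.lower() for e in reference_entities}
    candidate = {e.lower() for e in candidate_entities}
    return len(reference & candidate) / len(reference)
```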
Feel free to point out if such a metric already exists or has already been ruled out.
Otherwise, if you find this interesting or worth adding, I would be happy to work on it; I already have an implementation plan in mind. @shahules786 @jjmachan
Hi @sky-2002, this is certainly an interesting metric. Could you also help me understand how someone could use this metric to improve their RAG pipeline or get more insight into its performance?
@shahules786 Okay, I will try to give an example. Consider an LLM that answers factual questions related to the stock market/finance. We would use a retrieval service to provide relevant context, and thus need an evaluation metric for this system.
Note: Just for example purposes
Example 1
Query -
What is the latest financial status of Apple Inc.?
Relevant correct context -
Apple Inc. (AAPL) recently reported its financial results for the last quarter. The company's revenue increased by 20% compared to the previous quarter, reaching $120 billion. Net profit also saw a significant rise, standing at $25 billion. Apple's CEO, Tim Cook, expressed optimism about the company's performance and outlined key strategies for the upcoming fiscal year during the earnings call.
Retrieved context -
In the world of technology, Apple is a major player. The latest iPhone release has been a huge success, contributing to the company's revenue growth. Additionally, the CEO, Tim Cook, has been actively involved in various philanthropic initiatives.
Query entities - [Apple Inc]
Relevant context entities - [Apple Inc, AAPL, $120 billion, Tim Cook]
Retrieved context entities - [Apple, Tim Cook, iPhone]
So the entity_recall over contexts is 0.5, as 2 of the 4 entities in the correct context (Apple Inc and Tim Cook) are also present in the retrieved context.
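Plugging the hand-extracted entity lists above into a quick check (with a manual alias mapping "Apple" to "Apple Inc"; a real implementation would need proper alias/fuzzy matching) reproduces the 0.5:

```python
relevant_entities = {"Apple Inc", "AAPL", "$120 billion", "Tim Cook"}
retrieved_entities = {"Apple", "Tim Cook", "iPhone"}

# Manually mapped alias; a real implementation would need entity linking instead.
aliases = {"apple": "apple inc"}
normalised = {aliases.get(e.lower(), e.lower()) for e in retrieved_entities}

hits = {e for e in relevant_entities if e.lower() in normalised}
print(hits)                                # {'Apple Inc', 'Tim Cook'}
print(len(hits) / len(relevant_entities))  # 0.5
```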
Now, I understand there is some vagueness about what to compare against. During evaluation we have four things (correct me if I am wrong here): the query, the correct context, the retrieved context, and the generated answer.
Using these, we can define entity_recall for different scenarios, e.g. (query-context), (context-answer), (query-answer), etc. The basic idea is to account for entities too, since they matter in use cases built on factual information.
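As an illustration of how the same computation could cover those scenarios, here is a rough text-level sketch using spaCy's small English model for the extraction step (purely an assumption about tooling, not part of ragas; any NER model would do):

```python
# Illustrative sketch: spaCy NER as the extraction step, reused for any pair of texts.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this spaCy model is installed

def extract_entities(text: str) -> set[str]:
    """Lower-cased surface forms of the named entities spaCy finds in the text."""
    return {ent.text.lower() for ent in nlp(text).ents}

def text_entity_recall(reference_text: str, candidate_text: str) -> float:
    """Fraction of the reference text's entities that also appear in the candidate text."""
    reference = extract_entities(reference_text)
    if not reference:
        return 0.0
    return len(reference & extract_entities(candidate_text)) / len(reference)

# The same function covers the different scenarios:
#   text_entity_recall(query, retrieved_context)   -> query-context
#   text_entity_recall(retrieved_context, answer)  -> context-answer
#   text_entity_recall(query, answer)              -> query-answer
```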
This is interesting and useful @sky-2002, would you like to work on this? I can help :)
@shahules786 Yes, for sure, I would like to contribute this. I will try creating a PR with an initial working version using a simple entity extraction model, and we can iterate from there. Feel free to suggest if you have a particular workflow in mind.
Need some pointers on how to approach this. Should I reach out via Discord DM?
Hey, sorry this fell through the cracks - tagging @shahules786 for feedback :)