feat: add faithfulness metric based on Bespoke Labs MiniCheck model
This PR adds a faithfulness metric based on the Bespoke-MiniCheck-7B model.
Users can compute the metric either by calling the model through the Bespoke Labs API or by running the model locally.
I tested that the metric works via a Colab notebook: https://colab.research.google.com/drive/1OcL8-LkeKp-_7-_8_l7ysO8O6_AIz6jd#scrollTo=Jbg0gon7uXII.
Thanks for the PR @vutrung96 , I will take a look at it shortly.
@shahules786 thanks! One thing I'm running into: for `make type`, I'm getting an import-not-found error on this line:
`import einops as einops`
I think it's because in CI, ragas is installed with a plain `pip install`, which doesn't include the optional dependencies.
@vutrung96 can you add that to the dev dependencies in requirements/dev.txt?
@jjmachan thanks for the suggestion! I've added the dependencies to requirements/dev.txt.
@shahules786 ping on review. please lmk if you need any clarifications :)
@vutrung96 I'm rethinking and reworking parts of the faithfulness metric. I'm also thinking about the best way to let users use any NLI model with it without adding model-specific code to ragas. Please bear with me on this.
Hi @vutrung96 , thanks again for the PR.
We want to enable developers to use any model of their choice, regardless of the metric they use. There are two types of models in this context:
- General-purpose models (e.g., OpenAI, Anthropic, LLaMA, etc.)
- Specialized models that are limited to one or more tasks (e.g., Vectara HHEM, Bespoke)
For the former, we already support using any model with ragas. For the latter, either we or the user currently has to modify the ragas code to integrate the specialized model. That is fine if the user does it in their own fork, but merging such code into the main ragas repository transfers the responsibility of maintaining and updating it to us (as would be the case with this PR), which is not something we can take on.
Therefore, we are introducing components that let developers plug their specialized models into metrics. This is still experimental, but it can be integrated into ragas after a few iterations and feedback: #1339
In your case, I think the model can be used as a component by wrapping it in a HuggingfacePipeline and passing that to LLMComponent. Please take a look, and perhaps we can add documentation to help users with this integration.
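To make the component idea concrete, here is a stdlib-only sketch of the plug-in pattern: any specialized NLI model works with the metric as long as it satisfies a shared scoring interface. `NLIModel`, `FakeNLIModel`, and `FaithfulnessMetric` are illustrative stand-ins, not the actual ragas classes; in ragas the model would instead be wrapped (e.g. a HuggingfacePipeline passed to LLMComponent per #1339).

```python
from typing import Protocol


class NLIModel(Protocol):
    """Interface any pluggable NLI model must satisfy (illustrative)."""

    def entailment_score(self, premise: str, hypothesis: str) -> float: ...


class FakeNLIModel:
    """Stand-in for a specialized model such as Bespoke-MiniCheck-7B."""

    def entailment_score(self, premise: str, hypothesis: str) -> float:
        # Toy heuristic so the sketch runs; a real model would do NLI.
        return 1.0 if hypothesis in premise else 0.0


class FaithfulnessMetric:
    def __init__(self, nli: NLIModel):
        # The metric only depends on the interface, not on any one model,
        # so no model-specific code lands in the metric itself.
        self.nli = nli

    def score(self, context: str, claims: list[str]) -> float:
        if not claims:
            return 0.0
        # Fraction of claims entailed by the context.
        return sum(self.nli.entailment_score(context, c) for c in claims) / len(claims)


metric = FaithfulnessMetric(FakeNLIModel())
print(metric.score("The sky is blue. Grass is green.",
                   ["The sky is blue.", "The moon is cheese."]))  # prints 0.5
```

The design point is that swapping Bespoke-MiniCheck for another NLI model (e.g. Vectara HHEM) means providing a different component, with no change to the metric code.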