Kalyan Chakravarthy Thadaka
This implementation involves comparing the ground truth vs. expected result and the ground truth vs. actual result, where the actual result is derived from a perturbed version of the original...
### Feature request
Implement a new feature to support a pipeline that takes both an image and text as inputs and produces a text output. This would be particularly...
This pull request introduces several changes to the `langtest` module, primarily focusing on enhancing functionality and improving code structure. The most important changes include the addition of dialogue-related columns, the...
**Current State** The `langtest` repository currently uses `pydantic.v1.BaseModel` from Pydantic v1 across its sample classes for data modeling and validation. With the release of Pydantic v2, several API changes and...
**Description:** This issue aims to integrate the **MTS-Dialog** dataset into the LangTest framework, enabling clinical summarization evaluation. The goal is to support structured, medically accurate summarization assessments using this domain-specific...
**Background**: [MLCommons](https://ailuminate.mlcommons.org/benchmarks/) is a global AI engineering consortium that focuses on improving **accuracy**, **safety**, **speed**, and **efficiency** of AI systems through open collaboration and standardized benchmarks. Their mission includes...
* reforms the architecture
* introduces a modular approach