Implement MTS-Dialog-Based Clinical Summary Evaluation

Open chakravarthik27 opened this issue 7 months ago • 0 comments

Description:
This issue proposes integrating the MTS-Dialog dataset into the LangTest framework to enable clinical summarization evaluation. The goal is to assess whether generated summaries are well structured and medically accurate, using this domain-specific benchmark.

Tasks:

  • Add a data loader/parser for the MTS-Dialog dataset.
  • Map MTS-Dialog fields to LangTest's summarization task schema.
  • Implement support for evaluating structured summaries (e.g., SOAP/EMR format).
  • Ensure alignment with evaluation criteria such as factual completeness, hallucination detection, and clinical relevance.
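
As a starting point for the first two tasks, here is a minimal sketch of a loader that reads MTS-Dialog rows and maps them to a summarization record. It assumes the dataset is distributed as CSV with the columns `ID`, `section_header`, `section_text`, and `dialogue`; check this against the actual files. `SummarizationRecord`, `parse_mts_dialog`, and `load_mts_dialog` are hypothetical names used for illustration and are not part of LangTest's existing API.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class SummarizationRecord:
    """Hypothetical stand-in for a LangTest summarization sample."""
    sample_id: str
    section: str    # section header code from the dataset, upper-cased
    document: str   # doctor-patient dialogue to be summarized
    reference: str  # reference clinical note section (the target summary)


def parse_mts_dialog(lines: Iterable[str]) -> Iterator[SummarizationRecord]:
    """Yield records from CSV text, skipping rows without a dialogue or summary.

    Column names are an assumption about the MTS-Dialog CSV layout.
    """
    reader = csv.DictReader(lines)
    for row in reader:
        dialogue = (row.get("dialogue") or "").strip()
        summary = (row.get("section_text") or "").strip()
        if not dialogue or not summary:
            continue  # an incomplete row cannot be scored against a reference
        yield SummarizationRecord(
            sample_id=(row.get("ID") or "").strip(),
            section=(row.get("section_header") or "").strip().upper(),
            document=dialogue,
            reference=summary,
        )


def load_mts_dialog(path: str) -> List[SummarizationRecord]:
    """Read a CSV file from disk and return all usable records."""
    # newline="" lets the csv module handle line breaks inside quoted dialogues
    with open(path, newline="", encoding="utf-8") as handle:
        return list(parse_mts_dialog(handle))
```

Keeping the parser separate from file I/O lets it be unit-tested on in-memory text before it is wired into whatever dataset interface LangTest ends up exposing.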

Acceptance Criteria:

  • LangTest can load and process MTS-Dialog samples.
  • Evaluation metrics specific to clinical summarization are supported.
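
To make the structured-summary criterion concrete, the sketch below shows one possible structural check: which SOAP headings appear in a generated summary. This is only a naive illustration. It assumes each section starts on its own line as `Heading:`, and it says nothing about factual completeness, hallucination, or clinical relevance, which would need dedicated metrics. `soap_section_coverage` and `coverage_score` are hypothetical helpers, not existing LangTest functions.

```python
import re
from typing import Dict

# Section names of the SOAP note format.
SOAP_SECTIONS = ("subjective", "objective", "assessment", "plan")


def soap_section_coverage(summary: str) -> Dict[str, bool]:
    """Report, per SOAP section, whether a 'Heading:' line is present."""
    found = {}
    for name in SOAP_SECTIONS:
        # Match the heading at the start of any line, case-insensitively.
        pattern = rf"^\s*{name}\s*:"
        match = re.search(pattern, summary, flags=re.IGNORECASE | re.MULTILINE)
        found[name] = match is not None
    return found


def coverage_score(summary: str) -> float:
    """Fraction of the four SOAP sections that have a heading in the summary."""
    flags = soap_section_coverage(summary)
    return sum(flags.values()) / len(flags)
```

A score below 1.0 only signals a missing heading; a summary can contain all four headings and still be clinically wrong, so this check would complement content-level metrics rather than replace them.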

chakravarthik27, Apr 08 '25 06:04