langkit
SWE-Agent: Implement the hallucinations metric from the main branch in the workflow branch
A lot has changed between the main branch and the workflow branch. The workflow branch reimplements all of the metrics from the main branch using a different, Workflow- and Metric-based interface. One metric that has not been ported over yet is Hallucination:
- https://github.com/whylabs/langkit/blob/main/langkit/response_hallucination.py
This needs to be ported to the workflow branch and implemented like the other metrics in the workflow branch. Some examples of how that looks:
- text stat metric: https://github.com/whylabs/langkit/blob/workflow/langkit/metrics/text_statistics.py
- pii metric: https://github.com/whylabs/langkit/blob/workflow/langkit/metrics/pii.py
The metrics Python module contains many other metrics that can serve as references too.
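
To make the target shape concrete, here is a rough, self-contained sketch of what a ported metric could look like. It does not use the real workflow-branch types: `SketchMetric`, `hallucination_metric`, `sample_fn`, and the metric name `response.hallucination` are all hypothetical placeholders, and the token-overlap score is only a cheap stand-in for whatever consistency check the main-branch module performs. The real port should follow the interface used in text_statistics.py and pii.py.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class SketchMetric:
    """Hypothetical stand-in for the workflow branch's metric type."""

    name: str
    input_names: Sequence[str]
    evaluate: Callable[[Dict[str, List[str]]], List[float]]


def _jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a placeholder for a real similarity model."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def hallucination_metric(sample_fn: Callable[[str], List[str]]) -> SketchMetric:
    """Build a metric that scores each response against extra samples for its prompt.

    sample_fn is an assumed hook that returns additional answers for a prompt
    (for example, from repeated LLM calls). It is not part of langkit.
    """

    def evaluate(columns: Dict[str, List[str]]) -> List[float]:
        scores: List[float] = []
        for prompt, response in zip(columns["prompt"], columns["response"]):
            samples = sample_fn(prompt)
            if not samples:
                # Nothing to compare against, so no score can be computed.
                scores.append(float("nan"))
                continue
            agreement = sum(_jaccard(response, s) for s in samples) / len(samples)
            # Low agreement with the extra samples means a higher score.
            scores.append(1.0 - agreement)
        return scores

    return SketchMetric(
        name="response.hallucination",  # assumed name, not confirmed by the workflow branch
        input_names=("prompt", "response"),
        evaluate=evaluate,
    )
```

With a `sample_fn` that always returns the same answer, a response matching that answer scores 0.0 and a response sharing no tokens with it scores 1.0. Wrapping the logic in a factory function keeps the sampling dependency injectable, which should make the ported metric easier to test without calling a model.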