
The LLM Evaluation Framework

49 deepeval issues

**Describe the bug**
RagasMetric breaks when used with a dataset.

**To Reproduce**
`dataset.evaluate([RagasMetric])`

Errors:

```
TypeError: RagasMetric.a_measure() got an unexpected keyword argument '_show_indicator'
```

and another:

```
    self.llm.set_run_config(run_config)
    ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'GPTModel' object has no...
```
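A minimal reproduction sketch, assuming deepeval's `EvaluationDataset` API; the test case content and the `RagasMetric` import path are assumptions and may vary by version:

```python
from deepeval.dataset import EvaluationDataset
from deepeval.metrics.ragas import RagasMetric  # import path may differ by version
from deepeval.test_case import LLMTestCase

# Hypothetical test case; any content should hit the same code path.
test_case = LLMTestCase(
    input="What is the capital of France?",
    actual_output="Paris.",
    expected_output="Paris",
    retrieval_context=["Paris is the capital of France."],
)

dataset = EvaluationDataset(test_cases=[test_case])
# This call raises the TypeError/AttributeError quoted above.
dataset.evaluate([RagasMetric(threshold=0.5)])
```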

**Is your feature request related to a problem? Please describe.** I'm trying to evaluate a local LLM loaded with Exllamav2 using deepeval's support for the [MMLU dataset](https://docs.confident-ai.com/docs/benchmarks-mmlu). Unfortunately the current...
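A sketch of how a local model could be run against the MMLU benchmark, assuming deepeval's `DeepEvalBaseLLM` wrapper and `MMLU` benchmark APIs; the Exllamav2 loading and generation calls are hypothetical placeholders:

```python
from deepeval.benchmarks import MMLU
from deepeval.models import DeepEvalBaseLLM  # import path may differ by version

class ExllamaV2Model(DeepEvalBaseLLM):
    """Hypothetical wrapper around a model loaded with Exllamav2."""

    def __init__(self, model):
        self.model = model

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        # Replace with the real Exllamav2 generation call.
        return self.model.generate(prompt)

    async def a_generate(self, prompt: str) -> str:
        return self.generate(prompt)

    def get_model_name(self) -> str:
        return "ExllamaV2 local model"

benchmark = MMLU()
benchmark.evaluate(model=ExllamaV2Model(my_exllamav2_model))  # my_exllamav2_model is hypothetical
print(benchmark.overall_score)
```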

There is a lot of friction in going from generating goldens to loading them as test cases for evaluation.
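For illustration, a sketch of the manual glue this workflow currently seems to require, assuming goldens expose `input`, `expected_output`, and `context`; `my_llm_app` and the document path are hypothetical:

```python
from deepeval.dataset import EvaluationDataset
from deepeval.test_case import LLMTestCase

dataset = EvaluationDataset()
dataset.generate_goldens_from_docs(document_paths=["docs/handbook.pdf"])  # hypothetical path

# Each golden has to be converted into a runnable test case by hand.
for golden in dataset.goldens:
    dataset.add_test_case(
        LLMTestCase(
            input=golden.input,
            actual_output=my_llm_app(golden.input),  # hypothetical application under test
            expected_output=golden.expected_output,
            context=golden.context,
        )
    )
```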

- Implemented YAML-based configuration loading for evaluation settings.
- Added files:
  - `deepeval/metrics/registry.py` to map metric names to class objects
  - `deepeval/metrics/loader.py` to load metrics from YAML and initialize -...
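A minimal sketch of how such a registry and loader might fit together; the module names come from the PR description, but the contents and the YAML schema shown here are assumptions:

```python
# deepeval/metrics/registry.py (sketch)
from deepeval.metrics import AnswerRelevancyMetric, HallucinationMetric

METRIC_REGISTRY = {
    "answer_relevancy": AnswerRelevancyMetric,
    "hallucination": HallucinationMetric,
}

# deepeval/metrics/loader.py (sketch)
import yaml

def load_metrics(path: str) -> list:
    """Instantiate metrics from a YAML file mapping metric names to kwargs.

    Example config:
        metrics:
          answer_relevancy:
            threshold: 0.7
          hallucination: null
    """
    with open(path) as f:
        config = yaml.safe_load(f)
    return [
        METRIC_REGISTRY[name](**(kwargs or {}))
        for name, kwargs in config["metrics"].items()
    ]
```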

See this code:

![Screenshot 2024-05-06 192539](https://github.com/confident-ai/deepeval/assets/108796323/51180459-ea64-44f3-bd18-b44b0b5c2c08)
![Screenshot 2024-05-06 192553](https://github.com/confident-ai/deepeval/assets/108796323/b6158846-5da7-485f-bc5c-4d8a7270616a)

```python
import json
import asyncio
from deepeval.metrics import AnswerRelevancyMetric, SummarizationMetric, HallucinationMetric
from deepeval.test_case import LLMTestCase
from deepeval import assert_test
import...
```

It's my belief that the cache can be dramatically simplified, and made more reliable, by using the Python "diskcache" library. DiskCache handles so much:

- locking
- reliability in the...
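For context, a small sketch of the diskcache API this suggestion has in mind; the cache directory, key scheme, and evaluation call are placeholders:

```python
from diskcache import Cache

# A process-safe, thread-safe cache backed by SQLite on disk.
cache = Cache(".deepeval_cache")

key = "metric:answer_relevancy:test-case-hash"  # hypothetical key scheme
result = cache.get(key)
if result is None:
    result = run_expensive_evaluation()  # hypothetical expensive call
    cache.set(key, result, expire=24 * 60 * 60)  # expire after one day
```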

When I follow the example on this page: https://docs.confident-ai.com/docs/metrics-introduction and try to use Mistral-7B as the evaluation model, I always get this error when running the exact code from the tutorial. **It...
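For reference, the custom-model pattern that docs page describes looks roughly like this when sketched with a Hugging Face Transformers Mistral-7B; the model ID and generation parameters here are assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from deepeval.models import DeepEvalBaseLLM  # import path may differ by version

class Mistral7B(DeepEvalBaseLLM):
    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer

    def load_model(self):
        return self.model

    def generate(self, prompt: str) -> str:
        inputs = self.tokenizer([prompt], return_tensors="pt").to(self.model.device)
        outputs = self.model.generate(**inputs, max_new_tokens=100)
        return self.tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

    async def a_generate(self, prompt: str) -> str:
        return self.generate(prompt)

    def get_model_name(self) -> str:
        return "Mistral 7B"

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")  # assumed model ID
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
evaluation_model = Mistral7B(model, tokenizer)
```

A metric would then be pointed at the wrapper, e.g. `AnswerRelevancyMetric(model=evaluation_model)`.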

The `deepeval` CLI command currently always exits with exit code 0, even if the tests fail, which makes it hard to handle failed tests in automated workflows/pipelines. For example...
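Until the CLI propagates failures, one possible workaround is to drive the metrics from a plain Python script and set the exit code explicitly; `measure()` and `is_successful()` are documented metric methods, while the test cases below are placeholders:

```python
import sys

from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

test_cases = [
    LLMTestCase(input="...", actual_output="..."),  # placeholder content
]

metric = AnswerRelevancyMetric(threshold=0.7)
any_failed = False
for test_case in test_cases:
    metric.measure(test_case)
    if not metric.is_successful():
        any_failed = True

# Propagate failure to the CI pipeline explicitly.
sys.exit(1 if any_failed else 0)
```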