langfuse-python feat: Add comment field to langfuse Evaluation from autoevals metadata

This PR enhances the create_evaluator_from_autoevals function to properly propagate the comment field from the autoevals evaluator's metadata to the Langfuse Evaluation object.

Previously, comments in autoevals metadata were not mapped to the Langfuse Evaluation object. With this update they are surfaced in Langfuse, so the UI shows a comment icon next to the score value and each evaluation result carries its own context.

Before: Screenshot 2025-11-25 at 10 16 22 PM

After: Screenshot 2025-11-25 at 10 16 06 PM

[!IMPORTANT] Enhances create_evaluator_from_autoevals to include comment from autoevals metadata in Evaluation, improving Langfuse UI feedback.

Behavior:

create_evaluator_from_autoevals in langfuse/experiment.py now includes comment from autoevals metadata in Evaluation.

Displays comment icon next to score in Langfuse UI for contextual feedback.

This description was created for ee71b328254601b21a76733e1ce675ea200a8f05.

Disclaimer: Experimental PR review

Greptile Overview

Greptile Summary

This PR extracts the comment field from autoevals evaluator metadata and passes it to the Langfuse Evaluation object, enabling comment display in the Langfuse UI.

Key changes:

Modified create_evaluator_from_autoevals in langfuse/experiment.py:1044 to extract comment using evaluation.metadata.get("comment")
Maintained backward compatibility by keeping metadata field intact
Enables UI to display comment icon next to score values
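The mapping described in the key changes can be sketched as follows. This is a minimal illustration, not the code from langfuse/experiment.py: `Evaluation` and `FakeScore` are stand-in dataclasses defined here so the snippet runs on its own, and `create_evaluator_sketch` and `fake_evaluator` are hypothetical names. Only the `metadata.get("comment")` lookup and the `name`, `value`, `comment`, `metadata` fields come from the PR description.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Evaluation:
    """Stand-in for the Langfuse Evaluation object (illustrative only)."""
    name: str
    value: Any
    comment: Optional[str] = None
    metadata: Optional[Dict[str, Any]] = None


@dataclass
class FakeScore:
    """Stand-in for an autoevals result exposing name, score, and metadata."""
    name: str
    score: float
    metadata: Dict[str, Any]


def create_evaluator_sketch(
    autoevals_evaluator: Callable[..., FakeScore],
) -> Callable[..., Evaluation]:
    """Wrap an autoevals-style evaluator so it returns an Evaluation."""

    def langfuse_evaluator(
        *, input: Any, output: Any, expected_output: Any, **kwargs: Any
    ) -> Evaluation:
        evaluation = autoevals_evaluator(
            input=input, output=output, expected=expected_output
        )
        return Evaluation(
            name=evaluation.name,
            value=evaluation.score,
            # The change in this PR: surface the comment as its own field.
            comment=evaluation.metadata.get("comment"),
            # The full metadata is still passed through unchanged.
            metadata=evaluation.metadata,
        )

    return langfuse_evaluator


def fake_evaluator(*, input: Any, output: Any, expected: Any) -> FakeScore:
    """Hypothetical exact-match evaluator used only to exercise the wrapper."""
    matched = output == expected
    return FakeScore(
        name="ExactMatch",
        score=1.0 if matched else 0.0,
        metadata={"comment": "outputs match" if matched else "outputs differ"},
    )


result = create_evaluator_sketch(fake_evaluator)(
    input="2+2", output="4", expected_output="4"
)
```

Here `result.comment` holds the evaluator's comment while `result.metadata` still contains the original dictionary, which is what keeps the change backward compatible.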

Issue found:

Potential AttributeError if evaluation.metadata is None (though autoevals typically guarantees a dict)

Confidence Score: 4/5

This PR is safe to merge with minimal risk
The change is straightforward and adds useful functionality. One edge case remains: if evaluation.metadata is None, calling .get on it raises an AttributeError, though the autoevals library typically guarantees that metadata is a dict. The change is minimal and well-scoped.
No files require special attention beyond reviewing the suggested null-safety improvement
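One way the null-safety concern could be addressed is a small guard around the lookup. The helper below is a sketch under that assumption; `extract_comment` is a hypothetical name and is not part of the PR or of the Langfuse SDK.

```python
from typing import Any, Mapping, Optional


def extract_comment(metadata: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Return metadata["comment"] if present, tolerating metadata=None."""
    if not metadata:
        # Covers both None and an empty mapping.
        return None
    return metadata.get("comment")
```

With this guard, a missing or empty metadata value yields `None` for the comment instead of raising.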

Important Files Changed

File Analysis

Filename	Score	Overview
langfuse/experiment.py	4/5	Added extraction of the `comment` field from autoevals metadata into the Langfuse Evaluation object. Minor risk if `evaluation.metadata` is None, though autoevals typically guarantees a dict.

Sequence Diagram

sequenceDiagram
    participant User
    participant create_evaluator_from_autoevals
    participant langfuse_evaluator
    participant autoevals_evaluator
    participant Evaluation

    User->>create_evaluator_from_autoevals: Call with autoevals_evaluator
    create_evaluator_from_autoevals->>langfuse_evaluator: Return wrapper function
    User->>langfuse_evaluator: Call with input, output, expected_output
    langfuse_evaluator->>autoevals_evaluator: Evaluate (input, output, expected)
    autoevals_evaluator-->>langfuse_evaluator: Return evaluation result (name, score, metadata)
    langfuse_evaluator->>langfuse_evaluator: Extract comment from metadata.get("comment")
    langfuse_evaluator->>Evaluation: Create with name, value, comment, metadata
    Evaluation-->>langfuse_evaluator: Return Evaluation object
    langfuse_evaluator-->>User: Return Evaluation

Nov 26 '25 03:11 rmaceissoft

All committers have signed the CLA.

Nov 26 '25 03:11 CLAassistant

Thanks a lot for your contribution @rmaceissoft !

Dec 01 '25 14:12 hassiebp