[issue-3764] [P SDK] [FE] [BE] [Docs] Introduce experiment scores

Open jverre opened this issue 2 months ago • 30 comments

Details

Introduces experiment scores, which let you log experiment-level scores computed from the experiment's results. This allows you to log metrics like f1-score, recall or last, for example.

from typing import List
from opik.evaluation import evaluate, test_result
from opik.evaluation.metrics import Hallucination, score_result

# Define an experiment score function
def compute_hallucination_max(
    test_results: List[test_result.TestResult],
) -> List[score_result.ScoreResult]:
    """Compute the maximum hallucination score across all test results."""
    # Assumes the hallucination metric is the first (here, only) scoring metric.
    hallucination_scores = [
        result.score_results[0].value
        for result in test_results
        if result.score_results
    ]

    if not hallucination_scores:
        return []
    return [
        score_result.ScoreResult(
            name="hallucination_metric (max)",
            value=max(hallucination_scores),
            reason=f"Maximum hallucination score across {len(hallucination_scores)} test cases"
        )
    ]

# Run evaluation with experiment scores
# (`dataset` and `evaluation_task` are assumed to be defined earlier)
evaluation = evaluate(
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[Hallucination()],
    experiment_scores=[compute_hallucination_max],
    experiment_name="My experiment"
)

# Access experiment scores from the result
print(f"Experiment scores: {evaluation.experiment_scores}")
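
The example above reads `score_results[0]`, which depends on the order of the scoring metrics. As a hedged sketch, an experiment score function could instead select scores by name. The `FakeScoreResult` and `FakeTestResult` classes and the `make_mean_score` helper below are hypothetical stand-ins written for illustration, not part of the opik API; they only mirror the `name`, `value`, `reason` and `score_results` fields used in the example above so that the snippet runs without opik installed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


# Hypothetical stand-ins for the SDK's ScoreResult and TestResult types,
# reduced to the fields the PR example relies on.
@dataclass
class FakeScoreResult:
    name: str
    value: float
    reason: Optional[str] = None


@dataclass
class FakeTestResult:
    score_results: List[FakeScoreResult] = field(default_factory=list)


def make_mean_score(
    metric_name: str,
) -> Callable[[List[FakeTestResult]], List[FakeScoreResult]]:
    """Build a function that averages one named metric over all test results."""

    def compute_mean(test_results: List[FakeTestResult]) -> List[FakeScoreResult]:
        # Collect every score whose name matches, regardless of its position.
        values = [
            score.value
            for result in test_results
            for score in result.score_results
            if score.name == metric_name
        ]
        if not values:
            # No matching scores: report nothing rather than a misleading 0.
            return []
        return [
            FakeScoreResult(
                name=f"{metric_name} (mean)",
                value=sum(values) / len(values),
                reason=f"Mean of {metric_name} across {len(values)} test cases",
            )
        ]

    return compute_mean
```

Under these assumptions, the returned function has the same shape as `compute_hallucination_max` (a list of test results in, a list of score results out), so the same pattern should carry over to the real types; that has not been verified against the SDK here.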

In the FE, the following places have been updated:

  1. Evaluation table in home page
  2. Experiment list page: the chart and table were updated, with special care taken to support groups and sorting
  3. Single experiment page: the tags at the top of the page and the feedback scores table were updated

The documentation was also updated to include this feature.

Change checklist

  • [x] User facing
  • [x] Documentation update

Issues

  • Resolves #3764
  • OPIK-2884

Testing

SDK and BE tests were added. Manual testing was also completed.

Documentation

Documentation was updated

jverre avatar Nov 07 '25 18:11 jverre

📋 PR Linter Failed

❌ Invalid Title Format. Your PR title must include a ticket/issue number and may optionally include component tags ([FE], [BE], etc.).

  • Internal contributors: Open a JIRA ticket and link to it: [OPIK-xxxx] or [CUST-xxxx] or [DEV-xxxx] [COMPONENT] Your change
  • External contributors: Open a GitHub Issue and link to it via its number: [issue-xxxx] [COMPONENT] Your change
  • No ticket: Use [NA] [COMPONENT] Your change (Issues section not required)

Example: [issue-3108] [BE] [FE] Fix authentication bug or [OPIK-1234] Fix bug or [NA] Update README

github-actions[bot] avatar Nov 07 '25 18:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel ❌ Broken link: https://docs.gretel.ai/create-synthetic-data/gretel-data-designer/ (404)

Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel ❌ Broken link: https://docs.gretel.ai/ (404)

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel/ ❌ Broken link: https://docs.gretel.ai/ (404)

Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel/ ❌ Broken link: https://docs.gretel.ai/create-synthetic-data/gretel-data-designer/ (404)

github-actions[bot] avatar Nov 07 '25 18:11 github-actions[bot]

SDK E2E Tests Results

108 tests   107 ✅   5m 33s ⏱️
  1 suites    0 💤
  1 files      1 ❌

For more details on these failures, see this check.

Results for commit 035b08b0.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Nov 07 '25 18:11 github-actions[bot]

Backend Tests Results

  351 files  ±    0    351 suites  ±0   55m 48s ⏱️ + 6m 34s
5 884 tests  +   13  5 877 ✅ +   13  7 💤 ±0  0 ❌ ±0
5 857 runs   +1 184  5 850 ✅ +1 184  7 💤 ±0  0 ❌ ±0

Results for commit 05cfbe19. ± Comparison against base commit 0ee8b93f.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Nov 07 '25 18:11 github-actions[bot]

🔄 Test environment deployment started

Building images for PR #3989...

You can monitor the build progress here.

github-actions[bot] avatar Nov 20 '25 14:11 github-actions[bot]

SDK Unit Tests Results

0 tests   0 ✅   0s ⏱️
0 suites  0 💤
0 files   0 ❌

Results for commit be644463.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Nov 20 '25 14:11 github-actions[bot]

✅ Test environment is now available!

Access Information

  • URL: https://pr-3989.dev.comet.com
  • Cluster: comet-ml-development
  • Namespace: pr-3989
  • Version: 1.9.25-3989-merge-471
  • Application logs: View in Grafana

The deployment has completed successfully and the version has been verified.

CometActions avatar Nov 20 '25 14:11 CometActions

🌿 Preview your docs: https://opik-preview-95e6e65d-fc11-4eec-afca-68581a05c3df.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: https://opik-preview-95e6e65d-fc11-4eec-afca-68581a05c3df.docs.buildwithfern.com/docs/opik/integrations/openrouter ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

Page: https://opik-preview-95e6e65d-fc11-4eec-afca-68581a05c3df.docs.buildwithfern.com/docs/opik/integrations/openrouter/ ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

github-actions[bot] avatar Nov 24 '25 12:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-e70c5165-2cb7-4e98-937d-ec7138719984.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: https://opik-preview-e70c5165-2cb7-4e98-937d-ec7138719984.docs.buildwithfern.com/docs/opik/integrations/openrouter ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

Page: https://opik-preview-e70c5165-2cb7-4e98-937d-ec7138719984.docs.buildwithfern.com/docs/opik/integrations/openrouter/ ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

github-actions[bot] avatar Nov 24 '25 14:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-7584f444-cd0b-4d23-bb0b-f50bc1048f1d.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: https://opik-preview-7584f444-cd0b-4d23-bb0b-f50bc1048f1d.docs.buildwithfern.com/docs/opik/integrations/openrouter ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

Page: https://opik-preview-7584f444-cd0b-4d23-bb0b-f50bc1048f1d.docs.buildwithfern.com/docs/opik/integrations/openrouter/ ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

github-actions[bot] avatar Nov 24 '25 14:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-62c33358-d2d6-4259-99da-7d0c8a6477ab.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: https://opik-preview-62c33358-d2d6-4259-99da-7d0c8a6477ab.docs.buildwithfern.com/docs/opik/integrations/openrouter ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

Page: https://opik-preview-62c33358-d2d6-4259-99da-7d0c8a6477ab.docs.buildwithfern.com/docs/opik/integrations/openrouter/ ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

github-actions[bot] avatar Nov 24 '25 17:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-140dcb73-c2d5-4c78-ac19-4854cad0760c.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: https://opik-preview-140dcb73-c2d5-4c78-ac19-4854cad0760c.docs.buildwithfern.com/docs/opik/integrations/openrouter ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

Page: https://opik-preview-140dcb73-c2d5-4c78-ac19-4854cad0760c.docs.buildwithfern.com/docs/opik/integrations/openrouter/ ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

github-actions[bot] avatar Nov 24 '25 18:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-0ecc11c1-6d80-4c10-bea6-df842e35f287.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: https://opik-preview-0ecc11c1-6d80-4c10-bea6-df842e35f287.docs.buildwithfern.com/docs/opik/integrations/openrouter ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

Page: https://opik-preview-0ecc11c1-6d80-4c10-bea6-df842e35f287.docs.buildwithfern.com/docs/opik/integrations/openrouter/ ❌ Broken link: https://openrouter.ai/docs/features/structured-outputs (404)

github-actions[bot] avatar Nov 24 '25 20:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-4c23b1ab-31c8-45d2-92b5-2d2de9b5a711.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 07:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-63e0d6a8-faf3-482f-952a-d82e8a8a7539.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 09:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-1bafdf14-4fb7-4a35-ba18-605817e5f2fd.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 10:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-f0cedb24-2a41-4fd6-82a7-c350acb4111b.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 10:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-da8ebe70-9eac-40c9-a55c-594d9b28a80c.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

github-actions[bot] avatar Nov 27 '25 10:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-512fe502-b15c-45f9-929e-aac86d178192.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 10:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-383407fc-4d2f-4904-b955-9e0dddf8078c.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 12:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-3b06baff-eb81-4729-a713-8c052723751f.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

github-actions[bot] avatar Nov 27 '25 12:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-c275c1af-84fa-4f0d-9799-0547cc26366f.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 12:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-e51eaae6-223e-4ced-bf0a-a2c8aa653b7d.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 12:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-e07dcb06-a502-4959-bfec-622888071661.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

github-actions[bot] avatar Nov 27 '25 12:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-b8fcf5f2-022e-4272-b45e-777cae5bd233.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 12:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-05c6ebd8-9a01-452e-a1c1-ddc45af84595.docs.buildwithfern.com/docs/opik

The following broken links were found:

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

Page: ❌ Broken link: {} ()

Page: ❌ Broken link: ] ()

github-actions[bot] avatar Nov 27 '25 13:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-60ee643e-4042-409c-91b3-31aac10837e0.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 13:11 github-actions[bot]

🌿 Preview your docs: https://opik-preview-ef03bf0c-39ed-4a08-83f7-b25966327d41.docs.buildwithfern.com/docs/opik

No broken links found

github-actions[bot] avatar Nov 27 '25 14:11 github-actions[bot]

🔄 Test environment deployment started

Building images for PR #3989...

You can monitor the build progress here.

github-actions[bot] avatar Nov 27 '25 15:11 github-actions[bot]

✅ Test environment is now available!

Access Information

  • URL: https://pr-3989.dev.comet.com
  • Cluster: comet-ml-development
  • Namespace: pr-3989
  • Version: 1.9.32-3989-merge-562
  • Application logs: View in Grafana

The deployment has completed successfully and the version has been verified.

CometActions avatar Nov 27 '25 15:11 CometActions