[issue-3764] [P SDK] [FE] [BE] [Docs] Introduce experiment scores
Details
Introduces the concept of experiment scores, which let you log experiment-level scores computed from the experiment results. This allows you to log metrics like f1-score, recall or last, for example.
from typing import List

from opik.evaluation import evaluate, test_result
from opik.evaluation.metrics import Hallucination, score_result


# Define an experiment score function
def compute_hallucination_max(
    test_results: List[test_result.TestResult],
) -> List[score_result.ScoreResult]:
    """Compute the maximum hallucination score across all test results."""
    # Hallucination is the only scoring metric here, so its value is the
    # first score result of each test case
    hallucination_scores = [
        result.score_results[0].value
        for result in test_results
        if result.score_results
    ]
    if not hallucination_scores:
        return []
    return [
        score_result.ScoreResult(
            name="hallucination_metric (max)",
            value=max(hallucination_scores),
            reason=f"Maximum hallucination score across {len(hallucination_scores)} test cases",
        )
    ]


# Run evaluation with experiment scores
# (dataset and evaluation_task are assumed to be defined elsewhere)
evaluation = evaluate(
    dataset=dataset,
    task=evaluation_task,
    scoring_metrics=[Hallucination()],
    experiment_scores=[compute_hallucination_max],
    experiment_name="My experiment",
)

# Access experiment scores from the result
print(f"Experiment scores: {evaluation.experiment_scores}")
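The example above aggregates a single metric by position. As a rough sketch of the same idea for several metrics, the function below groups per-item scores by name and returns one mean per metric. `StubScoreResult` and `StubTestResult` are hypothetical stand-ins that mirror only the fields used in the snippet above (`score_results`, `name`, `value`, `reason`), so the sketch runs without opik installed; `compute_mean_per_metric` is likewise an illustrative name, not part of the SDK.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


# Hypothetical stand-ins for opik's ScoreResult and TestResult, reduced to
# the fields the PR snippet relies on.
@dataclass
class StubScoreResult:
    name: str
    value: float
    reason: Optional[str] = None


@dataclass
class StubTestResult:
    score_results: List[StubScoreResult]


def compute_mean_per_metric(
    test_results: List[StubTestResult],
) -> List[StubScoreResult]:
    """Return one experiment-level mean score for each per-item metric name."""
    values_by_name: Dict[str, List[float]] = defaultdict(list)
    for result in test_results:
        for score in result.score_results:
            values_by_name[score.name].append(score.value)

    return [
        StubScoreResult(
            name=f"{name} (mean)",
            value=mean(values),
            reason=f"Mean of {len(values)} test cases",
        )
        for name, values in values_by_name.items()
    ]


results = [
    StubTestResult([StubScoreResult("hallucination_metric", 0.0)]),
    StubTestResult([StubScoreResult("hallucination_metric", 1.0)]),
    StubTestResult([]),  # a test case without scores contributes nothing
]
experiment_scores = compute_mean_per_metric(results)
print(experiment_scores[0].name, experiment_scores[0].value)
```

Grouping by metric name rather than by position keeps the aggregation stable if more scoring metrics are added later. With the real SDK types, a function of this shape would presumably be passed through the `experiment_scores` argument in the same way as `compute_hallucination_max`.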
In the FE, the following places have been updated:
- Evaluation table on the home page
- Experiment list page: the chart and table were updated, with special care taken to support groups and sorting
- Single experiment page: the tags at the top of the page and the feedback scores table were updated
The documentation was also updated to include this feature.
Change checklist
- [x] User facing
- [x] Documentation update
Issues
- Resolves #3764
- OPIK-2884
Testing
SDK and BE tests were added. Manual testing was also completed.
Documentation
Documentation was updated.
PR Linter Failed
Invalid Title Format. Your PR title must include a ticket/issue number and may optionally include component tags ([FE], [BE], etc.).
- Internal contributors: Open a JIRA ticket and link to it: [OPIK-xxxx] or [CUST-xxxx] or [DEV-xxxx] [COMPONENT] Your change
- External contributors: Open a Github Issue and link to it via its number: [issue-xxxx] [COMPONENT] Your change
- No ticket: Use [NA] [COMPONENT] Your change (Issues section not required)
Example: [issue-3108] [BE] [FE] Fix authentication bug or [OPIK-1234] Fix bug or [NA] Update README
Preview your docs: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik
The following broken links were found:
Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel -> Broken link: https://docs.gretel.ai/create-synthetic-data/gretel-data-designer/ (404)
Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel -> Broken link: https://docs.gretel.ai/ (404)
Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel/ -> Broken link: https://docs.gretel.ai/ (404)
Page: https://opik-preview-06b36964-db36-4532-ab53-84f307f90b51.docs.buildwithfern.com/docs/opik/integrations/gretel/ -> Broken link: https://docs.gretel.ai/create-synthetic-data/gretel-data-designer/ (404)
SDK E2E Tests Results
108 tests: 107 passed, 0 skipped, 1 failed
1 suite, 1 file, duration 5m 33s
For more details on these failures, see this check.
Results for commit 035b08b0.
:recycle: This comment has been updated with latest results.
Backend Tests Results
351 files (±0), 351 suites (±0), duration 55m 48s (+6m 34s)
5,884 tests (+13): 5,877 passed (+13), 7 skipped (±0), 0 failed (±0)
5,857 runs (+1,184): 5,850 passed (+1,184), 7 skipped (±0), 0 failed (±0)
Results for commit 05cfbe19. ± Comparison against base commit 0ee8b93f.
:recycle: This comment has been updated with latest results.
Test environment deployment started
Building images for PR #3989...
You can monitor the build progress here.
SDK Unit Tests Results
0 tests: 0 passed, 0 skipped, 0 failed
0 suites, 0 files, duration 0s
Results for commit be644463.
:recycle: This comment has been updated with latest results.
Test environment is now available!
Access Information
- URL: https://pr-3989.dev.comet.com
- Cluster: comet-ml-development
- Namespace: pr-3989
- Version: 1.9.25-3989-merge-471
- Application logs: View in Grafana
The deployment has completed successfully and the version has been verified.
Test environment deployment started
Building images for PR #3989...
You can monitor the build progress here.
Test environment is now available!
Access Information
- URL: https://pr-3989.dev.comet.com
- Cluster: comet-ml-development
- Namespace: pr-3989
- Version: 1.9.32-3989-merge-562
- Application logs: View in Grafana
The deployment has completed successfully and the version has been verified.