feat(catalog): Performance recommendations endpoint

Open pboyd opened this issue 3 months ago • 1 comments

Description

Implements a new GET /api/model_catalog/v1alpha1/sources/{source_id}/models/{model_name}/artifacts/performance endpoint that returns a model's performance metrics artifacts, optionally filtered down to Pareto-optimal configurations.
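For readers unfamiliar with the term, a Pareto-optimal configuration is one that no other configuration beats on every objective at once. The sketch below illustrates the idea only; the PR text does not say which objectives the endpoint compares, so the `Config` type, its two objectives (total GPUs and latency, both minimized), and the sample values are all hypothetical.

```python
from typing import NamedTuple


class Config(NamedTuple):
    hardware_type: str
    total_gpus: int      # hypothetical cost objective (lower is better)
    latency_ms: float    # hypothetical latency objective (lower is better)


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.total_gpus <= b.total_gpus and a.latency_ms <= b.latency_ms
    strictly_better = a.total_gpus < b.total_gpus or a.latency_ms < b.latency_ms
    return no_worse and strictly_better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep every configuration that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up candidates, not data from the catalog.
candidates = [
    Config("gpu-a", 2, 70.0),
    Config("gpu-a", 4, 60.0),
    Config("gpu-b", 4, 65.0),   # dominated: same GPU count as the 60 ms option, but slower
    Config("gpu-b", 8, 30.0),
]
front = pareto_front(candidates)
```

The quadratic scan is fine for the small result sets a catalog query returns; a sort-based sweep would only matter for much larger inputs.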

Key changes:

  • Add getAllModelPerformanceArtifacts OpenAPI operation and handler
  • Support targetRPS and recommendations query parameters for filtering
  • Calculate replicas and total_requests_per_second based on targetRPS
  • Add database models and service layer for performance artifacts
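The replica calculation is not spelled out in this description. A minimal sketch of one plausible rule, assuming each artifact records a per-replica throughput and that replicas are rounded up to meet `targetRPS`, is shown below. The function name `scale_for_target` and the `rps_per_replica` parameter are hypothetical; the actual handler may differ.

```python
import math


def scale_for_target(rps_per_replica: float, target_rps: float) -> tuple[int, float]:
    """Return (replicas, total_requests_per_second) for a target load.

    Assumption: replicas = ceil(target_rps / rps_per_replica), and the
    total is simply replicas * rps_per_replica.
    """
    if rps_per_replica <= 0:
        raise ValueError("rps_per_replica must be positive")
    replicas = max(1, math.ceil(target_rps / rps_per_replica))
    return replicas, replicas * rps_per_replica


# Per-replica rates here are inferred by dividing the totals in the sample
# output by their replica counts; they are not stated in the PR.
two_replicas = scale_for_target(6.0, 10)     # (2, 12.0)
three_replicas = scale_for_target(4.0, 10)   # (3, 12.0)
ten_replicas = scale_for_target(1.0, 10)     # (10, 10.0)
```

Under this assumption the total can exceed the target, because a partial replica is rounded up to a whole one.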

How Has This Been Tested?

Unit tests and local testing.

$ curl -s 'http://localhost:8082/api/model_catalog/v1alpha1/sources/some-source/models/some-model/artifacts/performance?targetRPS=10&orderBy=hardware_type.string_value&sortOrder=ASC&recommendations=true&latencyProperty=ttft_p99&pageSize=15' | jq -c '.items[].customProperties | [.hardware_type.string_value, .hardware_count.int_value, .replicas.int_value, .total_requests_per_second.double_value, .ttft_p99.double_value]'
["A100-40","1","2",12,74.28693771362305]
["A100-40","2","3",12,60.02593040466309]
["A100-40","4","10",10,30.0593376159668]
["A100-80","2","10",10,33.75029563903809]
["A100-80","1","5",10,63.61722946166992]
["A100-80","1","2",10,71.29526138305664]
["H100","2","5",10,32.59897232055664]
["H100","2","10",10,21.32272720336914]
["H100","2","3",12,36.24486923217773]
["H100","1","2",10,42.36221313476562]
["L4","4","5",10,189.4130706787109]
["L4","1","5",10,258.9282989501953]
["L4","4","10",10,168.4966087341309]
["L4","2","5",10,214.5683765411377]
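The same request and field extraction can be sketched in Python for scripted checks. This is an illustration, not part of the PR: `performance_url` and `summarize` are hypothetical helpers, and `sample_item` is reconstructed from the first row of the jq output above rather than copied from a full response. Note that the `int_value` fields come back as JSON strings, so the sketch converts them explicitly.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8082/api/model_catalog/v1alpha1"


def performance_url(source_id: str, model_name: str, **params) -> str:
    """Build the performance-artifacts URL with percent-encoded path segments."""
    path = (
        f"/sources/{quote(source_id, safe='')}"
        f"/models/{quote(model_name, safe='')}/artifacts/performance"
    )
    query = urlencode(params)
    return f"{BASE}{path}?{query}" if query else f"{BASE}{path}"


def summarize(item: dict) -> list:
    """Mirror the jq filter: pick five custom properties from one item."""
    props = item["customProperties"]
    return [
        props["hardware_type"]["string_value"],
        int(props["hardware_count"]["int_value"]),   # serialized as a string
        int(props["replicas"]["int_value"]),         # serialized as a string
        props["total_requests_per_second"]["double_value"],
        props["ttft_p99"]["double_value"],
    ]


url = performance_url(
    "some-source", "some-model",
    targetRPS=10, orderBy="hardware_type.string_value", sortOrder="ASC",
    recommendations="true", latencyProperty="ttft_p99", pageSize=15,
)

# Reconstructed from the first output row; a real item has more fields.
sample_item = {"customProperties": {
    "hardware_type": {"string_value": "A100-40"},
    "hardware_count": {"int_value": "1"},
    "replicas": {"int_value": "2"},
    "total_requests_per_second": {"double_value": 12},
    "ttft_p99": {"double_value": 74.28693771362305},
}}
row = summarize(sample_item)
```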

Merge criteria:

  • All the commits have been signed-off (To pass the DCO check)
  • [x] The commits have meaningful messages
  • [x] Automated tests are provided as part of the PR for major new functionalities; testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • [x] The developer has manually tested the changes and verified that the changes work.
  • [x] Code changes follow the kubeflow contribution guidelines.

pboyd, Nov 20 '25 14:11

/retest

pboyd, Dec 04 '25 20:12

/approve

pboyd, Dec 04 '25 20:12

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pboyd

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

google-oss-prow[bot], Dec 04 '25 20:12