feat(catalog): Performance recommendations endpoint
Description
Implements a new GET /api/model_catalog/v1alpha1/sources/{source_id}/models/{model_name}/artifacts/performance endpoint that returns the performance metrics artifacts for a model, optionally filtered down to Pareto-optimal configurations.
Key changes:
- Add getAllModelPerformanceArtifacts OpenAPI operation and handler
- Support targetRPS and recommendations query parameters for filtering
- Calculate replicas and total_requests_per_second based on targetRPS
- Add database models and service layer for performance artifacts
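The replica calculation can be illustrated with a minimal sketch. It assumes each artifact carries a measured per-replica throughput and that replicas are rounded up to meet targetRPS. The PR text does not spell out the formula, so `scale_to_target` and the per-replica values used here are hypothetical, chosen only because they reproduce the replicas and total_requests_per_second values in the sample output under "How Has This Been Tested?".

```python
import math


def scale_to_target(rps_per_replica: float, target_rps: float) -> tuple[int, float]:
    """Return (replicas, total_requests_per_second) for one artifact.

    Assumption (not confirmed by the PR): replicas is the smallest integer
    such that replicas * rps_per_replica >= target_rps.
    """
    if rps_per_replica <= 0 or target_rps <= 0:
        raise ValueError("throughput and target must be positive")
    replicas = math.ceil(target_rps / rps_per_replica)
    return replicas, replicas * rps_per_replica


# Hypothetical per-replica throughputs with targetRPS=10.
print(scale_to_target(6.0, 10.0))  # (2, 12.0)
print(scale_to_target(4.0, 10.0))  # (3, 12.0)
print(scale_to_target(1.0, 10.0))  # (10, 10.0)
```

Rounding up means total_requests_per_second can exceed targetRPS, which would explain why some sample rows report 12 for a target of 10.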
How Has This Been Tested?
Covered by unit tests and manual local testing, for example:
$ curl -s 'http://localhost:8082/api/model_catalog/v1alpha1/sources/some-source/models/some-model/artifacts/performance?targetRPS=10&orderBy=hardware_type.string_value&sortOrder=ASC&recommendations=true&latencyProperty=ttft_p99&pageSize=15' | jq -c '.items[].customProperties | [.hardware_type.string_value, .hardware_count.int_value, .replicas.int_value, .total_requests_per_second.double_value, .ttft_p99.double_value]'
["A100-40","1","2",12,74.28693771362305]
["A100-40","2","3",12,60.02593040466309]
["A100-40","4","10",10,30.0593376159668]
["A100-80","2","10",10,33.75029563903809]
["A100-80","1","5",10,63.61722946166992]
["A100-80","1","2",10,71.29526138305664]
["H100","2","5",10,32.59897232055664]
["H100","2","10",10,21.32272720336914]
["H100","2","3",12,36.24486923217773]
["H100","1","2",10,42.36221313476562]
["L4","4","5",10,189.4130706787109]
["L4","1","5",10,258.9282989501953]
["L4","4","10",10,168.4966087341309]
["L4","2","5",10,214.5683765411377]
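Within each hardware type, the rows above look like a Pareto front over total accelerators (hardware_count × replicas) and ttft_p99: no row is both cheaper and faster than another. That reading is inferred from the output, not stated in the PR, so the sketch below is only one plausible form of the recommendations=true filter. `Row`, `pareto_front`, and the extra dominated row are hypothetical, and latencies are rounded from the sample.

```python
from typing import NamedTuple


class Row(NamedTuple):
    hardware_type: str
    hardware_count: int
    replicas: int
    latency: float  # value of the chosen latencyProperty, e.g. ttft_p99


def total_accelerators(row: Row) -> int:
    # Assumed cost metric: accelerators per replica times replica count.
    return row.hardware_count * row.replicas


def pareto_front(rows: list[Row]) -> list[Row]:
    """Keep rows that no other row beats on both cost and latency."""
    def dominated(r: Row) -> bool:
        return any(
            total_accelerators(o) <= total_accelerators(r)
            and o.latency <= r.latency
            and (total_accelerators(o) < total_accelerators(r) or o.latency < r.latency)
            for o in rows
        )
    return [r for r in rows if not dominated(r)]


h100 = [
    Row("H100", 2, 5, 32.60),
    Row("H100", 2, 10, 21.32),
    Row("H100", 2, 3, 36.24),
    Row("H100", 1, 2, 42.36),
    Row("H100", 4, 5, 40.00),  # hypothetical: same cost as the 2x10 row, but slower
]
front = pareto_front(h100)
print(len(front))  # 4
```

The four H100 rows taken from the sample survive, and only the hypothetical fifth row is dropped. Rows are compared within a single hardware type here, since comparing raw accelerator counts across different hardware would not be meaningful.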
Merge criteria:
- All the commits have been signed-off (to pass the DCO check)
- [x] The commits have meaningful messages
- [x] Automated tests are provided as part of the PR for major new functionalities; testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
- [x] The developer has manually tested the changes and verified that the changes work.
- [x] Code changes follow the kubeflow contribution guidelines.
/retest
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: pboyd
- ~~OWNERS~~ [pboyd]