MLServer
Add model statistics extension to MLServer
Triton exposes an API that provides detailed statistics on model usage; see `rpc ModelStatistics(ModelStatisticsRequest)`.
It would be good to consider something similar for MLServer. In particular, I like the queue vs compute stats, as they expose useful information on where the time is going for an inference request.
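To make the idea concrete, here is a minimal sketch of what a per-model accumulator for the queue vs compute split could look like. This is purely illustrative: the `ModelStats` class, its field names, and the `record`/`summary` methods are hypothetical, not part of MLServer or Triton; the nanosecond fields loosely mirror the duration fields in Triton's `InferStatistics` message.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Hypothetical per-model accumulator for queue vs compute time."""

    request_count: int = 0
    queue_duration_ns: int = 0
    compute_duration_ns: int = 0

    def record(self, queued_ns: int, started_ns: int, finished_ns: int) -> None:
        # queued_ns: when the request entered the queue
        # started_ns: when inference actually began
        # finished_ns: when inference completed
        # (e.g. timestamps taken with time.monotonic_ns())
        self.request_count += 1
        self.queue_duration_ns += started_ns - queued_ns
        self.compute_duration_ns += finished_ns - started_ns

    def summary(self) -> dict:
        # Shape of a possible statistics-endpoint payload
        return {
            "request_count": self.request_count,
            "queue_duration_ns": self.queue_duration_ns,
            "compute_duration_ns": self.compute_duration_ns,
        }


stats = ModelStats()
# One request that waited 2 ms in the queue and took 10 ms to compute
stats.record(queued_ns=0, started_ns=2_000_000, finished_ns=12_000_000)
print(stats.summary())
```

A real implementation would presumably hook the timestamps into MLServer's request lifecycle (e.g. its parallel-inference worker queue) and expose the aggregated counters over the existing metrics or REST surface, but the split itself is the key point: it tells you whether latency comes from waiting or from the model.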