Saheli Bhattacharjee
For ``, resnet50 Offline was inferred from the SingleStream scenario; the results generated by `submission_checker.py` are shown below. The calculated power efficiency is `power_efficiency (inf/J) = 3448436.324053`, which is indeed a...
This PR adds the following metric to V1:

Metric Name | Type | Unit
-- | -- | --
model_load_time | Provided as a log for CUDA devices | Seconds
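
As a rough illustration of what a logged load time in seconds could look like, here is a minimal sketch. It is not the PR's implementation: `load_with_timing`, the `load_fn` callable, and the logger name are hypothetical names introduced only for this example.

```python
import logging
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("model_loader")


def load_with_timing(load_fn: Callable[[], T]) -> Tuple[T, float]:
    """Call `load_fn`, log how long it took in seconds, and return both."""
    start = time.perf_counter()  # monotonic clock suited to measuring intervals
    model = load_fn()
    elapsed = time.perf_counter() - start
    # Emitted as a log line rather than an exported metric, matching the
    # "provided as a log" wording in the table above.
    logger.info("model_load_time: %.4f seconds", elapsed)
    return model, elapsed
```

Calling `load_with_timing(lambda: my_loader(path))` would return the loaded object together with the elapsed time, where `my_loader` stands in for whatever loading routine is actually used.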
This PR is an updated and improved version of PR #12627. Please see some discussion there.
This PR deprecates the `gpu_`-prefixed names of existing metrics that are not GPU-specific:

- `gpu_cache_usage`
- `gpu_prefix_cache_queries`
- `gpu_prefix_cache_hits`

and introduces the renamed metrics:

- `kv_cache_usage`
- `prefix_cache_queries`
- `prefix_cache_hits`
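
For anyone updating dashboards or scripts, the rename can be captured as a simple lookup. The sketch below is illustrative only and assumes the names pair up in the order listed; `RENAMED_METRICS` and `resolve_metric_name` are hypothetical helpers, not part of the PR.

```python
import warnings

# Deprecated name -> replacement name, paired in the order given in the PR text.
RENAMED_METRICS = {
    "gpu_cache_usage": "kv_cache_usage",
    "gpu_prefix_cache_queries": "prefix_cache_queries",
    "gpu_prefix_cache_hits": "prefix_cache_hits",
}


def resolve_metric_name(name: str) -> str:
    """Return the current metric name, warning when `name` is deprecated."""
    new_name = RENAMED_METRICS.get(name)
    if new_name is None:
        # Not a deprecated name: pass it through unchanged.
        return name
    warnings.warn(
        f"metric '{name}' is deprecated; use '{new_name}' instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_name
```

With this helper, `resolve_metric_name("gpu_cache_usage")` yields `"kv_cache_usage"` and emits a `DeprecationWarning`, while the new names pass through untouched.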