Performance Regression or Improvement: pytorch_image_classification_benchmarks-resnet101-mean_load_model_latency_milli_secs:mean_load_model_latency_milli_secs
Performance change found in the
test: pytorch_image_classification_benchmarks-resnet101-mean_load_model_latency_milli_secs for the metric: mean_load_model_latency_milli_secs.
For more information on how to triage the alerts, please look at
Triage performance alert issues section of the README.
Test description: Pytorch image classification on 50k images of size 224 x 224 with resnet 101.
Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L34
Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
timestamp: Sun Nov 19 04:26:08 2023, metric_value: 100968.61
timestamp: Sat Nov 18 04:19:32 2023, metric_value: 102409.45
timestamp: Fri Nov 17 04:19:24 2023, metric_value: 100813.11 <---- Anomaly
timestamp: Thu Nov 16 04:19:19 2023, metric_value: 71639.33
timestamp: Wed Nov 15 04:22:23 2023, metric_value: 72907.91
timestamp: Tue Nov 14 04:18:28 2023, metric_value: 69954.48
timestamp: Mon Nov 13 04:17:56 2023, metric_value: 69631.03
timestamp: Sun Nov 12 04:20:02 2023, metric_value: 68721.14
timestamp: Sat Nov 11 04:18:04 2023, metric_value: 71528.79
timestamp: Fri Nov 10 04:18:10 2023, metric_value: 67224.72
timestamp: Thu Nov 9 04:19:48 2023, metric_value: 72106.05
timestamp: Wed Nov 8 04:18:40 2023, metric_value: 74298.80
timestamp: Tue Nov 7 10:16:40 2023, metric_value: 67609.21
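To put a number on the change, here is a minimal sketch that compares the mean of the readings before the flagged point with the mean from the flagged point onward. The values are copied from the list above; the split at Nov 17 and the helper name `percent_change` are assumptions for illustration, and this is a plain before/after comparison, not the method the alerting tool uses to detect anomalies.

```python
from statistics import mean

# Readings copied from the alert, oldest first, in milliseconds.
# Nov 7 through Nov 16: before the flagged point.
before = [
    67609.21, 74298.80, 72106.05, 67224.72, 71528.79,
    68721.14, 69631.03, 69954.48, 72907.91, 71639.33,
]
# Nov 17 through Nov 19: the flagged point and everything after it.
after = [100813.11, 102409.45, 100968.61]


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage."""
    return (new - old) / old * 100.0


before_mean = mean(before)
after_mean = mean(after)
pct = percent_change(before_mean, after_mean)

print(f"mean before: {before_mean:.2f} ms")
print(f"mean after:  {after_mean:.2f} ms")
print(f"change:      {pct:+.1f}%")
```

All three readings after the flagged point sit well above the highest earlier reading (74298.80), which suggests a sustained shift rather than a single noisy run.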
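Seems like there is a spike: the metric jumped from about 71.6 s on Nov 16 to about 100.8 s on Nov 17 and has stayed at that level since.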
Very nice, these are the alerts we'd want to see.
I see a PyTorch release, https://pypi.org/project/torch/2.1.1/ , and the timeline lines up. We need to check whether the test installs the latest version.
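One way to check which version actually ends up in the test environment is to log it at startup. A minimal sketch using only the standard library follows; the helper name `installed_version` is hypothetical, and where to call it in the benchmark is left open.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed torch version, or None when torch is not installed.
print("torch version:", installed_version("torch"))
```

If this prints 2.1.1 in runs from Nov 17 onward and an earlier version before that, the release is a plausible cause; if the version did not change, the regression likely comes from somewhere else.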
@AnandInguva @damccorm do you remember whether we reported this to the PyTorch folks?