
A load test of the exported model

michalbrys opened this issue · 3 comments

It would be helpful to have a component (either dedicated, or an extension of the InfraValidator TFX Pipeline Component) that performs a load test of the exported model.

The expected behavior is to load the exported model, create a TensorFlow Serving endpoint, send requests to the prediction endpoint, and measure the response time. The load test could be performed using common open-source tools such as Locust or Vegeta for HTTP, or ghz for gRPC.
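As a rough sketch of the measurement step (not part of any existing component), the snippet below sends requests to a TensorFlow Serving REST prediction endpoint and reports latency percentiles. The `:predict` REST path and `{"instances": [...]}` payload follow TF Serving's documented REST API; the host, model name, and latency budget are placeholder assumptions.

```python
import json
import statistics
import time
import urllib.request


def tf_serving_request(host, model_name, instances):
    """Send one prediction request to a TF Serving REST endpoint.

    host and model_name are placeholders, e.g. "localhost:8501" and "my_model".
    """
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def measure_latency(send_request, num_requests=100):
    """Call send_request num_requests times; return latency stats in ms."""
    latencies = []
    for _ in range(num_requests):
        start = time.perf_counter()
        send_request()
        latencies.append((time.perf_counter() - start) * 1000.0)
    latencies.sort()
    n = len(latencies)
    return {
        "p50": latencies[n // 2],
        "p99": latencies[int(0.99 * (n - 1))],
        "mean": statistics.mean(latencies),
    }


def meets_budget(stats, p99_budget_ms):
    """Check the measured p99 latency against a business requirement."""
    return stats["p99"] <= p99_budget_ms
```

A real component would likely wrap Locust or ghz instead of this loop, so that concurrency and ramp-up are handled properly, and emit the stats as an artifact the pipeline can bless or reject.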

The motivation is that prediction time varies with the model's type and structure. With this component, we could check up front whether the model meets the business requirements for prediction time.

michalbrys avatar Apr 13 '21 14:04 michalbrys

Larry and Hannes both confirmed that they have a requirement for low-latency serving that would benefit from this. There are open questions about pushing and how to structure the pipeline. There would be no blessing on the test Serving instance, but a blessing would be required for a production Push. One difficulty is exactly reproducing the production environment. InfraValidator does something similar and may already use TF Serving. Similar to Evaluator, we could perhaps compare two models to assess the performance difference on the current test infrastructure.

rcrowe-google avatar Apr 28 '21 18:04 rcrowe-google

See project proposal

rcrowe-google avatar May 26 '21 02:05 rcrowe-google

Is this currently being worked on?

sayakpaul avatar Nov 24 '21 04:11 sayakpaul