Add sample code to measure latency on TextEmbeddingModel

Open gheduardo opened this issue 1 year ago • 0 comments

Thanks for stopping by to let us know something could be better!

PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.

The issue you're having must be related to a file in this repository. We are unable to provide assistance for issues unrelated to samples in this repository.

Please include as much information as possible:

Is your feature request related to a problem? Please describe.

A customer is looking for an authoritative way to measure the latency of a given TextEmbeddingModel model.

Describe the solution you'd like.

I would like to contribute a new sample file with code that sends a given text N times, measures the time taken by each request, and reports the 50th, 95th and 99th percentile latencies.
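A rough sketch of what such a sample could look like is below. It is only an illustration, not the final sample: the helper names (`measure_latencies`, `percentile`, `summarize`) are hypothetical, the request is passed in as a plain callable so the timing logic stays independent of the SDK, and the nearest-rank percentile method is an assumption (other definitions interpolate between samples).

```python
import math
import time
from typing import Callable, Dict, List, Sequence


def measure_latencies(send_request: Callable[[], object], n: int) -> List[float]:
    """Call send_request n times; return each call's wall-clock time in seconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        send_request()
        latencies.append(time.perf_counter() - start)
    return latencies


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so very small p still maps to the first sample.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples: Sequence[float], percentiles=(50, 95, 99)) -> Dict[int, float]:
    """Map each requested percentile to its latency value."""
    return {p: percentile(samples, p) for p in percentiles}
```

With the Vertex AI SDK, the callable could be something like `lambda: model.get_embeddings([text])`, where `model` comes from `TextEmbeddingModel.from_pretrained(...)` in `vertexai.language_models`; the model name and the value of N are left to the caller. For example, `summarize(measure_latencies(send_request, 100))` would return a dictionary keyed by 50, 95 and 99. Note that p99 is only meaningful when N is reasonably large (with fewer than 100 requests it is simply the slowest one).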

Describe alternatives you've considered.

I have not considered any other alternatives.

Additional context.

This is useful for early adopters who want to confirm that this is a reliable solution for enterprise use.

Making sure to follow these steps will guarantee the quickest resolution possible.

Thanks!

Apr 04 '24 01:04 gheduardo