Add sample code to measure latency on TextEmbeddingModel
Is your feature request related to a problem? Please describe.
A customer is looking for an authoritative way to measure request latency for a given TextEmbeddingModel model.
Describe the solution you'd like.
I would like to contribute a new sample file with code that sends a given text N times, measures the time taken by each request, and reports the 50th, 95th, and 99th percentile latencies.
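To make the proposal concrete, here is a minimal sketch of what such a sample might look like. It is not the final contribution. The function names (`measure_latencies`, `summarize`, `fake_embedding_request`) are hypothetical, and the request is passed in as a callable so the sketch runs without credentials. The stand-in request only sleeps for a millisecond.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(
    send_request: Callable[[str], object], text: str, n: int
) -> List[float]:
    """Call send_request(text) n times; return each call's latency in milliseconds."""
    latencies_ms = []
    for _ in range(n):
        start = time.perf_counter()
        send_request(text)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Return the 50th, 95th and 99th percentiles (needs at least two samples)."""
    # quantiles(n=100) returns 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def fake_embedding_request(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding call (assumption, not an API)."""
    time.sleep(0.001)
    return [0.0] * 8


if __name__ == "__main__":
    results = measure_latencies(fake_embedding_request, "What is life?", n=50)
    print(summarize(results))
```

In the real sample, `fake_embedding_request` would presumably be replaced by a small wrapper around the SDK call, for example `model.get_embeddings([text])` on a model loaded with `TextEmbeddingModel.from_pretrained(...)`. That wiring is an assumption here and would need to be checked against the SDK version the repository targets. A few warm-up requests before timing may also be worth considering so that connection setup does not skew the percentiles.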
Describe alternatives you've considered.
I have not considered any other alternatives.
Additional context.
This is useful for early adopters who want to confirm that this is a reliable solution for enterprise use.
Thanks!