Support token caching for ephemeral token acquisition

Open anish-palakurthi opened this issue 1 year ago • 0 comments

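Requesting a new authentication token on every LLM call adds latency to each request. Currently this only affects Vertex, but the cost will spread if future providers also require ephemeral tokens. Caching the token and reusing it until it is close to expiry would avoid the extra round trip on most calls.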
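One possible shape for such a cache, as a minimal sketch rather than a proposal for the actual implementation: it assumes the provider hands back a token together with its lifetime in seconds. The names `TokenCache`, `fetch`, and `refresh_margin` are hypothetical and do not come from the codebase.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedToken:
    value: str
    expires_at: float  # deadline on the injected clock, in seconds


class TokenCache:
    """Reuses one token until it is within `refresh_margin` seconds of expiry.

    `fetch` is a hypothetical callable that performs the real token request
    and returns (token, lifetime_in_seconds).
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._lock = threading.Lock()
        self._token: Optional[CachedToken] = None

    def get(self) -> str:
        # The lock keeps concurrent callers from each requesting a new token.
        with self._lock:
            now = self._clock()
            token = self._token
            if token is None or now >= token.expires_at - self._refresh_margin:
                value, lifetime = self._fetch()
                token = CachedToken(value, now + lifetime)
                self._token = token
            return token.value
```

With this shape, each provider client would hold one `TokenCache` and call `get()` before every request; only the first call and calls near expiry pay for a token request. The refresh margin is there so a token is not sent when it is about to expire mid-request.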

Jul 31 '24 17:07 anish-palakurthi