Support token caching for ephemeral token acquisition
Requesting a new authentication token on every LLM call adds latency to each request. Currently this only affects Vertex, but the cost will grow if future providers also require ephemeral tokens.
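
A minimal sketch of what such a cache could look like, written in Python for illustration only. The names `TokenCache`, `fetch`, and `refresh_margin` are hypothetical and not part of BAML. It assumes the provider returns a token together with its lifetime in seconds, and it refreshes slightly before expiry so an in-flight request does not carry a token that is about to lapse.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedToken:
    value: str
    expires_at: float  # deadline on the cache's clock, in seconds


class TokenCache:
    """Caches one ephemeral token and re-fetches it shortly before it expires.

    `fetch` is a hypothetical callable that performs the real token request
    and returns (token, ttl_seconds).
    """

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._cached: Optional[CachedToken] = None
        self._lock = threading.Lock()

    def get(self) -> str:
        # The lock keeps concurrent callers from each triggering a fetch.
        with self._lock:
            now = self._clock()
            cached = self._cached
            if cached is not None and now < cached.expires_at - self._refresh_margin:
                return cached.value  # still fresh: no network round trip
            value, ttl_seconds = self._fetch()
            self._cached = CachedToken(value, now + ttl_seconds)
            return value
```

With a cache like this, only the first call (and the first call after each expiry window) pays the token-acquisition latency; every other call reuses the cached value. Keeping one cache instance per set of credentials would avoid handing one client's token to another.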