boto3
Clients and resources should be cached
When creating a vanilla client like session = boto3.Session(); s3 = session.client("s3"), the client returned is not cached. I can't see any reason why this client would be different between different calls, and the time to create a client can be quite long. Especially given that it's generally better to pass a session around than the clients directly, it'd be nice if the client was cached on first load.
It wouldn't use caching if credentials or a Config object were passed, and it could maybe even have a single cache entry per service. Maybe it could be better solved with more caching in botocore.
It's accomplishable today with the standard library's functools, though with a "developer beware" label.
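import functools
import boto3
# functools.cache requires Python 3.9+ and is unbounded.
# The cache key is (session, *args, **kwargs), so every argument must be hashable,
# and cached sessions and clients are kept alive for the life of the process.
boto3.Session.client = functools.cache(boto3.Session.client)
boto3.Session.resource = functools.cache(boto3.Session.resource)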
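A less invasive alternative to patching boto3.Session is a small wrapper that follows the rules described above: one cache entry per service, and no caching when extra arguments such as credentials or a Config object are passed. This is only a sketch; CachingSession is a hypothetical name, not part of boto3, and it assumes the wrapped object exposes a boto3-style client(service_name, **kwargs) method.

```python
import threading
from typing import Any, Dict


class CachingSession:
    """Hypothetical wrapper that caches one default client per service name."""

    def __init__(self, session: Any) -> None:
        # `session` is assumed to expose client(service_name, **kwargs),
        # as boto3.Session does.
        self._session = session
        self._clients: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def client(self, service_name: str, **kwargs: Any) -> Any:
        # Any extra argument (region, credentials, Config, ...) bypasses the cache.
        if kwargs:
            return self._session.client(service_name, **kwargs)
        # Guard creation so concurrent callers share a single client.
        with self._lock:
            if service_name not in self._clients:
                self._clients[service_name] = self._session.client(service_name)
            return self._clients[service_name]
```

With boto3 installed, this would be used as cached = CachingSession(boto3.Session()), after which repeated cached.client("s3") calls return the same object. Resources could be handled the same way with a second dictionary.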
Thanks @benkehoe for the feature request. I brought this up for discussion with the team and they had some concerns around the implementation and backwards compatibility. This could potentially be an opt-in feature, but the proposed benefits and use cases are worth discussing further.
I think my ideal opt-in interface might be like
session = boto3.Session(caching=True) # maybe allow an option for a cache object to be provided
client1 = session.client("s3")
client2 = session.client("s3")
assert client2 is client1
# allow the cache to be circumvented if needed
client3 = session.client("s3", caching=False)
assert client3 is not client1
Module-level functions would be served by the existing setup_default_session() function:
boto3.setup_default_session(caching=True)
client1 = boto3.client("s3")
client2 = boto3.client("s3")
assert client2 is client1
@tim-finnigan Are there any problems that could arise from caching the client? For instance, if I run in Kubernetes (EKS) with IAM Roles for Service Accounts, where the token rotates every hour, does it mean that a call made on the cached client right after the token rotates will fail with a nice 403?
If I remember correctly, the Session holds the configuration for the credentials so if that is cached and the token is cached, this will not work after rotation.
Correct me if I'm wrong.
Client caching doesn't affect credential refreshing. The credential provider that handles web identity tokens (used for EKS service roles) automatically deals with expiration and refreshing. You don't need to get a new client for those credentials to get refreshed.
@benkehoe Great. Does the same apply to the session, i.e. can we cache the session?
@mbelang The session itself represents configuration and credentials. It doesn't need caching internal to boto3, but it's intended to be passed around in your code wherever clients/resources are needed (and are intended to use the same config/credentials), i.e., "cached" in your code. The refreshable credentials for web identity, for example, are refreshed by the session for any client created on the session. I wrote an explainer on why to use sessions. For example, when you create a library that makes AWS API calls, it should take an optional session as input (creating one itself using boto3.Session() if none is provided), and then get the appropriate client from the session. This pattern can lead to clients being created on the session multiple times in different places, which is why client caching would be beneficial.
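The library pattern described above could look roughly like the following. It is a minimal sketch: list_bucket_names is a hypothetical function chosen for illustration, and boto3 is imported lazily here only so that callers who pass their own session never touch the default path.

```python
from typing import Any, List, Optional


def list_bucket_names(session: Optional[Any] = None) -> List[str]:
    """Hypothetical library function that accepts an optional boto3 session."""
    if session is None:
        # No session supplied: fall back to a fresh default session.
        import boto3

        session = boto3.Session()
    # Without client caching, every call like this builds a new client.
    s3 = session.client("s3")
    response = s3.list_buckets()
    return [bucket["Name"] for bucket in response["Buckets"]]
```

Each function written this way requests its own client from the session, so an application calling several of them pays the client creation cost repeatedly unless the clients are cached.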
Yeah, I read your article a couple of weeks ago and didn't realize I was talking to you. What you explain is pretty clean, and I understand what I need to do now :)
Hi, any update on this request?
This would help speed up our use case as well.
Could this have an adverse effect on pytest/moto, e.g. some tests are mocked, some not - which results in all tests sharing the same session/client?
That's why it needs to be opt-in, rather than enabled by default. Separately, I would argue every test should create its own session (and to start with, tests should use sessions, rather than the module-level functions which all share the same default session).