Memory leak
Please fill out the sections below to help us address your issue.
**What issue did you see?**
When I call the AWS API at high volume and cannot connect to the target region because of network problems, an `EndpointConnectionError` is thrown. Over time the memory used by my process keeps growing; the largest I have observed so far is 6 GB. Inspecting with `gc` and pyrasite shows that `gc.garbage` is `[]` and that the type occupying the most memory is `str` or `unicode`. The `unicode` content is the documentation text for `DescribeInstancesRequest` from `service-2.json`.

The library versions I use: boto3 1.12.24, botocore 1.15.24, urllib3 1.21.1.
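For reference, the kind of per-type summary described above (finding that `str`/`unicode` dominate) can be gathered with a stdlib-only sketch along these lines; `summarize_live_objects` is a hypothetical helper name, not part of the original report:

```python
import gc
import sys
from collections import Counter


def summarize_live_objects(top=5):
    """Rough per-type memory summary of objects tracked by the GC.

    Sizes are shallow (sys.getsizeof), so this under-counts containers,
    but it is enough to see which type dominates.
    """
    sizes = Counter()
    for obj in gc.get_objects():
        try:
            sizes[type(obj).__name__] += sys.getsizeof(obj)
        except Exception:
            pass  # a few exotic objects reject getsizeof
    return sizes.most_common(top)


if __name__ == '__main__':
    for name, nbytes in summarize_live_objects():
        print(f"{name}: {nbytes / 1024:.1f} KiB")
```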
**Steps to reproduce**
If you have a runnable example, please include it as a snippet, or link to a repository/gist for larger code examples.
**Debug logs**
Full stack trace obtained by adding

```python
import botocore.session
botocore.session.Session().set_debug_logger('')
```

to your code.

The way I get the AWS client:

```python
class AwsClient(object):
    def __init__(self, region='eu-central-1', server_name='ec2'):
        self.region = region
        self.server_name = server_name

    @property
    def client(self):
        return boto3.client(self.server_name,
                            region_name=self.region,
                            aws_access_key_id=ACCESS,
                            aws_secret_access_key=SECRET)
```

I used 2 processes and many green threads to make requests, and there is only one `boto3.session.Session` instance and one `botocore.session.Session` per process.
@itachaaa - Thank you for your post. It is recommended to create a resource instance for each thread/process in a multithreaded or multiprocess application, rather than sharing a single instance among the threads/processes. https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html?highlight=multithreading#multithreading-multiprocessing
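The per-thread recommendation above can be sketched with `threading.local`; `thread_client` and the factory argument are illustrative names, and in a real boto3 program the factory would be something like `lambda: boto3.session.Session().client('ec2')`:

```python
import threading

# One slot per thread: each thread gets its own client/session instead
# of sharing a single instance across threads.
_local = threading.local()


def thread_client(factory):
    """Return this thread's cached instance, creating it on first use."""
    if not hasattr(_local, "client"):
        _local.client = factory()
    return _local.client
```

Within a thread, repeated calls return the same cached object; a different thread gets its own.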
Are you creating the boto3 session from a botocore session? Can you please provide your exact code sample that results in the memory leak?
Thanks for the reply.
I use `boto3.client()` to get an instance and make requests.
There is one process in my program, but it uses many coroutines rather than multiple threads, so there is one session per process.
The memory increases only when lots of exceptions are being thrown; otherwise it does not.
@itachaaa - Thanks for the reply. Is it possible for you to provide a code sample so that I can try to reproduce the issue? Without looking at the code it is difficult for me to find the exact cause.
Please make sure you are doing garbage collection, since you are using multiple coroutines.
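One minimal way to read the suggestion above, assuming you want to force a collection periodically inside the request loop (`PeriodicCollector` is a sketch, not code from this thread):

```python
import gc


class PeriodicCollector:
    """Force a full GC pass every `every` calls from the hot loop."""

    def __init__(self, every=1000):
        self.every = every
        self.calls = 0

    def tick(self):
        self.calls += 1
        if self.calls % self.every == 0:
            # gc.collect() returns the number of unreachable objects found
            return gc.collect()
        return None
```

Calling `collector.tick()` once per spawned green thread would then trigger `gc.collect()` every `every` requests.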
Here is my test code:

```python
from memory_profiler import profile
from eventlet.greenpool import GreenPool

from instance import InstanceResource

manager = InstanceResource()


# @profile
def get_data():
    try:
        instances = manager.list_resource()
    except Exception as e:
        print(e)


# @profile
def loop_call():
    pool = GreenPool(10000)
    times = 0
    for i in range(100000):
        pool.spawn_n(get_data)
        times += 1
        print(times)
    import time; time.sleep(10)


if __name__ == '__main__':
    loop_call()
```
```python
# instance.py
from common import Resource


class InstanceResource(Resource):
    action = 'describe_instances'
    create_action = 'run_instances'
    # entity = 'Instances'

    @staticmethod
    def get_filters():
        params = {
            'Filters': [
                # {'Name': 'instance-id', 'Values': ['i-07a20968066e2ad87', ]}
            ],
        }
        return params
```
```python
# common.py
from client import AwsClient


class Resource(object):
    action = None  # defaults to the query action
    create_action = None
    update_action = None
    delete_action = None
    entity = None

    def __init__(self, region='eu-central-1', server_name='ec2'):
        self._init(region=region, server_name=server_name)

    def _init(self, region='eu-central-1', server_name='ec2'):
        client = AwsClient(region=region, server_name=server_name)
        self.client = client.client
        self.resource = client.resource

    @staticmethod
    def get_filters():
        """
        Get the filter parameters for the GET API.
        :return:
        """
```

```python
# client.py
import boto3
from botocore.config import Config


class AwsClient(object):
    def __init__(self, region='eu-central-1', server_name='ec2'):
        self.region = region
        self.server_name = server_name
        self.config = Config(retries=dict(max_attempts=2), connect_timeout=5, read_timeout=5)

    @property
    def client(self):
        return boto3.client(self.server_name,
                            region_name=self.region,
                            aws_access_key_id=ACCESS,
                            aws_secret_access_key=SECRET,
                            config=self.config)
```
When it throws lots of errors such as `EndpointConnectionError`, the memory continues to increase.
@itachaaa - Thank you for providing the sample code. Marking this as a bug. I am able to reproduce the issue with this script:

```python
import os

import boto3
import psutil
import matplotlib.pyplot as pp
from botocore.config import Config
from eventlet.greenpool import GreenPool

used = []


def get_data():
    client = boto3.client('ec2', config=Config(retries={'max_attempts': 0}, connect_timeout=5, read_timeout=5))
    client.describe_instances()


pool = GreenPool(10000)
for i in range(100000):
    process = psutil.Process(os.getpid())
    memory = process.memory_info().rss / 1024 / 1024
    used.append(memory)
    pool.spawn_n(get_data)

pp.plot(used)
pp.show()
```

I would like to ask whether, when you reproduced it, the network environment was sometimes poor enough to throw exceptions. At the time there was frequent printing of network-related errors such as `EndpointConnectionError`, `ReadTimeoutError`, and so on. I don't know whether this is a contributing factor.
I am also tracking down a memory leak. tracemalloc pointed me to https://github.com/boto/botocore/blob/develop/botocore/client.py#L322
When running a Flask application and looping through `gc.garbage` after a `gc.collect()`, I am left with boto docs. I am currently using boto3 and creating clients as, for example, `client = boto3.client('sts')`.
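For anyone who wants to reproduce this kind of diagnosis, a minimal `tracemalloc` sketch (not the commenter's exact code; the list comprehension is a stand-in workload where you would instead create boto3 clients in a loop) looks like:

```python
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation site

# ... exercise the suspected code here; a stand-in allocation for the demo:
leaky = ["x" * 1000 for _ in range(1000)]

snapshot = tracemalloc.take_snapshot()
# Show the three biggest allocation sites with their full tracebacks,
# which is how the botocore/client.py line above was identified.
for stat in snapshot.statistics("traceback")[:3]:
    print(stat)
    for line in stat.traceback.format():
        print(line)
```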
Running in lambda the following is a graph of memory from cloudwatch metrics filter:

Is there an update or workaround for this problem?
Creating one session per thread and reusing them across the ThreadPool mitigated the issue for us. A snippet for aiobotocore is below:
```python
import threading

from aiobotocore.session import AioSession, get_session

# NOTE: botocore has a memory leak in Session objects. The recommended
# workaround is to cache the session object locally per thread.
# See https://github.com/boto/botocore/issues/2047
_aio_session_cache = threading.local()


def _cached_session() -> AioSession:
    if not hasattr(_aio_session_cache, "session"):
        _aio_session_cache.session = get_session()
    return _aio_session_cache.session
```
I ran my FastAPI app, and tracemalloc pointed out that the `python3.8/json/decoder.py` file is leaking ~100 MB every 15 minutes. Looking into the tracebacks for that file, I see that they are all being called by botocore:
```
File "/opt/venv/lib/python3.8/site-packages/botocore/session.py", line 787
    return self._internal_components.get_component(name)
File "/opt/venv/lib/python3.8/site-packages/botocore/session.py", line 1081
    self._components[name] = factory()
File "/opt/venv/lib/python3.8/site-packages/botocore/session.py", line 188
    endpoints = loader.load_data('endpoints')
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 142
    data = func(self, *args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 454
    found = self.file_loader.load_file(possible_path)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 194
    data = self._load_file(file_path + ext, open_method)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 181
    return json.loads(payload, object_pairs_hook=OrderedDict)
File "/usr/local/lib/python3.8/json/__init__.py", line 370
    return cls(**kw).decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 353
    obj, end = self.scan_once(s, idx)
```

and

```
File "/opt/venv/lib/python3.8/site-packages/botocore/client.py", line 202
    json_model = self._loader.load_service_model(
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 142
    data = func(self, *args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 417
    model = self.load_data(full_path)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 142
    data = func(self, *args, **kwargs)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 454
    found = self.file_loader.load_file(possible_path)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 194
    data = self._load_file(file_path + ext, open_method)
File "/opt/venv/lib/python3.8/site-packages/botocore/loaders.py", line 181
    return json.loads(payload, object_pairs_hook=OrderedDict)
File "/usr/local/lib/python3.8/json/__init__.py", line 370
    return cls(**kw).decode(s)
File "/usr/local/lib/python3.8/json/decoder.py", line 337
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.8/json/decoder.py", line 353
    obj, end = self.scan_once(s, idx)
```
I'm using botocore==1.27.59 and can't upgrade it, since aiobotocore is pinned to ^1.27.
I'll try updating my Python version.