Possible memory leak in the inspect API
Checklist
- [x] I have verified that the issue exists against the `master` branch of Celery.
- [ ] This has already been asked to the discussion group first.
- [x] I have read the relevant section in the contribution guide on reporting bugs.
- [x] I have checked the issues list for similar or identical bug reports.
- [x] I have checked the pull requests list for existing proposed fixes.
- [x] I have checked the commit log to find out if the bug was already fixed in the master branch.
- [x] I have included all related issues and possible duplicate issues in this issue (If there are none, check this box anyway).
Mandatory Debugging Information
- [x] I have included the output of `celery -A proj report` in the issue. (If you are not able to do this, then at least specify the Celery version affected.)
- [x] I have verified that the issue exists against the `master` branch of Celery.
- [x] I have included the contents of `pip freeze` in the issue.
- [x] I have included all the versions of all the external dependencies required to reproduce this bug.
Optional Debugging Information
- [x] I have tried reproducing the issue on more than one Python version and/or implementation.
- [ ] I have tried reproducing the issue on more than one message broker and/or result backend.
- [ ] I have tried reproducing the issue on more than one version of the message broker and/or result backend.
- [ ] I have tried reproducing the issue on more than one operating system.
- [ ] I have tried reproducing the issue on more than one workers pool.
- [ ] I have tried reproducing the issue with autoscaling, retries, ETA/Countdown & rate limits disabled.
- [x] I have tried reproducing the issue after downgrading and/or upgrading Celery and its dependencies.
Related Issues and Possible Duplicates
Related Issues
- None
Possible Duplicates
- None
Environment & Settings
Celery version:
celery report
Output:
Steps to Reproduce
Required Dependencies
- Minimal Python Version: 3.6
- Minimal Celery Version: 4.3.0
- Minimal Kombu Version: N/A or Unknown
- Minimal Broker Version: Redis 5.x
- Minimal Result Backend Version: N/A or Unknown
- Minimal OS and/or Kernel Version: N/A or Unknown
- Minimal Broker Client Version: N/A or Unknown
- Minimal Result Backend Client Version: N/A or Unknown
Minimal example:
Python Packages
pip freeze
Output:
alembic==0.9.10
amqp==2.5.2
ansible==2.8.4
arrow==0.15.5
asn1crypto==0.24.0
Authlib==0.14.1
awscli==1.16.289
backcall==0.1.0
backoff==1.10.0
bcrypt==3.1.7
beautifulsoup4==4.8.2
billiard==3.5.0.5
blist==1.3.6
boto3==1.12.14
botocore==1.15.14
cachetools==3.1.1
celery==4.2.2
certifi==2019.11.28
cffi==1.14.0
chardet==3.0.4
cloudpickle==1.2.1
colorama==0.3.9
crc16==0.1.1
cryptography==2.8
cyordereddict==1.0.0
dask==2.10.1
decorator==4.4.0
deepdiff==3.3.0
docopt==0.6.2
docutils==0.15.2
dpkt==1.9.2
fsspec==0.6.2
greenlet==0.4.15
idna==2.9
importlib-metadata==1.5.0
ipython==7.8.0
ipython-genutils==0.2.0
jedi==0.15.1
Jinja2==2.11.1
jmespath==0.9.5
jsonpickle==1.3
kombu==4.6.0
kvdr==1.0.4
locket==0.2.0
lxml==4.5.0
lz4==3.0.2
Mako==1.1.2
MarkupSafe==1.1.1
msgpack==1.0.0
msgpack-python==0.5.6
numexpr==2.7.1
numpy==1.18.1
pandas==0.22.0
pandas-datareader==0.8.1
paramiko==2.7.1
parso==0.5.1
partd==1.1.0
pexpect==4.7.0
pickleshare==0.7.5
prompt-toolkit==2.0.9
psutil==5.7.0
psycopg2-binary==2.8.4
ptyprocess==0.6.0
pudb==2019.1
pyasn1==0.4.7
pycparser==2.20
pycrypto==2.6.1
pydocstyle==1.1.1
Pygments==2.4.2
Pympler==0.8
PyNaCl==1.3.0
pyOpenSSL==19.1.0
pysftp==0.2.9
python-dateutil==2.8.1
python-editor==1.0.4
python-gnupg==0.4.5
python-redis==0.2.2
python-snappy==0.5.4
pytz==2019.3
PyYAML==5.3
rarfile==3.1
redis==3.4.1
requests==2.23.0
requests-oauth2==0.3.0
rsa==3.4.2
s3fs==0.4.0
s3transfer==0.3.3
six==1.14.0
sortedcontainers==2.1.0
soupsieve==2.0
SQLAlchemy==1.3.13
tables==3.6.1
toolz==0.10.0
traitlets==4.3.2
urllib3==1.25.8
urwid==2.0.1
vine==1.3.0
wcwidth==0.1.7
wrapt==1.11.2
xlrd==1.2.0
zipp==3.1.0
Other Dependencies
N/A
Minimally Reproducible Test Case
```python
from myapp.app import app
from time import sleep


def print_stats():
    # Query the workers through the inspect API.
    insp = app.control.inspect()
    active_lst = insp.active()
    cluster_stats = insp.stats()
    active_queues = insp.active_queues()
    all_stats = {
        "active": active_lst,
        "stats": cluster_stats,
        "queues": active_queues,
    }
    print(all_stats)


def main():
    # Poll the cluster every 10 seconds, forever.
    while True:
        print_stats()
        sleep(10)


if __name__ == '__main__':
    main()
```
Expected Behavior
No memory leaks
Actual Behavior
The memory consumption of the tiny example constantly grows. I left the script running overnight and it always gets killed by Linux after trying to allocate more memory than the system has.
I have also tested the script with both CPython 3.6 and PyPy 7.3.0 (3.6), and it leaks memory in both cases.
Thanks for reporting.
I have the same issue.
Do you propose any alternatives for a health check?
What I do now is not run it in an infinite loop. Instead, I wrapped the Python code in a tiny Bash script that runs the Python process every N seconds...
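For reference, a minimal sketch of that workaround (written here in Python rather than Bash); `print_stats.py` and the 60-second interval are placeholders, not names from this thread:

```python
# Minimal sketch: run the stats script in a short-lived child process so
# that any memory it leaks is released when the child exits.
# "print_stats.py" and the 60-second interval are placeholders.
import subprocess
import sys
from time import sleep

while True:
    subprocess.run([sys.executable, "print_stats.py"], check=False)
    sleep(60)
```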
Just making sure you're not running the script in DEBUG mode or similar (if Django)?
I'm trying to use Celery to run a PyTorch model on a GPU. I found that after the task completes, Celery does not release the resources occupied by the task (such as GPU memory and RAM). Therefore, I tried to use the CELERYD_MAX_TASKS_PER_CHILD setting to kill the old worker and create a new one to release the resources. However, I got the error messages 'cuda runtime error: Initialization error' and 'Process ForkPoolWorker exited with exitcode 1' when the maximum number of task executions is reached, and Celery still didn't release the memory. Can this be called a memory leak? It may be similar to your question. celery 4.4.0, redis 3.4.1, python 3.6.6
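(For reference, a minimal sketch of how that setting is spelled; `CELERYD_MAX_TASKS_PER_CHILD` is the old-style name of `worker_max_tasks_per_child`, and the value, broker URL, and app name below are just examples:)

```python
# Minimal sketch: recycle each pool process after a fixed number of tasks
# so its memory is returned to the OS. The value 1, the broker URL and the
# "proj" app name are examples only.
from celery import Celery

app = Celery("proj", broker="redis://localhost:6379/0")
app.conf.worker_max_tasks_per_child = 1  # new-style name of CELERYD_MAX_TASKS_PER_CHILD
```

The same option can be passed on the command line as `celery -A proj worker --max-tasks-per-child=1`.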
No debug mode, no Django. Just plain Celery.
> I'm trying to use Celery to run a PyTorch model on a GPU. I found that after the task completes, Celery does not release the resources occupied by the task (such as GPU memory and RAM). Therefore, I tried to use the CELERYD_MAX_TASKS_PER_CHILD setting to kill the old worker and create a new one to release the resources. However, I got the error messages 'cuda runtime error: Initialization error' and 'Process ForkPoolWorker exited with exitcode 1' when the maximum number of task executions is reached, and Celery still didn't release the memory. Can this be called a memory leak? It may be similar to your question. celery 4.4.0, redis 3.4.1, python 3.6.6
You should try Celery 4.4.2+.
> No debug mode, no Django. Just plain Celery.
Can you try some tool to find out the root cause of the memory leak?
> I'm trying to use Celery to run a PyTorch model on a GPU. I found that after the task completes, Celery does not release the resources occupied by the task (such as GPU memory and RAM). Therefore, I tried to use the CELERYD_MAX_TASKS_PER_CHILD setting to kill the old worker and create a new one to release the resources. However, I got the error messages 'cuda runtime error: Initialization error' and 'Process ForkPoolWorker exited with exitcode 1' when the maximum number of task executions is reached, and Celery still didn't release the memory. Can this be called a memory leak? It may be similar to your question. celery 4.4.0, redis 3.4.1, python 3.6.6

> You should try Celery 4.4.2+.
I tried Celery 4.4.2; sadly, it doesn't work. I have a question: why doesn't Celery automatically release task resources after the task has completed? Is there a bug in the code? I'm not familiar with Celery.
> Can you try some tool to find out the root cause of the memory leak?
That is one of the first things I tried. I used a couple of memory profilers. The problem is that they just tell you the amount of memory allocated per type. In this particular case the leak is in some sort of list, but it is very hard to find which list...
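For what it is worth, the standard-library `tracemalloc` module can attribute growth to individual source lines rather than to object types. A minimal sketch, assuming the same `myapp.app` application as in the test case above (the 10-second interval and the top-10 limit are arbitrary choices):

```python
# Minimal sketch: compare tracemalloc snapshots between inspect() rounds to
# see which source lines keep accumulating memory. Assumes the same
# myapp.app application as in the minimal test case above.
import tracemalloc
from time import sleep

from myapp.app import app

tracemalloc.start(25)  # keep up to 25 frames per allocation
previous = tracemalloc.take_snapshot()

while True:
    insp = app.control.inspect()
    insp.active()
    insp.stats()
    insp.active_queues()
    sleep(10)
    current = tracemalloc.take_snapshot()
    # Group the growth by source line instead of by object type.
    for stat in current.compare_to(previous, "lineno")[:10]:
        print(stat)
    previous = current
```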
I was unable to reproduce this issue with this example using redis-py and this example using py-amqp. I'm using what's currently on the celery master branch.
Do I need to be actively running tasks to make the leak happen?
I let the example in my previous comment run for a few hours, and I do see a minor leak now:
celery-stats_1 | Top 10 lines
celery-stats_1 | #1: <frozen importlib._bootstrap_external>:525: 1702.8 KiB
celery-stats_1 | #2: /celery_app/kombu/kombu/pidbox.py:234: 1254.8 KiB
celery-stats_1 | f'{oid}.{self.reply_exchange.name}',
celery-stats_1 | #3: /usr/local/lib/python3.7/linecache.py:137: 1162.9 KiB
celery-stats_1 | lines = fp.readlines()
celery-stats_1 | #4: /usr/local/lib/python3.7/uuid.py:269: 1015.8 KiB
celery-stats_1 | hex[:8], hex[8:12], hex[12:16], hex[16:20], hex[20:])
celery-stats_1 | #5: /usr/local/lib/python3.7/site-packages/pympler/summary.py:132: 524.2 KiB
celery-stats_1 | rows.append([otype, count[otype], total_size[otype]])
celery-stats_1 | #6: /usr/local/lib/python3.7/site-packages/cached_property.py:74: 330.9 KiB
celery-stats_1 | return obj_dict.setdefault(name, self.func(obj))
celery-stats_1 | #7: /celery_app/redis-py/redis/connection.py:1294: 318.7 KiB
celery-stats_1 | self._in_use_connections.add(connection)
celery-stats_1 | #8: /celery_app/redis-py/redis/commands/core.py:2362: 318.7 KiB
celery-stats_1 | return self.execute_command("SADD", name, *values)
celery-stats_1 | #9: /usr/local/lib/python3.7/site-packages/pympler/summary.py:88: 305.3 KiB
celery-stats_1 | lambda f: "function (%s)" % f.__name__,
celery-stats_1 | #10: /celery_app/kombu/kombu/transport/virtual/base.py:558: 96.2 KiB
celery-stats_1 | table.append(meta)
celery-stats_1 | 3109 other: 1796.5 KiB
celery-stats_1 | Total allocated size: 8826.7 KiB
It seems to be caused by this line: https://github.com/celery/kombu/blob/507b3064004133d14b974d387200750f21309323/kombu/transport/virtual/base.py#L558
Any suggested fixes?
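One rough way to confirm that it is this binding table that keeps growing is to watch the size of the corresponding kombu binding set in Redis while the stats script runs. The key name below (`_kombu.binding.reply.celery.pidbox`) is an assumption based on kombu's default `_kombu.binding.<exchange>` naming for the pidbox reply exchange and may differ in your setup:

```python
# Rough check (assumption): watch the kombu binding set for the pidbox reply
# exchange grow while the inspect() loop runs. Adjust the key name and the
# connection details to match your broker.
from time import sleep

import redis

r = redis.Redis(host="localhost", port=6379, db=0)

while True:
    size = r.scard("_kombu.binding.reply.celery.pidbox")
    print(f"reply.celery.pidbox bindings: {size}")
    sleep(10)
```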
It seems like this issue is fixed in 5.2.x ... I no longer observe the problem with 5.2.7.