pybluemonday
Error when used with Celery
When this module is used inside a Celery worker (even with concurrency=1), it freezes after roughly 100-350 iterations per task.
To reproduce the issue:
- Create a virtual environment (python -m venv ./venv && source ./venv/bin/activate) with celery and pybluemonday installed (pip install celery pybluemonday; pybluemonday can also be built from source for this purpose via pip install .);
- Run a Celery worker (celery -A app worker --loglevel=INFO --concurrency=1) with the application app.py;
- Run the application (python app.py);
- After about 350 iterations (the number varies from project to project) the worker hangs, and the only way to stop it is kill -SIGKILL.
strace shows that the hung process is blocked in a futex wait:
strace -p 48590
futex(0xc000056148, FUTEX_WAIT_PRIVATE, 0, NULL
app.py:
from celery import Celery
from pybluemonday import UGCPolicy, Policy
from typing import Optional, List

app = Celery('app', broker='redis://localhost')


class Sanitizer:
    def __init__(self, app=None, policy: Policy=UGCPolicy(), allowed_attrs: Optional[List[str]]=["class", "style"]):
        self.policy = policy
        if allowed_attrs:
            self.policy.AllowAttrs(*allowed_attrs).Globally()
        if app:
            self.init_app(app)

    def init_app(self, app):
        pass

    def sanitize(self, text: Optional[str]) -> str:
        if text is None or text == "":
            return text
        return self.policy.sanitize(text)


with open('../sanitized_text.txt', 'r') as f:
    strings = f.read().split('\n')

sanitizer = Sanitizer()


@app.task
def task_to_check_pybluemonday(count: int):
    global strings
    for i in range(count):
        print(f'iteration: {i}, string: ')
        print(strings[i % len(strings)])
        sanitizer.sanitize(strings[i % len(strings)])
        print('sanitizing end')


if __name__ == '__main__':
    task_to_check_pybluemonday.delay(10000)
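One possible explanation, not confirmed in this report, is that the policy is created at import time in the parent worker process, and the runtime embedded in the extension does not survive the fork that Celery's default prefork pool performs. Under that assumption, a workaround to try is to build the policy lazily in whichever process actually uses it. The sketch below is a minimal illustration: PerProcess and build_sanitizer are hypothetical helper names, and only the UGCPolicy, AllowAttrs(...).Globally() and sanitize calls are taken from app.py above.

```python
import os
from typing import Any, Callable, Optional


class PerProcess:
    """Create a value on first use and recreate it if the PID changes."""

    def __init__(self, factory: Callable[[], Any]):
        self._factory = factory
        self._pid: Optional[int] = None
        self._value: Any = None

    def get(self) -> Any:
        pid = os.getpid()
        if self._pid != pid:
            # First call in this process (for example, after a fork):
            # build a fresh instance instead of reusing the parent's.
            self._value = self._factory()
            self._pid = pid
        return self._value


def build_sanitizer():
    # Hypothetical factory. The import is deferred so that the extension
    # is first loaded in the process that calls it, not at module import.
    from pybluemonday import UGCPolicy

    policy = UGCPolicy()
    policy.AllowAttrs("class", "style").Globally()
    return policy


# Nothing is constructed here; construction happens on the first get().
sanitizer = PerProcess(build_sanitizer)
```

Inside the task, the call would then become sanitizer.get().sanitize(text). If the hang persists with this change, the fork hypothesis is probably wrong and the cause lies elsewhere.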