django-celery-monitor
Save args, kwargs with JSON encoding
Currently, calling a task with positional and keyword arguments results in something like this:
>>> task.args
[True]
>>> task.kwargs
{'arg1': 'some_string', 'arg2': False}
This is a bit of a pain when parsing this information, in particular when sending it to a JavaScript frontend, since it's almost JSON but not quite.
Could there be a setting to enable saving this information as properly encoded JSON? e.g.
>>> task.args
[true]
>>> task.kwargs
{"arg1": "some_string", "arg2": false}
I just came across this, and I think args, kwargs and result should all be saved as JSON-encoded strings. I accept that some data cannot be properly handled; for those cases, just stringify anything that causes json.JSONEncoder.default() to raise a TypeError.
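A minimal sketch of that "encode as JSON, stringify whatever the encoder rejects" idea, using only the standard library. The helper name `to_json_text` is hypothetical and not part of django-celery-monitor; passing `default=str` to `json.dumps` is a stand-in for overriding `json.JSONEncoder.default()`.

```python
import datetime
import json


def to_json_text(value):
    """Encode value as JSON text, stringifying anything JSON cannot represent.

    Hypothetical helper for illustration only; not part of django-celery-monitor.
    """
    try:
        # default=str is called for every object the encoder cannot handle
        # (i.e. where JSONEncoder.default() would raise TypeError).
        return json.dumps(value, default=str)
    except (TypeError, ValueError):
        # default= is not consulted for unsupported dict keys (TypeError), and
        # circular references raise ValueError; fall back to a JSON string
        # holding the repr so the call never fails.
        return json.dumps(repr(value))


print(to_json_text([True]))
print(to_json_text({'arg1': 'some_string', 'arg2': False}))
print(to_json_text({'when': datetime.date(2018, 10, 7)}))
```

The fallback branch trades fidelity for safety: the stored value is still valid JSON, but it is a plain string rather than a structured object.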
I'd be happy to submit a PR for this if deemed acceptable.
@ShaheedHaque That sounds fantastic to me if you were happy to do that. I'm not a project maintainer though, so it would be good to hear from @jezdez whether this is something he would merge. It might be good to add a flag so that anyone's existing parsing of these values is not broken.
Indeed (and actually, I guess there is a conversation to be had about results, since AFAIK, that is allowed to be a non-JSON value, e.g. a bare int).
Task arguments can be any valid Python type and will only be serialized with the configured task serializer when sent between Celery clients and workers.
The goal of this package is to use the task and worker event state to conduct monitoring, which in turn provides its values verbatim without serialization -- by design. So in other words we're erring on the side of correctness instead of convenience. There is also an additional operational risk of converting the arguments to JSON during storing that could lead to monitoring race conditions if for example the conversion to JSON fails and prevents updating the task state in the database.
There are a few options to get what you want nevertheless (with the caveat that you'd be on your own):
- subclass django_celery_monitor.camera.Camera, override the update_task method (calling the parent update_task method first to continue the usual functionality) and store the arguments in JSON (or whatever form is convenient for you) in a separate data store (e.g. a separate data model)
- we add a Django signal to this package (e.g. celery_task_monitored) so you can do option 1 without subclassing; the rest stays the same
- post-process task state updates using Django's post_save signal, convert the arguments to the format you require, and store them in a separate table
Thanks for the quick response. Is there a way to know, for a given event, what serializer was used? I don't see a content_type field in the model, for example?
@jezdez I just added this debug into camera.py:
@@ -85,6 +85,7 @@
(task.worker.hostname, task.worker),
)
+ logger.warning('type(task.kwargs)={}: {}'.format(type(task.kwargs), task.kwargs))
defaults = {
'name': task.name,
'args': task.args,
And the resulting debug indicates that kwargs has, IIUC, already been coerced into a string even before being written to the TextField in the database:
2018-02-14 20:05:12,823 [WARNING] django_celery_monitor.camera: type(task.kwargs)=<class 'str'>: {'client': 8, 'company': 3, 'frequency': 'w1', 'next_T': '2018-10-07'}
2018-02-14 20:05:14,850 [WARNING] django_celery_monitor.camera: type(task.kwargs)=<class 'NoneType'>: None
2018-02-14 20:05:14,854 [WARNING] django_celery_monitor.camera: type(task.kwargs)=<class 'str'>: {'client': 8, 'company': 3, 'frequency': 'w1', 'next_T': '2018-10-07'}
2018-02-14 20:05:14,881 [WARNING] django_celery_monitor.camera: type(task.kwargs)=<class 'NoneType'>: None
Given that kwargs definitely started life as a dict, and given what you confirmed about the intent being to use a lossless on-the-wire format, this suggests that some unexpected string coercion is going on, right?
Also, given that the value is being stored in a database TextField, are we certain that the stored value would not be reduced to a string by virtue of being stored like this?
I've come across this before, I think. The camera receives args and kwargs already coerced into strings, so your options are to parse the JSON-ish Python string repr into actual JSON (what I'm doing at the moment), or to change Celery, presumably quite fundamentally, somewhere else so that the values arrive in the camera as JSON-encoded strings to begin with.
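For anyone landing here, the "parse the repr into actual JSON" workaround can be sketched roughly as follows. It assumes the stored string is a Python literal such as the kwargs shown in the log output above. The function name `repr_to_json` is hypothetical, and strings that are not valid literals (for example, ones containing object reprs) are simply kept as JSON strings.

```python
import ast
import json


def repr_to_json(text):
    """Convert a Python-repr string of args/kwargs into JSON text.

    Hypothetical helper for illustration; not part of django-celery-monitor.
    """
    if text is None:
        # The log output above shows kwargs can also arrive as None.
        return None
    try:
        # literal_eval only accepts Python literals, so it does not
        # execute arbitrary code the way eval() would.
        return json.dumps(ast.literal_eval(text), default=str)
    except (ValueError, SyntaxError, TypeError):
        # Not a parseable literal, or not encodable as JSON:
        # keep the original text as a JSON string.
        return json.dumps(text)


stored = "{'client': 8, 'company': 3, 'frequency': 'w1', 'next_T': '2018-10-07'}"
print(repr_to_json(stored))
```

Using `ast.literal_eval` rather than string replacement of quotes avoids mangling values that themselves contain quote characters or the words True, False and None.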
Yes, that's what I've concluded/done too. Maybe close the issue?