celery-progress
Persistent job tracking
Is there a recommended way of tracking long-running jobs, for example by keeping them in a table that can be queried afterwards?
In my case I'm dealing with jobs that might take 1-2 hours, and if the user leaves the page during that time I no longer have access to the job id returned by the view. I was planning on creating a table to keep track of jobs.
Any suggestions on this? Has anyone dealt with a similar situation? I would be willing to contribute if this is something that might interest other people.
Interesting use case! I haven't had to solve this on any of my own projects, but I certainly can see the value of it. Off the top of my head it's unclear to me whether something like that would belong in this lib or as a standalone thing, though happy to review any suggestions you have. (also keen to hear if any other maintainers have thoughts).
How would you be identifying/looking up the jobs? With the User who created them? I could definitely imagine a small model that just associated User objects with the task_id and maybe some extra metadata that you could then use to show a list of jobs.
If you are using Django, someone has created an extension to link Celery tasks to Model instances and monitor their status: https://github.com/mback2k/django-celery-model
Using it with a User-based model sounds like a great idea.
I also considered giving the task a deterministic custom id that you can re-generate when needed. But I think keeping track of ids is a better solution.
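For reference, a minimal sketch of the deterministic id idea, assuming the id is derived from a user id and a job name. The namespace string and the `deterministic_task_id` helper are hypothetical names chosen for illustration, not part of celery-progress.

```python
import uuid

# Hypothetical fixed namespace; any UUID works as long as it never changes.
TASK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "tasks.example.com")


def deterministic_task_id(user_id: int, job_name: str) -> str:
    """Return the same task id every time for the same (user, job) pair."""
    return str(uuid.uuid5(TASK_NAMESPACE, f"{user_id}:{job_name}"))


report_id = deterministic_task_id(42, "monthly-report")
same_id = deterministic_task_id(42, "monthly-report")   # regenerated later
other_id = deterministic_task_id(43, "monthly-report")  # different user
```

The resulting string could be passed as `task_id=` to Celery's `apply_async`. One drawback is that starting the same job twice for the same user reuses the id, which is one reason storing the ids may be the safer option.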
@czue that is exactly the initial idea I had: a simple model that links task_id with User, and potentially a status, so it can also be used to track the history of tasks for each user. This would also make it possible to query an endpoint for a user's unfinished, non-failed tasks.
I also agree that I don't know whether it belongs here or somewhere else. I love the simplicity of this project, but I also feel that such a simple model would not add much complexity to it and could be elegant :)
I'll probably take a look at how to deal with it during the weekend. In the meantime I'm happy to hear comments from other maintainers.
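A rough sketch of that idea, using the standard library's `sqlite3` as a stand-in for a Django model so it runs on its own. The table name, column names, and helper functions are all hypothetical; the status values follow Celery's built-in state names.

```python
import sqlite3

# In-memory database as a stand-in for the project's real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE user_task (
        task_id TEXT PRIMARY KEY,              -- id returned when the task is queued
        user_id INTEGER NOT NULL,              -- user who started the job
        status  TEXT NOT NULL DEFAULT 'PENDING'
    )
    """
)


def record_task(user_id, task_id):
    """Remember which user started which task."""
    conn.execute(
        "INSERT INTO user_task (task_id, user_id) VALUES (?, ?)",
        (task_id, user_id),
    )


def set_status(task_id, status):
    """Update the stored status, e.g. from the task itself or a signal handler."""
    conn.execute(
        "UPDATE user_task SET status = ? WHERE task_id = ?", (status, task_id)
    )


def active_tasks(user_id):
    """Return ids of the user's tasks that have neither finished nor failed."""
    rows = conn.execute(
        "SELECT task_id FROM user_task "
        "WHERE user_id = ? AND status NOT IN ('SUCCESS', 'FAILURE', 'REVOKED') "
        "ORDER BY task_id",
        (user_id,),
    )
    return [row[0] for row in rows]


record_task(user_id=1, task_id="task-a")
record_task(user_id=1, task_id="task-b")
record_task(user_id=2, task_id="task-c")
set_status("task-a", "SUCCESS")
pending_for_user_1 = active_tasks(1)
```

A page that the user returns to could call something like `active_tasks` to find the ids it should resume polling for.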
@eeintech I'll take a look at the project you mentioned later tonight to see if it fits what I'm after. Yes, I'm using celery-progress with Django.
@czue sorry for the late reply, it's been a couple of busy weeks.
My plan was to create a simple table and a new recorder that is capable of writing to it. The couple of links I'm providing here are temporary commits (I had to move computers and needed the code), so the code is far from clean or done, but you get the idea :)
The backend adds a new recorder class inheriting from ProgressRecorder. In the same way, a simple table adds a relation between the user and the task id, and potentially more info to differentiate between tasks in the project. Finally, I'm thinking of a small view function that returns the tasks for a given user (I have some placeholder code but haven't used it).
I need to clean it up since it's more of a concept at this point, but let me know if you think something like this could be interesting!
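To make the recorder idea concrete, here is a minimal sketch, not the code from the linked commits. It assumes the library's `ProgressRecorder` takes a task and exposes `set_progress(current, total, description)`; a stand-in base class with that shape is used so the example runs without Celery. `PersistentProgressRecorder`, the `store` dict, and the fake task are hypothetical.

```python
from types import SimpleNamespace


class StandInProgressRecorder:
    """Stand-in for the library's ProgressRecorder (assumed interface)."""

    def __init__(self, task):
        self.task = task

    def set_progress(self, current, total, description=""):
        # The real class would report progress through the Celery result backend.
        pass


class PersistentProgressRecorder(StandInProgressRecorder):
    """Also writes each progress update to a store keyed by task id."""

    def __init__(self, task, user_id, store):
        super().__init__(task)
        self.user_id = user_id
        self.store = store  # a dict here; a database table in a real project

    def set_progress(self, current, total, description=""):
        super().set_progress(current, total, description)
        self.store[self.task.request.id] = {
            "user_id": self.user_id,
            "current": current,
            "total": total,
            "description": description,
        }


def tasks_for_user(store, user_id):
    """What the small view function could return for a given user."""
    return sorted(tid for tid, row in store.items() if row["user_id"] == user_id)


# Fake bound task exposing request.id, as a bound Celery task does.
fake_task = SimpleNamespace(request=SimpleNamespace(id="task-123"))
store = {}
recorder = PersistentProgressRecorder(fake_task, user_id=7, store=store)
recorder.set_progress(30, 100, description="crunching")
```

Keeping the persistence in a subclass leaves the existing recorder untouched, so projects that do not need a table would be unaffected.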
Note that the task is not owned by the user in all projects. It could be owned by a different model.
I think persistent job tracking should be done by another library or implemented by each project that needs it.
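To illustrate the point that the owner is not always a user, a record could store an owner type alongside the owner id instead of a direct reference to User. This is only a sketch with hypothetical names; in Django the usual tool for this kind of relation is the contenttypes framework's `GenericForeignKey`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRecord:
    task_id: str
    owner_type: str  # hypothetical labels such as "user" or "project"
    owner_id: int


def tasks_for(records, owner_type, owner_id):
    """Return task ids owned by the given (type, id) pair."""
    return [
        r.task_id
        for r in records
        if (r.owner_type, r.owner_id) == (owner_type, owner_id)
    ]


records = [
    TaskRecord("task-a", "user", 1),
    TaskRecord("task-b", "project", 1),
    TaskRecord("task-c", "user", 1),
]
user_tasks = tasks_for(records, "user", 1)
project_tasks = tasks_for(records, "project", 1)
```

How the owner is modelled differs between projects, which supports the argument for leaving this to each project or to a separate library.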