scrapyd
Delete job from finished jobs
I maintain a separate table of all finished tasks in my Django project, so I would like to be able to delete entries from the scrapyd finished jobs list. Then I can move a task to my own table and delete it from here. Is there a simpler solution to this? If not, I have already added this functionality and will open a PR soon.
Not 100% sure I understand your use case. Take a look at `finished_to_keep` and `jobstorage`, which you could try by using the master branch.
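For readers unfamiliar with the two options mentioned above, here is a minimal sketch of what they look like in a Scrapyd config file, read back with Python's `configparser`. The sample values and the helper name `read_job_settings` are illustrative assumptions, not taken from this thread; the fallbacks reflect the defaults as documented by Scrapyd, to the best of my knowledge.

```python
import configparser

# Hypothetical scrapyd.conf fragment; the values are illustrative only.
SAMPLE_CONF = """
[scrapyd]
finished_to_keep = 20
jobstorage = scrapyd.jobstorage.SqliteJobStorage
"""


def read_job_settings(text):
    """Return (finished_to_keep, jobstorage) from a scrapyd-style config string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["scrapyd"]
    # Fall back to the documented defaults when an option is missing.
    keep = section.getint("finished_to_keep", fallback=100)
    storage = section.get(
        "jobstorage", fallback="scrapyd.jobstorage.MemoryJobStorage"
    )
    return keep, storage


keep, storage = read_job_settings(SAMPLE_CONF)
print(keep, storage)
```

Note that these options only limit how many finished jobs are kept and where they are stored; they do not remove one specific job, which is what this issue asks for.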
@mxdev88 I have built a small internal tool in my company to track scraping tasks that ran on the server. But some tasks fail and we do not want to keep those tasks in the list of the finished tasks. So, I have written an API (with the web interface too) to delete a task from the finished list using the task id. Hope it's more clear now.
I suppose an API endpoint `deljob.json` could be added, similar to `delversion`, for this purpose.
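To make the comparison concrete, here is a rough sketch of how a client might build a request for the existing `delversion.json` endpoint next to the proposed `deljob.json`. The requests are only constructed, not sent. The `deljob.json` endpoint is hypothetical (it is what this thread proposes), and its `job` parameter name is an assumption borrowed from `cancel.json`; the base URL, project name, version, and job id are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:6800"  # assumed default Scrapyd address


def build_post(endpoint, **params):
    """Build (but do not send) a form-encoded POST request to a Scrapyd endpoint."""
    data = urlencode(params).encode("utf-8")
    return Request(f"{BASE_URL}/{endpoint}", data=data, method="POST")


# Existing endpoint: delete one version of a project.
delversion = build_post("delversion.json", project="myproject", version="r99")

# Hypothetical endpoint proposed in this thread. The "job" parameter name is
# an assumption modelled on cancel.json; "abc123" is a placeholder job id.
deljob = build_post("deljob.json", project="myproject", job="abc123")

for req in (delversion, deljob):
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```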
Personally, I think it could be useful. I would let the project maintainers comment on the idea. Feel free to submit a PR.
This seems reasonable!
Coooll.. Will try to submit the PR this weekend. :D
Has anyone created the PR?
I wonder how this feature would be expected to behave if it is called for pending or running jobs. Should calling `deljob` behave like `cancel` for pending and running jobs, or fail with an error message?
It seems there would be some overlap between the two features. Maybe `deljob` would supersede `cancel`, and `cancel` would eventually be deprecated. Any thoughts?
What state does a canceled job end up in?
I think there’s a semantic difference between canceling a running job and deleting a finished job. Only one of the two involves interrupting a process. (Similar to stopping vs removing a container.)
I would keep the APIs separate.
Oh, I totally forgot about this PR. 🙇🏻‍♂️ I will try to finish it. I had written the code, but I have since changed laptops, so I will mostly have to write it again.
But I agree with @jpmckinney that it should be kept separate. Trying to delete a running process should return an error telling the caller to cancel the process or let it finish first. Thanks for making this issue active again 🙇🏻‍♂️
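The behaviour described above could look roughly like the following. This is an in-memory sketch only, not Scrapyd code: the `jobs` dict, the state strings, and the names `delete_job` and `JobNotFinished` are all hypothetical.

```python
class JobNotFinished(Exception):
    """Raised when deletion is requested for a pending or running job."""


def delete_job(jobs, job_id):
    """Delete a finished job from `jobs`, a dict mapping job id to state.

    Unknown ids raise KeyError; pending or running jobs raise JobNotFinished
    instead of being cancelled implicitly.
    """
    state = jobs.get(job_id)
    if state is None:
        raise KeyError(job_id)
    if state != "finished":
        raise JobNotFinished(
            f"job {job_id} is {state}; cancel it or let it finish first"
        )
    del jobs[job_id]


jobs = {"a1": "finished", "b2": "running"}
delete_job(jobs, "a1")  # finished job: removed from the list
try:
    delete_job(jobs, "b2")  # running job: rejected with an error
except JobNotFinished as exc:
    print(exc)
```

Keeping the rejection explicit means deleting never interrupts a process, which matches the stop-versus-remove distinction made earlier in the thread.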
> What state does a canceled job end up in?
Looking at the code, the job is removed from the queue if it is pending, or killed if it is running. So there is no end state: the job simply disappears.
> I think there’s a semantic difference between canceling a running job and deleting a finished job. Only one of the two involves interrupting a process. (Similar to stopping vs removing a container.)
Yep, fully agree on the semantic difference. I was just wondering because, in the end, `cancel` removes the job as if it never existed, which is a sort of deletion.
> I would keep the APIs separate.
ok :)
Aha - presently the state change is the same, but I can imagine in the future that cancelling a running job puts it in the end-state "interrupted" rather than deleting it. (If we were to implement this in the future, then deljob could delete either interrupted or finished jobs.)
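As a sketch of that possible future design (not how Scrapyd behaves according to this thread), the states and the two operations might be modelled as below. The `State` enum, the `INTERRUPTED` value, and the functions `cancel` and `deljob` are hypothetical names; treating a pending job as simply removed from the queue is an assumption carried over from the current behaviour described above.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    FINISHED = "finished"
    INTERRUPTED = "interrupted"  # hypothetical end state for cancelled runs


# Only end states may be deleted in this sketch.
DELETABLE = {State.FINISHED, State.INTERRUPTED}


def cancel(jobs, job_id):
    """Cancel a job: pending jobs leave the queue, running jobs become interrupted."""
    state = jobs[job_id]
    if state is State.PENDING:
        del jobs[job_id]  # assumed: a pending job still just disappears
    elif state is State.RUNNING:
        jobs[job_id] = State.INTERRUPTED  # imagined future behaviour
    else:
        raise ValueError(f"job {job_id} has already ended ({state.value})")


def deljob(jobs, job_id):
    """Delete a job that has reached an end state (finished or interrupted)."""
    state = jobs[job_id]
    if state not in DELETABLE:
        raise ValueError(f"job {job_id} is {state.value}; cancel it first")
    del jobs[job_id]


jobs = {"a": State.RUNNING, "b": State.FINISHED, "c": State.PENDING}
cancel(jobs, "a")  # running -> interrupted, still listed
cancel(jobs, "c")  # pending -> removed from the queue
deljob(jobs, "a")  # interrupted jobs can now be deleted
deljob(jobs, "b")  # finished jobs can be deleted as before
print(jobs)
```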
Makes sense!