
Update Task Status to "Errored" on Celery Timeout Exceedance

Open vineetshar opened this issue 1 year ago • 3 comments

Description: Currently, when a parsing task exceeds the configured time limit, Celery terminates the task but its status is never updated to "Errored." As a result, timed-out tasks appear incomplete or stuck with no indication of the timeout failure, which hinders error tracking and debugging.

Proposed Solution:

  • Implement logic to detect when a task is killed due to exceeding the time limit and update its status to "Errored."
  • Ensure that all tasks handle timeouts gracefully, providing clear feedback on task status for easier monitoring and troubleshooting.

Tasks:

  • [ ] Add error handling in the task code to catch Celery timeouts and update the task status to "Errored."
  • [ ] Implement logging for timeout events to aid in debugging and tracking task failures.
  • [ ] Test to verify that tasks exceeding the time limit are properly marked as "Errored" in the system.

Acceptance Criteria:

  • Tasks that exceed the Celery time limit are automatically marked as "Errored."
  • Logging is in place to capture details of timeout occurrences.

vineetshar, Oct 30 '24 13:10

Can I work on this?

Pratik-Sharma-Dev, Nov 07 '24 22:11

@Pratik-Sharma-Dev Sure. Have you gone through the codebase? What's your plan of action?

vineetshar, Nov 08 '24 05:11

@vineetshar @dhirenmathur, is @Pratik-Sharma-Dev or anyone else working on this? If not, I would like to take it.

I have gone through part of the codebase and the proposed solution. Here are the key points we need to work on:

  1. Review the task timeout configuration for parsing_task.py (time_limit and soft_time_limit, Celery's mechanism for limiting task execution time in a controlled way: the soft limit raises an exception inside the task so it can clean up and terminate gracefully, while the hard limit kills the process abruptly) and identify where timeout events can be intercepted.

  2. Modify parsing_task.py to catch the SoftTimeLimitExceeded exception and ensure the task status is updated to "Errored" in the database.

  3. Add logging for timeout events to capture details like task ID, task name, and failure time for easier debugging.
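
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: `TASK_STATUS`, `set_status`, `run_with_timeout_handling`, and the "Running"/"Completed" status names are hypothetical placeholders (only "Errored" comes from this issue), and the in-memory dict stands in for the real database update. The one Celery piece used is `celery.exceptions.SoftTimeLimitExceeded`, with a stand-in class so the sketch still runs where Celery is not installed.

```python
import logging
from datetime import datetime, timezone

try:
    from celery.exceptions import SoftTimeLimitExceeded
except ImportError:
    # Stand-in so this sketch runs without Celery installed.
    class SoftTimeLimitExceeded(Exception):
        """Placeholder for celery.exceptions.SoftTimeLimitExceeded."""

logger = logging.getLogger("parsing_task")

# Hypothetical in-memory status store; the real task would update the database.
TASK_STATUS = {}


def set_status(task_id, status):
    """Record the latest status for a task (assumed helper, not project code)."""
    TASK_STATUS[task_id] = status


def run_with_timeout_handling(task_id, task_name, work):
    """Run work(); on a soft timeout, log it, mark the task "Errored", re-raise."""
    set_status(task_id, "Running")
    try:
        result = work()
    except SoftTimeLimitExceeded:
        # Capture task ID, task name, and failure time for debugging.
        logger.error(
            "Task timed out: id=%s name=%s failed_at=%s",
            task_id,
            task_name,
            datetime.now(timezone.utc).isoformat(),
        )
        set_status(task_id, "Errored")
        raise  # let Celery still see the task as failed
    set_status(task_id, "Completed")
    return result
```

Inside a real Celery task this would presumably be called from a function registered with `@app.task(bind=True, soft_time_limit=..., time_limit=...)`, passing `self.request.id` and `self.name`. The soft limit needs to be lower than the hard limit, since `SoftTimeLimitExceeded` is only raised for the soft limit; once the hard limit is hit the worker process is killed and nothing inside the task can catch it.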

shreyashkr17, Dec 16 '24 05:12