
Improve Task progress tracking

Open wajihyassine opened this issue 2 years ago • 1 comments

With our move to using only Celery, we gain more functionality for sending Task updates while the Worker is executing.

References:

  • https://docs.celeryq.dev/en/stable/userguide/calling.html#on-message
  • https://docs.celeryq.dev/en/stable/userguide/tasks.html#custom-states

Using custom states and the update_state method, come up with a better way to track the progress of a running Task. This may have to differ for each Task/Job depending on what it checks for and how long it runs, but some ideas are:

  • Can have the progress metric be based on output file size for larger tasks such as Plaso
  • Can have a default progress metric for Tasks that should be quick in nature (checking existence of a file)
  • Can take the external program's latest stdout and provide it in the status update.
  • Can take the latest modified timestamp of the output file
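
A rough sketch of how the file size, stdout, and timestamp ideas could be combined into one status update. The state name `PROGRESS`, the helper names, and the `expected_size` estimate are all assumptions for illustration, not existing Turbinia code. The only Celery call relied on is `update_state(state=..., meta=...)` on a bound task, so `task` here would be the `self` of a task declared with `bind=True`:

```python
import os

# Assumed custom state name; Celery accepts arbitrary strings as states.
PROGRESS_STATE = "PROGRESS"


def build_progress_meta(output_path, expected_size=None, last_stdout_line=None):
    """Collect progress hints for a running Task from its output file."""
    meta = {
        "bytes_written": 0,
        "last_modified": None,
        "percent": None,  # Stays None when no size estimate is available.
        "last_stdout_line": last_stdout_line,
    }
    if os.path.exists(output_path):
        stat = os.stat(output_path)
        meta["bytes_written"] = stat.st_size
        meta["last_modified"] = stat.st_mtime
    if expected_size:
        # expected_size is a hypothetical per-Task estimate, so cap at 100.
        percent = 100.0 * meta["bytes_written"] / expected_size
        meta["percent"] = min(100.0, percent)
    return meta


def report_progress(task, output_path, expected_size=None, last_stdout_line=None):
    """Publish the collected hints through the bound task's update_state()."""
    meta = build_progress_meta(output_path, expected_size, last_stdout_line)
    task.update_state(state=PROGRESS_STATE, meta=meta)
    return meta
```

Quick Tasks could skip the file checks and just report a fixed default. On the client side, these updates could be received through the `on_message` callback described in the first reference above.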

wajihyassine avatar Jan 14 '23 19:01 wajihyassine

Another related idea we have talked about is allowing the Task to write something like a status.txt file alongside its output and, if that file exists, using its contents as the current status. Plaso has implemented a short output that we can use for this.
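
For illustration, reading such a file could look roughly like the sketch below. The `status.txt` name comes from the idea above; the `read_task_status` helper, the `output_dir` argument, and the choice to use the last non-empty line as the status are assumptions:

```python
import os

STATUS_FILENAME = "status.txt"


def read_task_status(output_dir, default=None):
    """Return the last non-empty line of status.txt, or default if unavailable."""
    path = os.path.join(output_dir, STATUS_FILENAME)
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            lines = [line.strip() for line in handle if line.strip()]
    except FileNotFoundError:
        # The Task has not written a status file (yet).
        return default
    return lines[-1] if lines else default
```

The returned string could then be passed along as part of the status update, falling back to a default metric when the file is absent.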

aarontp avatar Feb 11 '23 00:02 aarontp