Queue Monitoring

Open ninosamson opened this issue 1 year ago • 0 comments

User Story

As a product team, we need to monitor the queues and associated SFTP folders to diagnose errors in processing the files, and confirm archiving behaviour for failed files.

Acceptance Criteria

  • [ ] Review how the processing queues (and associated SFTP folders) are monitored for errors and failures.
  • [ ] Investigate and implement basic alerting as available through the Bull framework.
  • [ ] If not available through Bull, create new ticket for implementation of alerting.
  • [ ] Consider documenting scenarios that IMB team needs to address with manual processes.
  • [ ] Nice to have.
    • [ ] Create some POC towards the acceptable plan.
    • [ ] Include the team in the discussions and get the team's understanding.
    • [ ] Check the possibility of using global event handlers to send notifications, for instance, create an email.
    • [ ] Check if sysdig can get some status from the queues.
    • [ ] Check for queues that fail silently and estimate the effort to make those failures show up in the dashboard. If possible, have them fixed.
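
As a starting point for the POC on global event handlers, here is a minimal sketch of how a failure notification could be wired up. It assumes Bull's `global:failed` queue event, which passes the job ID and the failure reason rather than the job instance. The names `FailureEventSource`, `FailureNotification`, `Notifier`, `buildFailureNotification`, and `watchQueueFailures` are hypothetical and not part of the SIMS codebase; how the notification is actually delivered (email or otherwise) is left to the `notify` callback.

```typescript
// Hypothetical minimal shape of a queue: anything exposing `on`.
// A Bull queue is expected to fit this shape (assumption, not verified against SIMS code).
interface FailureEventSource {
  on(event: string, listener: (...args: any[]) => void): unknown;
}

// Hypothetical payload handed to whatever sends the alert (email, chat, etc.).
interface FailureNotification {
  queueName: string;
  jobId: string;
  reason: string;
  subject: string;
}

type Notifier = (notification: FailureNotification) => void;

// Builds the alert content from the values received with a failure event.
function buildFailureNotification(
  queueName: string,
  jobId: string | number,
  err: unknown,
): FailureNotification {
  const reason = err instanceof Error ? err.message : String(err);
  return {
    queueName,
    jobId: String(jobId),
    reason,
    subject: `[Queue failure] ${queueName} job ${jobId}`,
  };
}

// Registers one listener per queue for failures reported by any worker.
// "global:failed" is assumed to be the Bull event name for this.
function watchQueueFailures(
  queueName: string,
  queue: FailureEventSource,
  notify: Notifier,
): void {
  queue.on("global:failed", (jobId: string | number, err: unknown) => {
    notify(buildFailureNotification(queueName, jobId, err));
  });
}
```

With a real queue, `watchQueueFailures` would be called once per queue at startup, with a `notify` implementation that creates the email. Keeping the message building separate from the event wiring means the alert content can be tested without Redis.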

Note for IMB team:

  • Following this ticket, the SIMS IMB team will develop a recommended process for investigating and tracking incidents/tickets, and for retrying failed jobs/files.
  • We also need to confirm and implement the desired behaviour for failed files (archive or move to another folder).
    • Example: Currently, PT and FT feedback integration files are not archived when they error, even though they are non-sequential.

@michesmith to create an epic to track this and subsequent tickets.

ninosamson, Aug 28 '24 20:08