SIMS
Queue Monitoring
User Story
As a product team, we need to monitor the queues and associated SFTP folders to diagnose errors in processing the files, and confirm archiving behaviour for failed files.
Acceptance Criteria
- [ ] Review how the processing queues (and associated SFTP folders) are monitored for errors and failures.
- [ ] Investigate and implement basic alerting as available through the Bull framework.
- [ ] If not available through Bull, create new ticket for implementation of alerting.
- [ ] Consider documenting scenarios that IMB team needs to address with manual processes.
- [ ] Nice to have.
- [ ] Create a POC towards an acceptable plan.
- [ ] Include the team in the discussions and get the team's understanding.
- [ ] Check the possibility of using global event handlers to send notifications, for instance, by sending an email.
- [ ] Check if Sysdig can get some status from the queues.
- [ ] Check for queues that fail silently and estimate the effort to make those failures visible in the dashboard. If possible, have them fixed.
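As a starting point for the POC, the global event handler idea could look roughly like the sketch below. It relies on Bull queues emitting `global:failed` and `global:stalled` (with a job ID rather than a job object) and a local `error` event. Everything else is an assumption for illustration: `attachQueueAlerts`, `QueueAlert`, the `notify` callback, and the queue name are hypothetical, and an `EventEmitter` stands in for a real Bull queue so the sketch runs without Redis.

```typescript
import { EventEmitter } from "node:events";

// Minimal shape needed from a queue. A Bull Queue is an EventEmitter,
// so it should fit here; the stand-in below avoids needing Redis.
interface QueueLike {
  on(event: string, listener: (...args: any[]) => void): unknown;
}

// Hypothetical alert payload; field names are illustrative only.
interface QueueAlert {
  queueName: string;
  event: "failed" | "stalled" | "error";
  jobId?: string;
  reason: string;
}

// Hypothetical notifier, e.g. something that sends an email.
type Notifier = (alert: QueueAlert) => void;

function toReason(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

function attachQueueAlerts(queueName: string, queue: QueueLike, notify: Notifier): void {
  // Global events carry the job ID, not the job instance.
  queue.on("global:failed", (jobId: string | number, err: unknown) => {
    notify({ queueName, event: "failed", jobId: String(jobId), reason: toReason(err) });
  });
  queue.on("global:stalled", (jobId: string | number) => {
    notify({ queueName, event: "stalled", jobId: String(jobId), reason: "job stalled" });
  });
  // Queue-level errors (e.g. connection problems) are not tied to a job.
  queue.on("error", (err: unknown) => {
    notify({ queueName, event: "error", reason: toReason(err) });
  });
}

// Demo with a stand-in queue; the queue name is made up.
const alerts: QueueAlert[] = [];
const fakeQueue = new EventEmitter();
attachQueueAlerts("example-integration-queue", fakeQueue, (alert) => {
  alerts.push(alert);
});
fakeQueue.emit("global:failed", 42, "SFTP file could not be parsed");
fakeQueue.emit("error", new Error("Redis connection lost"));
```

Keeping the notifier as a plain callback means the same handlers could feed an email, a log line, or a metric, depending on what the team agrees on.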
Note for IMB team:
- Following this ticket, the SIMS IMB team will develop a recommended process for investigating and tracking incidents/tickets, and for retrying failed jobs/files.
- The team also needs to confirm and implement the desired behaviour with regard to failed files (archive or move to another folder).
- Example: Currently PT and FT feedback integration files are not being archived if they are erroring, even though they are non-sequential.
@michesmith to create epic to track this and subsequent tickets.