code-dot-org
Use more reliable restart strategy for ActiveJob
Slack Discussion: https://codedotorg.slack.com/archives/C051P2V2RN0/p1712167298464339
The `restart` command has been leaving old workers running that don't have the latest code. This PR rolls back to an older version of the daemons gem and replaces `restart` with an explicit stop followed by a start.
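As a rough illustration of the stop-then-start approach, here is a minimal sketch. It assumes the workers are managed through the standard `bin/delayed_job` script and that ten workers are wanted; the helper name `restart_workers` and its `runner:` parameter are hypothetical and are not taken from this PR's diff.

```ruby
# Hypothetical helper (not from the PR): replaces a single `restart`
# with an explicit `stop` followed by a `start`.
# `runner` is injectable so the sequence can be exercised without
# touching real processes; by default it shells out with `system`.
def restart_workers(worker_count, runner: ->(cmd) { system(*cmd) })
  commands = [
    # Stop every running worker first.
    ['bin/delayed_job', 'stop'],
    # Then start a fresh pool that loads the newly deployed code.
    ['bin/delayed_job', '-n', worker_count.to_s, 'start']
  ]

  commands.each do |cmd|
    # Abort if a step fails, so we never start on top of a failed stop.
    raise "command failed: #{cmd.join(' ')}" unless runner.call(cmd)
  end

  commands
end
```

Calling `restart_workers(10)` from the application root would run the two commands in order. Raising on a failed `stop` is a design choice in this sketch: it avoids starting new workers while old ones might still be alive.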
Links
Testing story
Deployment strategy
Follow-up work
Privacy
Security
Caching
PR Checklist:
- [ ] Tests provide adequate coverage
- [ ] Privacy and Security impacts have been assessed
- [ ] Code is well-commented
- [ ] New features are translatable or updates will not break translations
- [ ] Relevant documentation has been added or updated
- [ ] User impact is well-understood and desirable
- [ ] Pull Request is labeled appropriately
- [ ] Follow-up work items (including potential tech debt) are tracked and linked
@davidsbailey Too much going on today, but I think I'm ready to merge this any time. Ready for re-review.
Looks good! I'm pretty slammed this week, so my personal vote would be to hold off on launching this until Monday.
🟢 green light on my end to go ahead and merge this
This appears to be working 👍 The workers' start time of about 1am UTC (6pm yesterday) corresponds to the time of the last DTP:
```
ubuntu@production-daemon:~$ ps -eo pid,lstart,cmd | grep delay
1768893 Thu May 23 01:09:19 2024 delayed_job.0
1768899 Thu May 23 01:09:20 2024 delayed_job.1
1768905 Thu May 23 01:09:21 2024 delayed_job.2
1768911 Thu May 23 01:09:23 2024 delayed_job.3
1768917 Thu May 23 01:09:24 2024 delayed_job.4
1768923 Thu May 23 01:09:25 2024 delayed_job.5
1768929 Thu May 23 01:09:26 2024 delayed_job.6
1768935 Thu May 23 01:09:27 2024 delayed_job.7
1768941 Thu May 23 01:09:28 2024 delayed_job.8
1768947 Thu May 23 01:09:29 2024 delayed_job.9
2058273 Thu May 23 16:57:07 2024 grep --color=auto delay
```
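The same check could be automated. The sketch below parses output in the `ps -eo pid,lstart,cmd` format shown above and reports any `delayed_job` worker that started before a given deploy time. The function names are hypothetical, and the sketch assumes the host clock is UTC, so `lstart` values are read as UTC.

```ruby
require 'date'

# Matches lines such as: "1768893 Thu May 23 01:09:19 2024 delayed_job.0"
PS_LINE = /\A\s*(\d+)\s+\w{3}\s+(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+(\d{4})\s+(.+)\z/

# Hypothetical helper: returns [pid, started_at, cmd] for every
# delayed_job worker found in the ps output.
def parse_workers(ps_output)
  ps_output.each_line.filter_map do |line|
    match = PS_LINE.match(line.chomp)
    next unless match

    pid, mon, day, hour, min, sec, year, cmd = match.captures
    # Skip non-worker rows, e.g. the `grep` process itself.
    next unless cmd.start_with?('delayed_job')

    month = Date::ABBR_MONTHNAMES.index(mon)
    # Assumption: the host runs in UTC, so lstart is interpreted as UTC.
    started_at = Time.utc(year.to_i, month, day.to_i, hour.to_i, min.to_i, sec.to_i)
    [pid.to_i, started_at, cmd]
  end
end

# Hypothetical helper: workers that started before the deploy and
# are therefore probably still running old code.
def stale_workers(ps_output, deployed_at)
  parse_workers(ps_output).select { |_pid, started_at, _cmd| started_at < deployed_at }
end
```

For example, `stale_workers(`ps -eo pid,lstart,cmd`, deploy_time)` returning an empty array would indicate that every worker was started after `deploy_time`, where `deploy_time` is a `Time` supplied by the caller.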