Improve robustness of rake work_jobs
First of all, many thanks for creating this awesome RSS reader! It's just what I was looking for and was about to build myself, but Stringer is so much better than anything I had in mind :)
I installed it on Heroku and noticed a reliability problem with the Heroku-specific rake lazy_fetch approach: while bundle exec rake work_jobs is launched from config/unicorn.rb, it is never checked to be running and its @delayed_job_pid is never used. It seems that Heroku kills long-running database connections, either on purpose or just randomly, which also causes an unhandled exception to kill the bundle exec rake work_jobs process (example log). While rake lazy_fetch continues to submit jobs, the worker process is dead and never restarted.
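Just to illustrate the "never checked to be running" part, a before_fork hook in config/unicorn.rb that did check the spawned pid could look roughly like this. This is only a sketch of the idea, not code from Stringer or from my workarounds below; it reuses the @delayed_job_pid and spawn call shown in the unicorn.rb diff further down:

# Sketch only -- not Stringer code. Before each unicorn fork, check whether the
# previously spawned "rake work_jobs" process is still alive and respawn it if not.
before_fork do |_server, _worker|
  ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)

  if @delayed_job_pid
    begin
      # WNOHANG: returns nil while the child is still running; returns the pid
      # (and reaps the child) once it has exited, e.g. after Heroku killed its
      # database connection.
      @delayed_job_pid = nil if Process.waitpid(@delayed_job_pid, Process::WNOHANG)
    rescue Errno::ECHILD
      # The child is already gone (or was reaped elsewhere); treat it as dead.
      @delayed_job_pid = nil
    end
  end

  @delayed_job_pid ||= spawn("bundle exec rake work_jobs")
  sleep 1
end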
It seems the best way would be to run rake work_jobs from a Procfile. I've tried two other workarounds and can confirm they work:
- Restart the rake process in bash:
diff --git config/unicorn.rb config/unicorn.rb
--- config/unicorn.rb
+++ config/unicorn.rb
@@ -9,7 +9,7 @@ before_fork do |_server, _worker|
   # as there's no need for the master process to hold a connection
   ActiveRecord::Base.connection.disconnect! if defined?(ActiveRecord::Base)
-  @delayed_job_pid ||= spawn("bundle exec rake work_jobs")
+  @delayed_job_pid ||= spawn("while true; do bundle exec rake work_jobs; sleep 60; done")
   sleep 1
 end
- Catch exceptions in the work_jobs task and try to reconnect:
diff --git Rakefile Rakefile
--- Rakefile
+++ Rakefile
@@ -46,11 +46,25 @@ desc "Work the delayed_job queue."
 task :work_jobs do
   Delayed::Job.delete_all
-  3.times do
-    Delayed::Worker.new(
-      min_priority: ENV["MIN_PRIORITY"],
-      max_priority: ENV["MAX_PRIORITY"]
-    ).start
+  logger = Logger.new(STDOUT)
+  logger.level = Logger::DEBUG
+
+  loop do
+    begin
+      logger.info "(Re-)starting worker"
+      if !ActiveRecord::Base.connection.active?
+        logger.info "Restarting DB connection"
+        ActiveRecord::Base.connection.reconnect!
+      end
+      Delayed::Worker.new(
+        min_priority: ENV["MIN_PRIORITY"],
+        max_priority: ENV["MAX_PRIORITY"]
+      ).start
+    rescue => e
+      logger.error e.message
+      e.backtrace.each { |line| logger.error line }
+      sleep 60
+    end
   end
 end
Yeah, lazy_fetch is a big hack. The correct way, as you mentioned, is to run a separate dyno to work jobs -- but this means we can't run on the free plan (only 1 dyno).
Were the exceptions stopping your instance from updating the feeds? I agree that the robustness is not ideal, but I've not experienced issues where my Heroku instance stopped updating.
Actually, it seems that Heroku counts a pair of worker + web as one "dyno":
- https://devcenter.heroku.com/articles/dyno-sleeping
- https://www.heroku.com/pricing
I run some other apps on the free tier with a worker + web combo and it works without major problems.
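For reference, a two-process Procfile for that kind of setup might look roughly like the following; the exact unicorn invocation is my guess and not taken from the Stringer repo:

web: bundle exec unicorn -p $PORT -c ./config/unicorn.rb
worker: bundle exec rake work_jobs

Heroku would then supervise the worker dyno itself and restart it if the rake process crashes, instead of relying on unicorn's before_fork hook to keep it alive.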
In my case, the worker process stopped after 10-40 minutes, which also stopped feed updates. The only way to restart it was to wait until the web dyno went to sleep and then woke up again, restarting both itself and the worker. It looked something like this:
- Open Stringer, starting the web and worker processes
- Worker process crashes
- Timer submits job, but worker is no longer active
- Open Stringer again, nothing happens (worker is still crashed)
- Wait until the dyno sleeps
- Open Stringer again, which clears the job queue and starts the worker
- With luck, the timer submits jobs for the worker before it crashes again
Maybe it was just a bad day and Heroku was installing security patches or something...
I also had an hourly timer; maybe it submitted jobs while the dynos were asleep. Setting it to run every 10 minutes seems to have improved things a bit.