WEBrick suddenly shuts down with no errors

Open MarwanTukhta opened this issue 2 years ago • 4 comments

Hello folks, thanks for creating this awesome gem. I am facing an issue with WEBrick in my app, which runs on Passenger. Running Yabeda::Prometheus::Exporter.start_metrics_server! works fine and Prometheus scrapes the metrics without problems, but then, with no errors, WEBrick logs this:

INFO  going to shutdown ...
INFO  WEBrick::HTTPServer#start done.

After that, yabeda-prometheus no longer works, and trying to start it again gives me Errno::EADDRINUSE.
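
One way to confirm that the port is still held when Errno::EADDRINUSE shows up is a small check like the one below. This is only a diagnostic sketch: the helper name `port_in_use?` is hypothetical, and the host and port you pass in are assumptions that should match whatever your exporter is configured to bind to.

```ruby
require "socket"

# Hypothetical helper: returns true if something is already listening on
# the given TCP port, false if the port can be bound right now.
def port_in_use?(port, host = "127.0.0.1")
  server = TCPServer.new(host, port) # raises Errno::EADDRINUSE if taken
  server.close
  false
rescue Errno::EADDRINUSE
  true
end

# Example (the port number is an assumption, use your exporter's port):
# puts port_in_use?(9394)
```

If this returns true after WEBrick has logged its shutdown, some process on the machine still holds the listening socket.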

gems

gem "yabeda-prometheus", "~> 0.8.0"
gem "yabeda-rails", "~> 0.8.1"
gem "yabeda-graphql", "~> 0.2.2"
gem "yabeda-anycable", "~> 0.1.1"

my yabeda_prometheus.rb initializer

Yabeda::Rails.install!
Yabeda::Prometheus::Exporter.start_metrics_server!

Note: I have AnyCable and Sidekiq running on different servers, and they work fine.

MarwanTukhta, Sep 26 '22 17:09

Had the same issue with Passenger. From what I can tell, Passenger kills the thread because it sees it as a long-running thread. Did you end up getting this fixed? I was able to fix it for our app by running that server in a separate process and calling Process.detach afterwards to make sure it gets killed with the main Passenger process. I have a PR that should allow you to specify whether you'd like to run the server in a process or in a thread.
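
A minimal sketch of the separate-process approach, using only Ruby's standard `fork` and `Process.detach`. The helper name `run_in_child_process` is hypothetical, and the usage line assumes that `start_metrics_server!` returns the server thread, so the child has to join it to stay alive; this is not the code from the PR.

```ruby
# Hypothetical helper: runs the given block in a forked child process and
# detaches it, so the parent never has to wait on it explicitly.
# Returns the child's pid and the watcher thread from Process.detach.
def run_in_child_process(&work)
  pid = fork do
    work.call
  end
  watcher = Process.detach(pid) # reaps the child once it exits
  [pid, watcher]
end

# Assumed usage in an initializer (start_metrics_server! is assumed to
# return a Thread, hence the join that keeps the child running):
# run_in_child_process { Yabeda::Prometheus::Exporter.start_metrics_server!.join }
```

Per Ruby's documentation, Process.detach only sets up a thread that reaps the child when it exits, so stopping the child together with the parent may still need separate handling, for example by keeping the pid and signalling it on shutdown.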

etsenake, Nov 05 '22 23:11

I don't know much about passenger, but for now I can recommend to either:

  1. Export metrics from the main application (e.g. by plugging the exporter in via Rackup or Rails routes, maybe with a port restriction; see https://github.com/prometheus/client_ruby/pull/199, but yabeda-prometheus doesn't support this option yet).

  2. Explicitly run a separate process that only exports metrics from all processes on the same machine and does nothing else.

    An example of such an exporter process can be found here: https://gist.github.com/Envek/96f297c1dfbac8ae5afa7e4abff78f0b

    Beware that this will require you to set up the direct file store in the Prometheus client, see https://github.com/prometheus/client_ruby#data-stores (but you have to do this anyway if you use clustered mode).
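
For reference, the two pieces mentioned above could look roughly like this. It is a sketch based on my reading of the yabeda-prometheus and prometheus-client READMEs; the file names, the `/metrics` path, and the store directory are assumptions to adapt to your setup.

```ruby
# config/initializers/prometheus.rb (file name is illustrative)
require "prometheus/client"
require "prometheus/client/data_stores/direct_file_store"

# Assumed directory: it must be writable and shared by every process
# on the machine whose metrics should be exported together.
Prometheus::Client.config.data_store =
  Prometheus::Client::DataStores::DirectFileStore.new(
    dir: "/tmp/prometheus_direct_file_store"
  )

# config/routes.rb, for option 1 (the mount path is an assumption)
Rails.application.routes.draw do
  mount Yabeda::Prometheus::Exporter, at: "/metrics"
end
```

With option 1 the metrics are served by the application itself, so no extra WEBrick thread is needed.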

@MarwanTukhta, sorry for the long wait!

Envek, Nov 07 '22 13:11

Sounds good @Envek, I'll try that. Thanks a lot 🤝

MarwanTukhta, Nov 07 '22 16:11

Hey everyone,

I found the reason for the metrics server's sudden shutdown. It is due to how Passenger works when you use the smart spawning method (which I think is the default one).

Basically, when you add Yabeda::Prometheus::Exporter.start_metrics_server! to any configuration file (whether it is application.rb, <environment>.rb or config/initializers/<file_name>.rb), it runs once in the first process that Passenger spawns, which is called the preloader process. The code does not run in forked processes. That makes sense: you don't want multiple metrics servers, and you don't want to run into port reservation errors (i.e. every process would try to start a server thread on the same port, which would result in errors).

But here is the catch: by default, Passenger automatically shuts down the preloader process after it has been idle for a predefined time (5 minutes by default). As a result, the metrics server thread gets killed.
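
The explanation above rests on a general property of forking in Ruby: a thread started before `fork` keeps running only in the process that started it. A minimal sketch of that property, with no Passenger involved (the helper name `thread_alive_after_fork` is hypothetical, and the sleeping thread merely stands in for the metrics server thread):

```ruby
# Hypothetical demo: start a background thread, fork, and report whether
# the thread is alive in the parent and in the forked child.
def thread_alive_after_fork
  background = Thread.new { sleep } # stand-in for the metrics server thread
  reader, writer = IO.pipe

  pid = fork do
    # In the child, only the thread that called fork keeps running.
    reader.close
    writer.puts(background.alive?)
    writer.close
    exit!(0)
  end

  writer.close
  alive_in_child = reader.read.strip == "true"
  reader.close
  Process.wait(pid)

  alive_in_parent = background.alive?
  background.kill
  { parent: alive_in_parent, child: alive_in_child }
end
```

So when the one process that owns the server thread exits, none of the forked workers has a copy of that thread to take over.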

In Passenger for Nginx, this behavior is controlled by the passenger_max_preloader_idle_time directive. In our case, we set its value to 0, which means the preloader process won't exit.

Adding this change fixed the issue for us.

mabrikan, Apr 09 '24 07:04