ngx_mruby
ngx_mruby copied to clipboard
Periodic redis errors
I'm using ngx_mruby to do dynamic lets encrypt ssl resolution. Currently building from source against nginx 1.16.0 and ngx_mruby 2.1.5.
Here's the relevant parts of the configuration:
# /etc/nginx/nginx.conf
http {
# ...
include /etc/nginx/conf.d/*.conf;
mruby_init_worker_code '
userdata = Userdata.new
redis_url = "redis://my.redis.url:6379"
redis_host, redis_port = redis_url[/redis:\/\/(.+)/, 1].split(":")
userdata.redis = Redis.new redis_host, redis_port.to_i
userdata.redis.select 2
';
}
# /etc/nginx/conf.d/app.conf
# ...
server {
listen 443 ssl;
# ...
mruby_ssl_handshake_handler_code '
ssl = Nginx::SSL.new
domain = ssl.servername
redis = Userdata.new.redis
ssl_certificate = redis["#{domain}.crt"]
ssl_key = redis["#{domain}.key"]
if ssl_certificate && ssl_certificate != "" && ssl_key && ssl_key != ""
ssl.certificate_data = ssl_certificate
ssl.certificate_key_data = ssl_key
end
';
# ...
}
After roughly 45 minutes of running ngx_mruby, i start to see these redis connection failures:
2019/08/25 20:53:03 [error] 23164#0: *17960 ngx_mruby : mrb_run failed: return 500 HTTP status code to client: error: INLINE CODE:6: could not read reply (Redis::ConnectionError) while SSL handshaking, client: 122.36.17.229, server: 0.0.0.0:443
If I reload nginx, the errors stop. If I let the server run for about 45 minutes, they will inevitably return. Right now I'm avoiding these by reloading nginx every 15 minutes on a cron job, but it seems like there's some sort of issue with the embedded ruby code that causes the redis connection to stop working eventually. Maybe the embedded ruby code is leaving redis connections hanging or something.
The server this running on has unlimited ulimit and the number of open redis connections has never exceeded a couple hundred.
Has anyone else seen these types of errors?
Thank you for you report! I'll investigate the issue.
Something else I've noticed is that the existing nginx workers take an unusually long amount of time to shut down when I run a reload. Like a good 30 seconds, and the request load is very light.
@matsumotory Any ideas on this?
Sorry for my late response. I think Redis client timed out. So Could you implement reconnecting Redis when catching timeout error like Redis::ConnectionError?
Yes, I can try. I can try wrapping the mruby_ssl_handshake_handler_code in a begin/rescue. I'm not sure how/why a reconnect would do anything there, but I'll try and report back.
Yes, it's strange behavior. Maybe OS or middleware layer settings disconnects the TCP connection I guess.
@matsumotory Is there a reconnect interface or method on the mruby-redis client?
@ajsharp @matsumotory I have same issue, is this problem solve? thank you all
Yes, it's strange behavior. Maybe OS or middleware layer settings disconnects the TCP connection I guess.
@matsumotory We're using the same redis server (AWS elasticache) in our main app from the same host machine without connection issues, so the problem is almost definitely in this library in some way. It's unclear whether the problem is the redis client this project uses or some lower level issue with the nginx extension. However, it's not an OS/middleware issue b/c we're maintaining persistent connections to the same redis server from the same host machine without any problems.
@afunction The only "fix" we've found is to restart every 8 mins or so. It seems like there's some sort of redis connection issue.
When I tried to diagnose this before, I started looking through the redis client this project bundles, but I don't have the expertise or the time to get into the C++ code. It would be great if it was possible to add additional logging instrumentation around the connection logic.
@ajsharp @afunction I'm wondering if you guys can move Redis.new into mruby_ssl_handshake_handler_code? I don't think redis[], mrb_redis_get() in C, supports reconnect.
It looks like _redisContextConnectTcp() in hredis supports it, but it is only called via redisConnectWithTimeout() from the following methods.
- initialize(): mrb_redis_connect() in C
- connect_set_raw(): mrb_redis_connect_set_raw() in C
- connect_set_udptr(): mrb_redis_connect_set_raw() in C
Disclaimer: I have never used redis. I just looked at mruby-redis and hired code while waiting for windows update.
@yyamano Thank you for your suggestion.
@ajsharp @afunction Could you try to use Redis#enable_keepalive in doc ?
Or you can implement error handling like this example.
@matsumotory I tried using #enable_keepalive and it didn't help -- still getting the errors. I will try the begin rescue you linked to and report back.
I've put begin/rescue blocks in both the mruby_ssl_handshake_handler_code block and the mruby_init_worker_code block. Errors are only being generated from the handshake handler block. Here's my code:
begin
ssl = Nginx::SSL.new
domain = ssl.servername
redis = Userdata.new.redis
ssl_certificate = redis["#{domain}.crt"]
ssl_key = redis["#{domain}.key"]
if ssl_certificate && ssl_certificate != "" && ssl_key && ssl_key != ""
ssl.certificate_data = ssl_certificate
ssl.certificate_key_data = ssl_key
end
rescue Redis::ReplyError => e
STDERR.puts "HANDSHAKE REPLY ERROR: #{e}"
rescue Redis::ConnectionError => e
STDERR.puts "HANDSHAKE CONN ERROR: #{e}"
end
I'm seeing lots of Redis::ConnectionError exceptions in the nginx error logs. Unfortunately, all the reply says is: "could not read reply"
@matsumotory Is it possible to add more context here to the error message?
https://github.com/matsumotory/mruby-redis/blob/18b3d06cf238b26eb921e019fff1a9aa9ad22e95/src/mrb_redis.c#L1462
@ajsharp Thank you for your details. I have implemented your suggestion on https://github.com/matsumotory/mruby-redis/pull/102 . Could you try this?
@ajsharp Do you still have the problem with the newer version of mruby-redis ?
@yyamano I will try an upgrade and let you know. Thanks for checking in on this.
@yyamano Is the latest version of mruby-redis installed with the latest version of this library?
@ajsharp If you rebuild ngx_mruby, the latest version of mruby-redis including the change is installed. You can check build_config.rb.lock to make sure if the latest version is installed. Mine is:
https://github.com/matsumotory/mruby-redis.git:
url: https://github.com/matsumotory/mruby-redis.git
branch: HEAD
commit: f1f98d9450783b8c281b5064554b12bfb9f0a65a
version: 0.0.1