redis-rb
redis-rb copied to clipboard
SystemStackError (stack level too deep) on redis-rb 4.6.0
See this issue randomly on about 1 out of 1 million calls.
I'm using the OpenTelemetry gem with Redis which does a lightweight monkey patch but I don't think it's related.
SystemStackError
vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:37
vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:37:in `read': stack level too deep (SystemStackError)
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:416:in `format_bulk_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:394:in `format_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:384:in `read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `block in format_multi_bulk_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `initialize'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `new'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `format_multi_bulk_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:395:in `format_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:384:in `read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:311:in `block in read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:299:in `io'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:310:in `read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:161:in `block in call'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:279:in `block (2 levels) in process'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:420:in `ensure_connected'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:269:in `block in process'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:356:in `logging'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:268:in `process'
from vendor/bundle/ruby/3.0.0/gems/opentelemetry-instrumentation-redis-0.21.3/lib/opentelemetry/instrumentation/redis/patches/client.rb:46:in `block in process'
from vendor/bundle/ruby/3.0.0/gems/opentelemetry-api-1.0.1/lib/opentelemetry/trace/tracer.rb:29:in `block in in_span'
from vendor/bundle/ruby/3.0.0/gems/opentelemetry-api-1.0.1/lib/opentelemetry/trace.rb:82:in `block in with_span'
from vendor/bundle/ruby/3.0.0/gems/opentelemetry-api-1.0.1/lib/opentelemetry/context.rb:87:in `with_value'
from vendor/bundle/ruby/3.0.0/gems/opentelemetry-api-1.0.1/lib/opentelemetry/trace.rb:82:in `with_span'
from vendor/bundle/ruby/3.0.0/gems/opentelemetry-api-1.0.1/lib/opentelemetry/trace/tracer.rb:29:in `in_span'
from vendor/bundle/ruby/3.0.0/gems/opentelemetry-instrumentation-redis-0.21.3/lib/opentelemetry/instrumentation/redis/patches/client.rb:45:in `process'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:161:in `call'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis.rb:263:in `block in send_command'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis.rb:262:in `synchronize'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis.rb:262:in `send_command'
Hum, would you have a more complete backtrace? Because I can't tell where it's recursing from this one.
Here's another backtrace. It looks pretty clear there's a loop here:
vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:416:in `format_bulk_reply': stack level too deep (SystemStackError)
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:394:in `format_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:384:in `read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `block in format_multi_bulk_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `initialize'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `new'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `format_multi_bulk_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:395:in `format_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:384:in `read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `block in format_multi_bulk_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `initialize'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `new'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:425:in `format_multi_bulk_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:395:in `format_reply'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/connection/ruby.rb:384:in `read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:311:in `block in read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:299:in `io'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:310:in `read'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:161:in `block in call'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:279:in `block (2 levels) in process'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:420:in `ensure_connected'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:269:in `block in process'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:356:in `logging'
from vendor/bundle/ruby/3.0.0/gems/redis-4.6.0/lib/redis/client.rb:268:in `process'
@johnnyshields the trace you shared shows three calls to read and format_reply. That wouldn't cause a SystemStackError. Some limited recursion is normal when the data being operated on is changing (like in a multi bulk reply here).
In order to help you figure out why the stack level is too deep, can you share more of the stack? A blown stack like this will show something repeating hundreds+ of times.
I'm capturing this in my bug tracker tool (Bugsnag) and this is all it gives me. I'm not able to replicate this outside production. I'll take a look if I can configure the tool to get a deeper trace for this error.
Closing as non-actionable.
This whole code has been rewritten in 5.x anyway, so I assume whatever the bug was, it's probably no longer present.