rdkafka-ruby

Support jruby

Open mensfeld opened this issue 1 year ago • 4 comments

I ran the tests on the most recent stable JRuby, and FFI works OK aside from one (fixable) case. We could roll out JRuby support easily.

mensfeld avatar Oct 07 '24 15:10 mensfeld

I look forward to playing with benchmarks and seeing how we can make the JRuby support superfast!

headius avatar Oct 07 '24 17:10 headius

Things we need to do to make this happen:

  • [ ] Fix segfaults related to oauth bearer callback assignments
  • [ ] Ignore fork specs for jruby
  • [ ] Benchmark performance and check if it makes sense (at least 50k msg/s passthrough, ideally more than 100k)
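
A rough way to check the throughput target from the last item could look like the sketch below. The `measure_throughput` helper is hypothetical and not part of rdkafka-ruby; the block passed to it is only a stand-in for whatever produce or consume call is actually being benchmarked.

```ruby
# Hypothetical helper (not part of rdkafka-ruby): runs the given block
# `count` times and returns the achieved rate in messages per second.
def measure_throughput(count)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  count.times { |i| yield i }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  # Guard against a zero elapsed time on very fast runs.
  elapsed = Float::EPSILON if elapsed <= 0
  count / elapsed
end

# Stand-in workload (assumption): a real benchmark would produce or consume
# a Kafka message here instead of appending to an array.
sink = []
rate = measure_throughput(10_000) { |i| sink << "payload-#{i}" }
puts format("%.0f msg/s (target: at least 50k, ideally more than 100k)", rate)
```

A monotonic clock is used so that wall-clock adjustments do not skew the result. On JRuby the first runs include JIT warm-up, so a warm-up pass before measuring is probably worthwhile.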

mensfeld avatar Oct 08 '24 09:10 mensfeld

oauth bearer callback assignments

FFI? There's at least one known JRuby issue with FFI callbacks on Apple Silicon that requires some low-level C work (probably by me 😭).

headius avatar Oct 08 '24 20:10 headius

@headius, the thing is, other callbacks work as expected, so I think it's rather a misuse/misconfiguration. I'll get back to you next week when I look into this with more info.

mensfeld avatar Oct 10 '24 09:10 mensfeld

Revisiting this because it came up today...

Still crashes on my M1 MBA during specs:

...
  oauthbearer set token
    without args
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000012a612ccc, pid=44035, tid=9219
#
# JRE version: OpenJDK Runtime Environment Zulu21.38+21-CA (21.0.5+11) (build 21.0.5+11-LTS)
# Java VM: OpenJDK 64-Bit Server VM Zulu21.38+21-CA (21.0.5+11-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64)
# Problematic frame:
# C  [librdkafka.dylib+0x11eccc]  rd_kafka_oauthbearer_set_token0+0x28

Weirdly enough, if I force JIT in JRuby, it does not crash. There are some peculiar warnings, though:

  oauthbearer set token
    without args
      should raise argument error
    with args
%5|1741803010.316|CONFWARN|rdkafka#consumer-51| [thrd:app]: No `bootstrap.servers` configured: client will not be able to connect to Kafka cluster
      should set token or capture failure
  oauthbearer set token failure
    without args
      should fail
    with args
%5|1741803010.322|CONFWARN|rdkafka#consumer-52| [thrd:app]: No `bootstrap.servers` configured: client will not be able to connect to Kafka cluster
      should succeed
  oauthbearer callback
    without an oauthbearer callback
      should do nothing
    with an oauthbearer callback
%5|1741803010.326|CONFWARN|rdkafka#consumer-53| [thrd:app]: No `bootstrap.servers` configured: client will not be able to connect to Kafka cluster
      should call the oauth bearer callback and receive config and client name

Final result of this forced-JIT run:

Failures:

  1) Rdkafka::Admin when operating from a fork expect to be able to create topics and run other admin operations without hanging
     Failure/Error:
       pid = fork do
         admin
           .create_topic(topic_name, topic_partition_count, topic_replication_factor)
           .wait
       end
     
     NotImplementedError:
       fork is not available on this platform
     # ./spec/rdkafka/admin_spec.rb:745:in 'block in <main>'
     # ./spec/spec_helper.rb:154:in 'block in <main>'
     # ./spec/spec_helper.rb:153:in 'block in <main>'

  2) Rdkafka::Config logger expect to start new logger thread after fork and work
     Failure/Error:
       pid = fork do
         $stdout.reopen(writer)
         Rdkafka::Config.logger = Logger.new($stdout)
         reader.close
         producer = rdkafka_producer_config(debug: 'all').producer
         producer.close
         writer.close
         sleep(1)
       end
     
     NotImplementedError:
       fork is not available on this platform
     # ./spec/rdkafka/config_spec.rb:39:in 'block in <main>'
     # ./spec/spec_helper.rb:154:in 'block in <main>'
     # ./spec/spec_helper.rb:153:in 'block in <main>'

Finished in 5 minutes 18 seconds (files took 3.66 seconds to load)
362 examples, 2 failures, 1 pending

Not bad (the fork specs should simply be skipped on JRuby), but we still need to figure out that crash.
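
Skipping the fork specs could be sketched roughly as follows. `fork_supported?` and `run_in_fork_or_skip` are hypothetical helpers, not existing code in the rdkafka-ruby spec suite; the sketch relies on Ruby's documented behavior that `Process.respond_to?(:fork)` returns false where `fork` is unavailable.

```ruby
# Hypothetical helper (not in the rdkafka-ruby spec suite): true only when
# the current platform can fork. Ruby documents that
# Process.respond_to?(:fork) is false where fork is not usable.
def fork_supported?
  Process.respond_to?(:fork)
end

# Hypothetical wrapper: runs the block in a forked child when possible and
# reports :skipped instead of raising NotImplementedError otherwise.
def run_in_fork_or_skip
  return :skipped unless fork_supported?

  pid = fork { yield }
  Process.wait(pid)
  $?.success? ? :ok : :failed
end

# In an RSpec example the same check could guard the spec body, e.g.:
#   skip("fork is not available on this platform") unless fork_supported?
result = run_in_fork_or_skip { 1 + 1 }
puts "fork spec result: #{result}"
```

Checking the capability rather than the engine name keeps the guard valid on other platforms without `fork`, such as Windows.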

headius avatar Mar 12 '25 18:03 headius

Yeah, I do recall this crash on oauth. When you disable oauth bearer, do the rest of the specs work as expected?

mensfeld avatar Mar 13 '25 09:03 mensfeld

Btw @headius we can work on this somewhere around RubyKaigi if you fancy.

mensfeld avatar Mar 13 '25 09:03 mensfeld

I can reproduce:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007e3fee1c9d32, pid=15776, tid=15807
#
# JRE version: OpenJDK Runtime Environment (21.0.6+7) (build 21.0.6+7-Ubuntu-124.04.1)
# Java VM: OpenJDK 64-Bit Server VM (21.0.6+7-Ubuntu-124.04.1, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [librdkafka.so+0x1c9d32]  rd_kafka_oauthbearer_set_token0+0x12
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/mencio/Software/Karafka/rdkafka-ruby/core.15776)
#
# An error report file with more information is saved as:
# /home/mencio/Software/Karafka/rdkafka-ruby/hs_err_pid15776.log
[43.535s][warning][os] Loading hsdis library failed
#
# If you would like to submit a bug report, please visit:
#   https://bugs.launchpad.net/ubuntu/+source/openjdk-21
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#



mensfeld avatar Apr 16 '25 08:04 mensfeld

bundle exec rspec spec/rdkafka/bindings_spec.rb:152

mensfeld avatar Apr 16 '25 08:04 mensfeld

@headius, do you know when GitHub Actions will get JRuby 10.0.0.0? ref https://github.com/karafka/rdkafka-ruby/actions/runs/14488191323/job/40638287789?pr=571

mensfeld avatar Apr 16 '25 08:04 mensfeld

ref https://github.com/karafka/karafka/issues/2549

mensfeld avatar Apr 16 '25 08:04 mensfeld