rdkafka-ruby icon indicating copy to clipboard operation
rdkafka-ruby copied to clipboard

Map and deal with `RD_KAFKA_RESP_ERR__STATE` when using `store_offset` directly

Open jvortmann opened this issue 3 years ago • 2 comments

As of librdkafka v.1.9.0 release that is on version v0.12.0 we started noticing a increase occurrence of the mentioned RD_KAFKA_RESP_ERR__STATE when using store_offset directly: https://github.com/appsignal/rdkafka-ruby/blob/c02f217a189b71e3e253fe4ad59286b5fc9d4034/lib/rdkafka/consumer.rb#L325

This is error is not currently mapped: https://github.com/appsignal/rdkafka-ruby/blob/c02f217a189b71e3e253fe4ad59286b5fc9d4034/lib/rdkafka/bindings.rb#L23

or dealt with by rdkafka-ruby in ☝️ store_offset method.

jvortmann avatar Aug 19 '22 23:08 jvortmann

Note: we catch and log these right now but we decided to fix the used version to < 0.12.0 for safety.

We also used a wapper so we can catch the errors without checking the response everywhere, like the following (not the actual code, but close):

class RdkafkaErrors::ErroneousState < Rdkafka::RdkafkaError
  # From `RD_KAFKA_RESP_ERR__STATE` in https://github.com/edenhill/librdkafka/blob/b871fdabab84b2ea1be3866a2ded4def7e31b006/src/rdkafka.h#L340
  def self.===(exception)
    exception.rdkafka_response == -172
  end
end

and then we can:

rescue RdkafkaErrors::ErroneousState => e

jvortmann avatar Aug 20 '22 00:08 jvortmann

Correct me if I'm wrong, but this is already remapped and works as other errors, for example RD_KAFKA_RESP_ERR__ASSIGNMENT_LOST that also should be handled by the end users.

I can see it being used in karafka and racecar for a while already:

https://github.com/karafka/karafka/blob/master/lib/karafka/connection/client.rb#L254

mensfeld avatar Aug 20 '22 07:08 mensfeld

I'd be open to generating exception subclasses for various errors, if we can come up with a way to make that maintainable.

thijsc avatar Nov 01 '22 11:11 thijsc

closing. It is being remapped here: https://github.com/karafka/rdkafka-ruby/blob/main/lib/rdkafka/consumer.rb#L406 and I also did not see any reply from @jvortmann for a year on that.

mensfeld avatar Dec 27 '23 08:12 mensfeld

Just to clarify the original intent if this happens to be reopen or re-evaluated:

The intent was to help users of Rdkafka to on specific exceptions for errors coming from the underlying lid without the need to rescue on a generic RdkafkaErrors and them check each particular response.

So instead of

begin
  # some code
rescue RdkafkaError => e
  if e.rdkafka_response  == -172 # do something (not clear here what a -172 represents)
    # deal with Errouneous State
  else
    #check for other error in rdkafka_response
  end
end

The user would do something like this

begin
  # some code
rescue RdkafkaErrors::ErroneousState => e
    # deal with Errouneous State
rescue RdkafkaErrors::ConnectionClosed => e
   # deal with Connection Closed
end

@thijsc I believe the mapping to error code => error does not change frequently, so it would be a matter of mapping existing RD_KAFKA_RESP_ERR__* to classes with def self.===(exception) matching the error number. Then in https://github.com/karafka/rdkafka-ruby/blob/main/lib/rdkafka/consumer.rb#L406 a factory would take care of raising the specific one, something like:

raise Rdkafka::RdkafkaErrorFactory.from(response)

and the factory in a simple example (could be more clever instead of ifs) in the real scenario:

class Rdkafka::RdkafkaErrorFactory
  def self.from(response)
    if response == -172
      RdkafkaErrors::ErroneousState.new(response)
    ...
  end

@mensfeld we ended mapping the ones the rescued from internally and I missed this thread, sorry.

Let me know what you guys think.

jvortmann avatar Feb 06 '24 17:02 jvortmann

Few comments below:

begin
  # some code
rescue RdkafkaError => e
  if e.rdkafka_response  == -172 # do something (not clear here what a -172 represents)
    # deal with Errouneous State
  else
    #check for other error in rdkafka_response
  end
end

you don't have to check the response code, you can use code:

begin
  # some code
rescue RdkafkaError => e
  if e.code  == :erroneous_state
    # deal with Errouneous State
  else
    #check for other error in rdkafka_response
  end
end

that said I do see your point.

karafka-rdkafka already has an error factory that I plan to backport here and improve. That said I am not sure if this would not break backwards compatibility if we would split errors in inheritance.

Aside from that librdkafka has two types of errors (code and C type based) that are distinct in their behaviours (mostly transactions related), meaning it has to be done well.

I am not saying no but I think it needs to be well planned not to break other people code

mensfeld avatar Feb 07 '24 10:02 mensfeld

Agree

jvortmann avatar Feb 12 '24 23:02 jvortmann