rdkafka-ruby
rdkafka-ruby copied to clipboard
Map and deal with `RD_KAFKA_RESP_ERR__STATE` when using `store_offset` directly
As of librdkafka v.1.9.0 release that is on version v0.12.0 we started noticing a increase occurrence of the mentioned RD_KAFKA_RESP_ERR__STATE when using store_offset directly: https://github.com/appsignal/rdkafka-ruby/blob/c02f217a189b71e3e253fe4ad59286b5fc9d4034/lib/rdkafka/consumer.rb#L325
This is error is not currently mapped: https://github.com/appsignal/rdkafka-ruby/blob/c02f217a189b71e3e253fe4ad59286b5fc9d4034/lib/rdkafka/bindings.rb#L23
or dealt with by rdkafka-ruby in ☝️ store_offset method.
Note: we catch and log these right now but we decided to fix the used version to < 0.12.0 for safety.
We also used a wapper so we can catch the errors without checking the response everywhere, like the following (not the actual code, but close):
class RdkafkaErrors::ErroneousState < Rdkafka::RdkafkaError
# From `RD_KAFKA_RESP_ERR__STATE` in https://github.com/edenhill/librdkafka/blob/b871fdabab84b2ea1be3866a2ded4def7e31b006/src/rdkafka.h#L340
def self.===(exception)
exception.rdkafka_response == -172
end
end
and then we can:
rescue RdkafkaErrors::ErroneousState => e
Correct me if I'm wrong, but this is already remapped and works as other errors, for example RD_KAFKA_RESP_ERR__ASSIGNMENT_LOST that also should be handled by the end users.
I can see it being used in karafka and racecar for a while already:
https://github.com/karafka/karafka/blob/master/lib/karafka/connection/client.rb#L254
I'd be open to generating exception subclasses for various errors, if we can come up with a way to make that maintainable.
closing. It is being remapped here: https://github.com/karafka/rdkafka-ruby/blob/main/lib/rdkafka/consumer.rb#L406 and I also did not see any reply from @jvortmann for a year on that.
Just to clarify the original intent if this happens to be reopen or re-evaluated:
The intent was to help users of Rdkafka to on specific exceptions for errors coming from the underlying lid without the need to rescue on a generic RdkafkaErrors and them check each particular response.
So instead of
begin
# some code
rescue RdkafkaError => e
if e.rdkafka_response == -172 # do something (not clear here what a -172 represents)
# deal with Errouneous State
else
#check for other error in rdkafka_response
end
end
The user would do something like this
begin
# some code
rescue RdkafkaErrors::ErroneousState => e
# deal with Errouneous State
rescue RdkafkaErrors::ConnectionClosed => e
# deal with Connection Closed
end
@thijsc I believe the mapping to error code => error does not change frequently, so it would be a matter of mapping existing RD_KAFKA_RESP_ERR__* to classes with def self.===(exception) matching the error number.
Then in https://github.com/karafka/rdkafka-ruby/blob/main/lib/rdkafka/consumer.rb#L406 a factory would take care of raising the specific one, something like:
raise Rdkafka::RdkafkaErrorFactory.from(response)
and the factory in a simple example (could be more clever instead of ifs) in the real scenario:
class Rdkafka::RdkafkaErrorFactory
def self.from(response)
if response == -172
RdkafkaErrors::ErroneousState.new(response)
...
end
@mensfeld we ended mapping the ones the rescued from internally and I missed this thread, sorry.
Let me know what you guys think.
Few comments below:
begin
# some code
rescue RdkafkaError => e
if e.rdkafka_response == -172 # do something (not clear here what a -172 represents)
# deal with Errouneous State
else
#check for other error in rdkafka_response
end
end
you don't have to check the response code, you can use code:
begin
# some code
rescue RdkafkaError => e
if e.code == :erroneous_state
# deal with Errouneous State
else
#check for other error in rdkafka_response
end
end
that said I do see your point.
karafka-rdkafka already has an error factory that I plan to backport here and improve. That said I am not sure if this would not break backwards compatibility if we would split errors in inheritance.
Aside from that librdkafka has two types of errors (code and C type based) that are distinct in their behaviours (mostly transactions related), meaning it has to be done well.
I am not saying no but I think it needs to be well planned not to break other people code
Agree