thrift_client icon indicating copy to clipboard operation
thrift_client copied to clipboard

User-defined exceptions handled incorrectly with retries, may be replaced by NoServersAvailable

Open divtxt opened this issue 11 years ago • 2 comments

When the retries option is non-zero, an exception thrown on the server results in the client incorrectly throwing a NoServersAvailable exception instead of the correct exception.

With retries set to 0, the correct exception is thrown:

/Users/div/w/web/hello_service.rb:27:in `recv_pong': bar! (Hello::BadArgumentException)
    from /Users/div/w/web/hello_service.rb:17:in `pong'
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:169:in `block in handled_proxy'
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:150:in `ensure_socket_alignment'
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:169:in `handled_proxy'
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:64:in `pong'
    from pong.rb:36:in `<main>'

With retries set to 1 or higher, this changes to the misleading exception:

/Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:224:in `no_servers_available!': No live servers in [127.0.0.1:10002]. (ThriftClient::NoServersAvailable)
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:145:in `next_live_server'
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:101:in `connect!'
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:163:in `handled_proxy'
    from /Users/div/.rvm/gems/ruby-1.9.3-p194@pagerduty/bundler/gems/thrift_client-b8ac8c65220e/lib/thrift_client/abstract_thrift_client.rb:64:in `pong'
    from pong.rb:36:in `<main>'

Client code:

require 'thrift'
require 'thrift_client'
require_relative "hello_service"

servers = ['127.0.0.1:10002']
options = {
      transport_wrapper: Thrift::FramedTransport,
      connect_timeout: 4,
      timeout: 45,
      retries: 0
}

thrift_client = ThriftClient.new(
    Hello::HelloService::Client, servers, options
)

thrift_client.pong

Thrift API:

namespace java com.example.hello
namespace rb Hello

exception BadArgumentException {
  1: string message
}

service HelloService {
  string pong() throws (1: BadArgumentException a);
}

This is against the 0.9.2 gem.

divtxt avatar Jul 24 '14 23:07 divtxt

On behalf of PagerDuty, I've added a $250 bounty on BountySource for the resolution of this issue: https://www.bountysource.com/issues/3320101-enabling-retries-breaks-exceptions-incorrectly-throws-noserversavailable

chrisgagne avatar Jan 06 '15 19:01 chrisgagne

It looks like this error occurs when the number of retries > number of servers in the list. Otherwise retries number of calls are made before the user defined exception is thrown.

For example, with servers = ['127.0.0.1:10002', '127.0.0.1:11002'], the behavior is:

  • retries=0 - calls one server, and then raises BadArgumentException
  • retries=1 - calls both servers, and then raises BadArgumentException
  • retries=2 - calls both servers and then raises NoServersAvailable

In all cases, the correct behavior should be to raise the BadArgumentException sent by the first server called.

divtxt avatar Feb 11 '15 10:02 divtxt