soci icon indicating copy to clipboard operation
soci copied to clipboard

Bug in PG failover callback processing

Open muratovv opened this issue 6 years ago • 2 comments
trafficstars

Hello, Regarding the last changes in https://github.com/SOCI/soci/pull/733, there was decided to wrap callback calls with try/catch. I investigated that the provided solution is not enough. If reconnection is failed, an exception will be thrown here. The exception breaks attempts to reconnect and finished/aborted notifications will be not invoked.

I see some ways to fix the issue with different outcomes. Below I will provide them to maintainer's judgment.

Possible solutions:

  1. Naive

Just wrap reconnection code with try/catch. Pros - it consistent with the current failover_callback interface. Cons - a client will not be notified about the reason of failure. It could be crucial in debugging, and in situations, where the client uses different connection parameters.

  1. Extend failover_callback interface with exception pointer

Considering the cons of the previous item, if we want to notify the client about problems, it seems reasonable to pass a pointer to the exception in failover callback. Signature of failed will be:

 virtual void failed(bool & /* out */ /* retry */, std::string & /* out */ /* newTarget */, const std::exception * /* last exception */) {}
  1. Remove newTarget from failover_callback::failed interface

It is the most controversial solution, but it could be because newTarget doesn't use in Oracle implementation, and restriction of the interface allows catching reconnection without user notification. Probably, it could be better to restrict the interface for now and extend it later when the functionality of reconnection will be more mature.

I hope we will find a suitable solution ASAP. I will be happy to contribute to the project. Thank you.

muratovv avatar May 24 '19 17:05 muratovv

Perhaps, there could be a connection parameter silent_reconnect=true|false or similar, that client can explicitly set to unsubscribe from any notifications and let to reconnect quietly.

TBH, I have not looked into the problem enough to speak of any preference. I think that an acceptable solution could be any that 1) solves the problem for contributor (you) 2) does not modify overall interface too extensively (strict compatibility not required though) and 3) does not break other backends (tests still pass).

Perhaps @vadz would have more restrictions in mind or more insights.

mloskot avatar Jun 03 '19 10:06 mloskot

I have no idea how is failover_callback supposed to work: it looks to me that whatever it does, this function (i.e. details::postgresql_result::check_for_data()) will still throw at the very end, so what's the point of reconnecting in it?

Unfortunately we have no tests for this, but I'm almost sure that this feature is just broken for PostgreSQL. But perhaps I'm missing something here, of course... @msobczak could you please explain how is this supposed to work?

vadz avatar Jun 04 '19 14:06 vadz