jamdb_oracle
jamdb_oracle copied to clipboard
Connection pool is exhausted, disconnects hang forever.
We are using 0.5.0 of the library, via Ecto.
We have an application that makes a query at the beginning of an HTTP request.
For whatever reason, making a number of requests in quick succession (not how our users are doing this, but it was how I could repro an error state) can cause connections in the connection pool to get stuck in the disconnection process.
I think this is because of https://github.com/erlangbureau/jamdb_oracle/blob/master/src/jamdb_oracle.erl#L25
I would like to lobby that we change the call timeout to be either configurable or not infinity. If the process for disconnection takes too long, I would prefer to "let it crash". My application will reconnect, and the Oracle DBAs downstream can handle a couple of ungraceful shutdowns, I'd argue.
I can make this PR, but if I do, it would likely be that I'd set the timeout to be 5000 milliseconds.
Hi,
You can specify timeout in config:
config :friends, Friends.Repo, database: "friends", username: "user", password: "pass", hostname: "localhost", timeout: 500
Value will be passed on to connect fun: https://github.com/erlangbureau/jamdb_oracle/blob/master/lib/jamdb_oracle.ex#L70 https://github.com/erlangbureau/jamdb_oracle/blob/master/src/jamdb_oracle_conn.erl#L33
Try to find problem line by adding erlang:display to disconnect fun in jamdb_oracle_conn.erl
-spec disconnect(state()) -> {ok, [env()]}.
disconnect(State) ->
erlang:display(100),
disconnect(State, 1).
-spec disconnect(state(), timeout()) -> {ok, [env()]}.
disconnect(#oraclient{socket=Socket, env=Env, passwd=Passwd}, 0) ->
erlang:display(101),
sock_close(Socket),
erlang:display(102),
exit(Passwd, ok),
erlang:display(103),
{ok, Env};
disconnect(#oraclient{conn_state=connected, socket=Socket, env=Env, passwd=Passwd} = State, _Tout) ->
erlang:display(104),
_ = send_req(close, State),
erlang:display(105),
sock_close(Socket),
erlang:display(106),
exit(Passwd, ok),
erlang:display(107),
{ok, Env};
disconnect(#oraclient{env=Env}, _Tout) ->
erlang:display(108),
{ok, Env}.
Infinite timeouts are a bug in Erlang. You should not use them in your library when you have user provided timeouts already passed in.
In our case Timeout is mostly used in gen_tcp and has less effect in gen_server https://www.erlang.org/doc/man/gen_tcp.html#recv-3 https://www.erlang.org/doc/man/gen_server.html#call-2
In gen_server:call default timeout is 5000 Here timeout will be 5000
- call_infinity(Pid, stop).
+ gen_server:call(Pid, stop).
Btw, you can change timeout anytime calling internal command TIMEOUT https://github.com/erlangbureau/jamdb_oracle/blob/master/src/jamdb_oracle_conn.erl#L124
Ecto.Adapters.SQL.query(YourApp.Repo, "timeout", [5000])
I’m trying to tell you that your library isn’t using it right. I have connection pool processes that are hanging in the disconnection function because of your timeout on infinity
I know this is the case because the only change I made to fix this is the PR I’ve submitted
Is this enough?
- call_infinity(Pid, stop).
+ gen_server:call(Pid, stop).
You need to be able to pass in the user provided timeout, as well.
Is timeout in config not working properly?
config :friends, Friends.Repo, database: "friends", username: "user", password: "pass", hostname: "localhost", timeout: 500
It should be passed into the sql_query function as well. And then the gen_server call.
On Nov 18, 2022, at 2:49 AM, Mykhailo Vstavskyi @.***> wrote:
Is timeout in config not working properly?
config :friends, Friends.Repo, database: "friends", username: "user", password: "pass", hostname: "localhost", timeout: 500
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.
Please try stage branch commit
Why make all of these commits when I have a PR already open for this?
I'd like to close the issue with disconnect. Can you give me the working example? Disconnect on L82 doesn't work for me. https://github.com/erlangbureau/jamdb_oracle/blob/master/lib/jamdb_oracle.ex#L82
MyRepo.disconnect_all(1000)
DBConnection.disconnect_all(pid, 1000)
L82 is never reached.
Jamdb.Oracle.disconnect(nil, pid)
** (FunctionClauseError) no function clause matching in DBConnection.ConnectionPool.handle_call/3
Modified function is not working
def disconnect(_err, s) do
query(s, 'CLOSE')
:ok
end
This is working
Ecto.Adapters.SQL.query(YourApp.Repo, "close", [])
My PR is to close the issue
I used my actual working code to verify the change
@vstavskyi Your example about the disconnect/2
function doesn't make sense to me. That code doesn't exist in master
or in my PR.
Your changes to propagate the connect timeout still doesn't respect the individual user's timeout parameter. I have updated my PR so that it can merge with master
.