Crash in parsing state on {$2, <<>>} message
I ran into a crash today where pgsql_connection is in the parsing state, and from the SASL logs I can see reply={error, timeout} in the state record. It crashes because it receives the message {$2, <<>>} (a BindComplete in the PostgreSQL wire protocol), which is not a valid message while in the parsing state.
I found it interesting that when parsing times out it stays in the parsing state:
parsing(timeout, State) ->
    #state{timeout = Timeout} = State,
    Reply = {error, timeout},
    %% Send a Sync ($S) to the backend, record the timeout as the
    %% pending reply, and stay in the parsing state.
    send(State, $S, []),
    {next_state, parsing, State#state{reply = Reply}, Timeout};
Why does it not go to the timeout state like querying does?
querying(timeout, State) ->
    #state{sock = Sock, timeout = Timeout, backend = {Pid, Key}} = State,
    %% Ask the backend to cancel the running query, then move to the
    %% timeout state.
    pgsql_sock:cancel(Sock, Pid, Key),
    {next_state, timeout, State, Timeout};
Perhaps this is related. I've been getting the following error as well:
> pgsql:equery(<0.61.0>, <<"delete from test where id=$1 and key=$2;">>, [<<"id1">>, <<"key1">>]).
call to undefined function
pgsql_connection:parsing({parse,[],<<"delete from test where id=$1 and key=$2;">>,[]},
So I added a parsing/3 clause to pgsql_connection, and this is what it logged for the three arguments:
* {parse,[],<<"delete from test_table where id=$1 and key=$2;">>,[]}
* From
* State
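For reference, the clause I added was along these lines (a sketch, not epgsql code; gen_fsm dispatches gen_fsm:sync_send_event to StateName/3, which is why the call was undefined before the clause existed, and the reply value here is just a placeholder):

    %% Hypothetical logging clause for sync events that arrive while
    %% the fsm is in the parsing state.
    parsing(Event, From, State) ->
        error_logger:info_msg("parsing/3 Event=~p~nFrom=~p~nState=~p~n",
                              [Event, From, State]),
        %% Placeholder reply; real handling would differ.
        {reply, {error, busy}, parsing, State}.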
I am also using epgsql_pool. Incidentally, I wasn't getting timeouts on individual queries, but when running them over a list comprehension of 10 or 100 I was able to reproduce the timeout (on all nodes). My solution was to call return_connection every time I get_connection.
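The pattern that fixed it for me, roughly (a sketch assuming epgsql_pool's pgsql_pool:get_connection/1 and return_connection/2; the pool name, SQL, and params are whatever your app uses):

    %% Check a connection out of the pool, run one query, and always
    %% return it, so no two processes ever share the same connection.
    with_connection(Pool, Sql, Params) ->
        {ok, C} = pgsql_pool:get_connection(Pool),
        try
            pgsql:equery(C, Sql, Params)
        after
            pgsql_pool:return_connection(Pool, C)
        end.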
Do you have any easily foreseen reasons why querying would time out?
Update: This issue could be similar to https://github.com/zotonic/zotonic/issues/23 where they diagnosed this as
@arjan: "I think that this is a problem where a single connection is used by two processes."
~B
Hi, sorry for the delayed response!
@tsloughter are you able to reproduce your problem? A minimal test case would be a huge help in diagnosing the issue. The reason it stays in the parsing state is that it sends a Sync command and waits for the response before sending a reply to the client.
@bosky101 your issue doesn't look related to timeouts. It does look similar to zotonic's issue, which was multiple processes attempting to use a single connection; that won't work.
@wg Appreciate you taking the time to respond. Yes, the zotonic issue helped. I was spawning a process before each run_equery, so when I changed this to something less asynchronous, my error was resolved.
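Roughly, the change was from spawning a process per query against the same connection to running the queries sequentially in the process that owns it (a sketch; run_equery, Sql, and ArgsList stand in for my actual code):

    %% Before: each query spawned into its own process; several
    %% processes can then hit connection C at once and race in the fsm.
    [spawn(fun() -> run_equery(C, Sql, Args) end) || Args <- ArgsList].

    %% After: queries run one at a time in the process that owns C.
    [run_equery(C, Sql, Args) || Args <- ArgsList].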
Here are some additional stats I logged for testing N writes with various connection handling techniques...
%% Test(N) performs N inserts

%% closing the connection each time
15> Test(10).
67931
16> Test(100).
703595
17> Test(1000).
7209715
18> Test(10000).
76474163

%% returning the connection each time instead of closing
22> Test(10).
13726
23> Test(100).
114265
24> Test(1000).
990209
25> Test(10000).
10014166

%% re-using the connection (neither closing nor returning)
7> Test(10).
9201
8> Test(100).
88039
9> Test(1000).
880612
10> Test(10000).
9582847
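(For context, Test(N) was built along these lines; a sketch, with timer:tc producing the figures above, presumably in microseconds, and insert_one standing in for the actual insert:)

    %% Hypothetical shape of Test(N): time N sequential inserts.
    Test = fun(N) ->
        {Micros, _} = timer:tc(fun() ->
            [insert_one(I) || I <- lists:seq(1, N)]
        end),
        Micros
    end.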
@wg Some quick questions, if you don't mind...
- I noticed that a lot of repos using epgsql_pool get_connection and store the result in #state.conn. Will re-using a connection in this fashion be susceptible to the zotonic "two processes, same connection" race condition?
- Will these gen_servers also have to monitor the connection and handle 'DOWN' messages? (Otherwise #state{} becomes stale when the connection is lost, but I haven't seen anyone implement this, e.g. poolboy or other wrappers around epgsql_pool. See the sketch after this list.)
- What happens when one query is in the parsing state of a connection and another query then comes in? Is it better to wrap such equery calls between get_connection and return_connection?
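To make the second question concrete, this is the kind of monitoring I mean (all names hypothetical; assumes pgsql_pool:get_connection/1 returns {ok, Conn} where Conn is the connection pid):

    -record(state, {pool, conn, mref}).

    %% Own one pooled connection; re-acquire it if its process dies,
    %% so #state.conn never goes stale.
    init(Pool) ->
        {ok, Conn} = pgsql_pool:get_connection(Pool),
        MRef = erlang:monitor(process, Conn),
        {ok, #state{pool = Pool, conn = Conn, mref = MRef}}.

    handle_info({'DOWN', MRef, process, _Conn, _Reason},
                #state{pool = Pool, mref = MRef} = State) ->
        {ok, Conn2} = pgsql_pool:get_connection(Pool),
        MRef2 = erlang:monitor(process, Conn2),
        {noreply, State#state{conn = Conn2, mref = MRef2}}.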
~B
@bosky101 if a connection is stored in a process's state then I wouldn't think any other process would have access to it. They definitely would have to handle connection loss though.
I don't write much Erlang anymore, but my preference was for creating pools of data accessor processes, each owning a connection. This seemed more in the spirit of Erlang's concurrency model than a connection pool.
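Roughly like this (a sketch; the connect arguments and the message protocol are placeholders, using epgsql's pgsql:connect/4 and pgsql:equery/3):

    %% One accessor process per pool slot, each owning its own
    %% connection; callers message the accessor, never touch C directly.
    start_accessor() ->
        spawn_link(fun() ->
            {ok, C} = pgsql:connect("localhost", "user", "pass",
                                    [{database, "mydb"}]),
            accessor_loop(C)
        end).

    accessor_loop(C) ->
        receive
            {equery, From, Sql, Params} ->
                From ! {db_result, pgsql:equery(C, Sql, Params)},
                accessor_loop(C)
        end.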