epgsql

Crash in parsing state on {$2, <<>>} message

Open tsloughter opened this issue 11 years ago • 4 comments

I ran into a crash today where the pgsql_connection is in the parsing state, and from the sasl logs I see that reply={error, timeout} in the state record. It crashes because it receives the message {$2, <<>>} and that is not a valid message while in the parsing state.

I found it interesting that when parsing times out it continues on in the parsing state:

parsing(timeout, State) ->
    #state{timeout = Timeout} = State,
    Reply = {error, timeout},
    send(State, $S, []),
    {next_state, parsing, State#state{reply = Reply}, Timeout};

Why does it not go to the timeout state like querying?

querying(timeout, State) ->
    #state{sock = Sock, timeout = Timeout, backend = {Pid, Key}} = State,
    pgsql_sock:cancel(Sock, Pid, Key),
    {next_state, timeout, State, Timeout};

tsloughter, Mar 06 '13 01:03

Perhaps this is related. I've been getting the following error as well:

> pgsql:equery(<0.61.0>, <<"delete from test where id=$1 and key=$2;">> , [ <<"id1">> , <<"key1">> ] ).

call to undefined function 
pgsql_connection:parsing({parse,[],<<"delete from test where id=$1 and key=$2;">>,[]},

So I added parsing/3 to pgsql_connection, and this is what it logged for the 3 arguments:

* {parse,[],<<"delete from test_table where id=$1 and key=$2;">>,[]}
* From
* State

I am also using epgsql_pool. Incidentally, I wasn't getting timeouts on individual queries, but when running over a list comprehension of 10,100, I was able to reproduce the timeout (on all nodes). My solution was to call return_connection every time I call get_connection.

Can you think of any obvious reasons why querying would time out?

Update: This issue could be similar to https://github.com/zotonic/zotonic/issues/23 where they diagnosed this as

 @arjan : "I think that this is a problem where a single connection is used by two processes."

~B

bosky101, Mar 13 '13 09:03

Hi, sorry for the delayed response!

@tsloughter are you able to reproduce your problem? A minimal test case would be a huge help in diagnosing the issue. It stays in the parsing state because it sends a sync command and waits for the response before sending a reply to the client.

@bosky101 your issue doesn't look related to timeouts? It does look similar to zotonic's issue, which was multiple processes attempting to use a single connection, and that won't work.

wg, Mar 14 '13 12:03

@wg Appreciate you taking the time to respond. Yes, the zotonic issue helped. I was spawning a process before run_equery, so when I changed this to something less asynchronous, my error was resolved.

Here are some additional stats I logged for testing N writes with various connection handling techniques...

        %% Test(N) performs N inserts

        %%closing connection each time
       ([email protected])15> Test(10).     
       67931
       ([email protected])16> Test(100).    
       703595
       ([email protected])17> Test(1000).   
       7209715
       ([email protected])18> Test(10000).
       76474163

       %%returning connection each time instead of closing
       ([email protected])22> Test(10).             
       13726
       ([email protected])23> Test(100).
       114265
       ([email protected])24> Test(1000).
       990209
       ([email protected])25> Test(10000).
       10014166

       %% re-using connection  ( neither closing nor returning )
       ([email protected])7> Test(10).
       9201 
       ([email protected])8> Test(100).
       88039                                                        
       ([email protected])9> Test(1000).
       880612
       ([email protected])10> Test(10000).
       9582847

@wg Some quick questions, if you don't mind...

  1. I noticed that a lot of repos use epgsql_pool to get_connection and store the result in #state.conn. Will reusing a connection in this fashion be susceptible to the zotonic "two processes, same connection" race condition?

  2. Will these gen_servers also have to monitor the connection and handle 'DOWN' messages? (Otherwise the #state{} becomes stale when the connection is lost, but I haven't seen anyone implement this, e.g. pool_boy or other odbc wrappers around epgsql_pool.)

  3. What happens when one query is in the parsing state on a connection and another query then arrives on the same connection? Is it better to wrap such equery calls between get_connection and return_connection?

~B

bosky101, Mar 14 '13 13:03

@bosky101 if a connection is stored in a process's state then I wouldn't think any other process would have access to it. They definitely would have to handle connection loss though.

I don't write much erlang anymore, but my preference was for creating pools of data accessor processes, each owning a connection. This seemed more in the spirit of erlang's concurrency model vs a connection pool.

wg, Mar 16 '13 05:03
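
The last suggestion in the thread, a data accessor process that owns its connection and also deals with connection loss, could be sketched roughly as below. This is only an illustration and not code from epgsql or epgsql_pool: the module name db_accessor and the shape of the start_link/1 argument are hypothetical, and the sketch assumes that pgsql:connect/4 links the connection process to the caller, so that a lost connection arrives as an 'EXIT' message once exits are trapped.

```erlang
%% Hypothetical accessor process: one gen_server owns one connection.
-module(db_accessor).
-behaviour(gen_server).

-export([start_link/1, equery/3]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-record(state, {conn}).

%% ConnArgs = {Host, User, Password, Opts} (argument shape is an assumption).
start_link(ConnArgs) ->
    gen_server:start_link(?MODULE, ConnArgs, []).

%% All queries go through gen_server:call/2, so they are serialized and
%% only the owning process ever talks to the connection.
equery(Worker, Sql, Params) ->
    gen_server:call(Worker, {equery, Sql, Params}).

init({Host, User, Password, Opts}) ->
    %% Trap exits so that a dying connection shows up in handle_info/2
    %% (assumes the connection process is linked to this process).
    process_flag(trap_exit, true),
    {ok, Conn} = pgsql:connect(Host, User, Password, Opts),
    {ok, #state{conn = Conn}}.

handle_call({equery, Sql, Params}, _From, #state{conn = Conn} = State) ->
    {reply, pgsql:equery(Conn, Sql, Params), State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The connection died: stop, so a supervisor can start a fresh worker
%% instead of this one carrying a stale connection in its state.
handle_info({'EXIT', Conn, Reason}, #state{conn = Conn} = State) ->
    {stop, {connection_lost, Reason}, State#state{conn = undefined}};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, #state{conn = undefined}) ->
    ok;
terminate(_Reason, #state{conn = Conn}) ->
    pgsql:close(Conn).

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

Several such workers could be started under a supervisor, with callers using db_accessor:equery(Worker, Sql, Params) rather than touching a connection directly. Because every query is funneled through the owning process, the "two processes, same connection" situation discussed above should not arise in this design.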