go-ora

Handle broken pipe gracefully

Open Noon2Dusk opened this issue 2 years ago • 2 comments

If the DB server instance restarts, it would be good for the driver to handle the broken pipe error gracefully and also try to reconnect.

Noon2Dusk avatar May 09 '22 09:05 Noon2Dusk

I think I need to create a custom sql package.

sijms avatar May 10 '22 20:05 sijms

Hi. We have two sql.DB connections in our API simultaneously, one for PostgreSQL using https://github.com/lib/pq and one for Oracle using https://github.com/sijms/go-ora. Unfortunately, the pq driver reconnects and recovers from the panic state, while the Oracle driver does not. I think you should take a look at conn.go and error.go in the pq project.

hmmftg avatar Aug 13 '22 10:08 hmmftg

I think I can now implement failover at the stmt.Query and stmt.Exec level: if the connection fails when a query or exec starts, the driver will reconnect. If the error happens in the fetch phase, during rows.Next() and rows.Scan, the driver will return the error.

sijms avatar Nov 24 '22 18:11 sijms

I added support for failover. Please test it and send your feedback.

sijms avatar Nov 24 '22 23:11 sijms

Hi, we still have the broken pipe problem:

stable state:

[GIN] 2022/11/30 - 10:46:42 | 200 |  159.599297ms |       127.0.0.1 | GET      "/"
[GIN] 2022/11/30 - 10:46:42 | 200 |  160.122278ms |       127.0.0.1 | GET      "/"

unplug network cable and wait 10 minutes:

Error Query: read tcp ip1:34310->ip.server:1521: read: connection reset by peer, <nil>, [], SELECT * FROM table WHERE ROWNUM < 100 
[GIN] 2022/11/30 - 10:50:02 | 500 |         3m13s |       127.0.0.1 | GET      "/"
[GIN] 2022/11/30 - 10:50:02 | 500 |         3m13s |       127.0.0.1 | GET      "/"

plug in the cable:

Error Query: write tcp ip1:34310->ip.server:1521: write: broken pipe, <nil>, [], SELECT * FROM table WHERE ROWNUM < 100 
[GIN] 2022/11/30 - 10:50:06 | 500 |    1.000311ms |       127.0.0.1 | GET      "/"
[GIN] 2022/11/30 - 10:50:06 | 500 |    1.067927ms |       127.0.0.1 | GET      "/"
Error Query: write tcp ip1:34310->ip.server:1521: write: broken pipe, <nil>, [], SELECT * FROM table WHERE ROWNUM < 100 
[GIN] 2022/11/30 - 10:50:12 | 500 |     408.362µs |       127.0.0.1 | GET      "/"
[GIN] 2022/11/30 - 10:50:12 | 500 |      443.56µs |       127.0.0.1 | GET      "/"

hmmftg avatar Nov 30 '22 07:11 hmmftg

When I set up a test environment for failover, I do the following:

  1. Run a simple Go program in debug mode
  2. Stop just after Ping
  3. Go to the server, which is simply a Docker image
  4. Stop the database and then start it up again
  5. At this point I receive the error EOF

So I started by adding the EOF error as a trigger for the failover technique.

I think you are now receiving a different error?

sijms avatar Nov 30 '22 11:11 sijms

If you can identify the type of the error, you can speed up the process by modifying the function _query in file command.go (v2) at line #1601 and line #1615:

if errors.Is(err, io.EOF) || errors.Is(err, <returned error>){

sijms avatar Nov 30 '22 12:11 sijms

So I will not close this issue until we catch all possible errors related to broken pipe.

sijms avatar Nov 30 '22 12:11 sijms

any news here?

sijms avatar Dec 08 '22 16:12 sijms

Hi, I face the same problem (error telling about broken pipe). Here is how you can simulate a failover scenario.

Use docker-compose with two container instances of Oracle DB on different ports. Then you can have a connection string like this: (DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=OFF)(FAILOVER=ON)(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.2.55)(PORT=1521))(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.2.55)(PORT=1522)))(CONNECT_DATA=(SERVICE_NAME=XEPDB1)(SERVER=DEDICATED)))

If you run an example app that executes a query (for example SELECT SYSDATE FROM DUAL) every 10 seconds, you can stop and start either of those two instances and see what happens in the application. Unfortunately, there is no failover.

If you need more details about how to set up the environment, please let me know. Best regards

agunglotto avatar Dec 11 '22 14:12 agunglotto

Sorry, I realized that 2.5.16 has some more fixes. I switched to that version and now it works as expected. Good job!

agunglotto avatar Dec 11 '22 20:12 agunglotto

Is the issue already fixed? I still experience the broken pipe problem with this version: github.com/sijms/go-ora/v2 v2.6.3

How to reproduce the problem:

  1. Run the server
  2. Leave it idle for 1 day
  3. Make a request that uses the database connection
  4. Get the error log err:write tcp 192.168.XX.XX:XXXXX->192.168.XX.XX:XXXX: write: broken pipe..

ducknificient avatar Mar 27 '23 01:03 ducknificient

I am working on it now

sijms avatar Mar 27 '23 03:03 sijms

@kalengbekas would you please test it with the latest version?

sijms avatar Mar 27 '23 22:03 sijms

Okay, I'll be back in a few days. Thank you.

ducknificient avatar Mar 28 '23 01:03 ducknificient

After updating to v2.6.10 and setting SetConnMaxLifetime and SetConnMaxIdleTime, the error is gone. The server has stayed up since then. Thank you.

ducknificient avatar Mar 30 '23 01:03 ducknificient

I added a new option in v2.6.11, RETRYTIME, which is the number of seconds that will pass before the client reconnects.

sijms avatar Mar 30 '23 02:03 sijms