barman icon indicating copy to clipboard operation
barman copied to clipboard

Barman long connection timeout to offline or not existing database

Open mvk15 opened this issue 6 years ago • 2 comments

Hello,

We have an issue with long connection timeout to a database which is offline or not exists. It affects commands "barman check server" and "barman diagnose".

For example: "barman check" runs about 2 minutes:

$ time barman check barmandb
Server barmandb:
        WAL archive: FAILED (please make sure WAL shipping is setup)
        check timeout: FAILED (barman check command timed out)

real    2m11.233s
user    0m0.092s
sys     0m0.023s

And some more info in python profiler's log:

$ python -m cProfile -o /tmp/barman_cprofile.log /usr/bin/barman check barmandb
Server barmandb:
        WAL archive: FAILED (please make sure WAL shipping is setup)
        check timeout: FAILED (barman check command timed out)

Thu Nov 15 11:06:33 2018    barman_cprofile.log

         306149 function calls (304162 primitive calls) in 131.972 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  131.973  131.973 /usr/bin/barman:22(<module>)
        1    0.000    0.000  131.914  131.914 /usr/lib/python2.7/dist-packages/barman/cli.py:1145(main)
        1    0.000    0.000  131.908  131.908 /usr/lib/python2.7/dist-packages/argh/helpers.py:53(dispatch)
        1    0.000    0.000  131.908  131.908 /usr/lib/python2.7/dist-packages/argh/dispatching.py:65(dispatch)
        1    0.000    0.000  131.908  131.908 /usr/lib/python2.7/dist-packages/argh/dispatching.py:210(_execute_command)
        1    0.000    0.000  131.902  131.902 /usr/lib/python2.7/dist-packages/argh/dispatching.py:228(_call)
        1    0.000    0.000  131.902  131.902 /usr/lib/python2.7/dist-packages/barman/cli.py:602(check)
        1    0.000    0.000  131.827  131.827 /usr/lib/python2.7/dist-packages/barman/server.py:418(check)
        1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/barman/server.py:550(check_postgres)
      2/1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/barman/remote_status.py:57(get_remote_status)
        1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/barman/server.py:863(fetch_remote_status)
        1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/barman/postgres.py:678(fetch_remote_status)
        1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/barman/postgres.py:213(server_version)
        1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/barman/postgres.py:421(connect)
        1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/barman/postgres.py:147(connect)
        1    0.000    0.000  131.826  131.826 /usr/lib/python2.7/dist-packages/psycopg2/__init__.py:85(connect)
        1  131.826  131.826  131.826  131.826 {psycopg2._psycopg._connect}
        1    0.000    0.000    0.075    0.075 /usr/lib/python2.7/dist-packages/barman/cli.py:997(get_server_list)
        1    0.000    0.000    0.065    0.065 /usr/lib/python2.7/dist-packages/barman/config.py:790(server_names)

System: Debian stretch in kubernetes pod Barman version: 2.5 python-psycopg2: 2.7.5-2.pgdg90+1

barman_cprofile.log barmandb.diagnose.log

mvk15 avatar Nov 15 '18 11:11 mvk15

Found that connection timeout for "psycopg2" can be configured by adding Postgres connection parameter "connect_timeout" https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS

For example if connect_timeout is 2 seconds, check takes about 14 seconds:

conninfo = "host=barman user=barman dbname=barman connect_timeout=2"
streaming_conninfo = "host=barman user=streaming_barman dbname=barman connect_timeout=2"

# time barman check barmandb
Server barmandb:
        WAL archive: FAILED (please make sure WAL shipping is setup)
        PostgreSQL: FAILED
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: FAILED (PostgreSQL version: None, pg_basebackup version: 11.1-1.pgdg90+1))
        pg_receivexlog: OK
        pg_receivexlog compatible: FAILED (PostgreSQL version: None, pg_receivexlog version: 11.1-1.pgdg90+1))
        receive-wal running: FAILED (See the Barman log file for more details)
        archiver errors: OK

real    0m14.638s
user    0m0.324s
sys     0m0.208s

And if connect_timeout is 5 seconds, Barman uses its option "check_timeout" with default value 30 seconds and check takes about 30 seconds

conninfo = "host=barman user=barman dbname=barman connect_timeout=5"
streaming_conninfo = "host=barman user=streaming_barman dbname=barman connect_timeout=5"

# time barman check barmandb
Server barmandb:
        WAL archive: FAILED (please make sure WAL shipping is setup)
        PostgreSQL: FAILED
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: FAILED (PostgreSQL version: None, pg_basebackup version: 11.1-1.pgdg90+1))
        check timeout: FAILED (barman check command timed out)

real    0m31.112s
user    0m0.308s
sys     0m0.224s

mvk15 avatar Nov 16 '18 09:11 mvk15

Hi @mvk15 - just working through the backlog of issues and wanted to query this one, is there still an issue here given it is possible to specify the connection_timeout in the conninfo?

It does seems odd that the barman_check timeout does not fire at the expected 30s in the first case you report, where there is no connection_timeout.

mikewallace1979 avatar Oct 29 '21 16:10 mikewallace1979