
Use the option to skip fsync when running pg_basebackup, where appropriate, after the merge with PG 10.

Open paul-guo- opened this issue 5 years ago • 7 comments

https://github.com/greenplum-db/gpdb/pull/8910 inspired another enhancement for testing.

Since PG10, pg_basebackup supports an option to skip fsync. So once we have merged with PG10, we could modify some related code to use it (e.g. the create demo cluster script, the isolation2 test script, and the gpMgmt utilities) to speed up installation and testing.

paul-guo- avatar Oct 25 '19 03:10 paul-guo-

In general we run with fsync off during testing, as the demo cluster is created with fsync off. Do we need to do more?

ashwinstar avatar Oct 25 '19 04:10 ashwinstar

To the best of my knowledge, that refers to the fsync GUC, which affects the kernel code only, not pg_basebackup.

paul-guo- avatar Oct 25 '19 05:10 paul-guo-

I agree with @paul-guo-: the fsync GUC has no effect on the fsync operation performed by pg_basebackup at the end of extracting the tar file at the destination. One code path is: BaseBackup() --> fsync_pgdata() --> file_utils.c:fsync_fname. The relevant option is:

  -N, --no-sync          do not wait for changes to be written safely to disk

Currently, in our tests, pg_basebackup invokes open(), fsync() and close() for each file within PGDATA and the user-defined tablespace directories, if any. Avoiding all of that work seems like a good performance benefit in CI.
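
To make the cost concrete, here is a minimal Python sketch of the per-file open(), fsync(), close() pattern described above. It is only an illustration, not the actual C implementation in file_utils.c: the function name `fsync_tree` and its `no_sync` parameter are hypothetical, and the sketch ignores details such as syncing directories.

```python
import os


def fsync_tree(root, no_sync=False):
    """Open, fsync, and close every regular file under root.

    Returns the number of files synced. With no_sync=True the walk is
    skipped entirely, which is roughly the work that --no-sync avoids.
    (Hypothetical helper for illustration only.)
    """
    if no_sync:
        return 0
    synced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # One open/fsync/close round trip per file.
            fd = os.open(os.path.join(dirpath, name), os.O_RDWR)
            try:
                os.fsync(fd)
                synced += 1
            finally:
                os.close(fd)
    return synced
```

For a data directory with many thousands of files, the loop above issues one fsync() per file, which is the overhead the option would let CI skip.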

asimrp avatar Oct 25 '19 06:10 asimrp

This message [1] states that the per-file fsync() is not needed even in production mode and should be removed. For reference, I found that this was implemented via

commit 522baf14847a7e4cc97c49c7b1c28d21bc33921f
Author: Michael Paquier <[email protected]>
Date:   Wed Sep 4 13:21:11 2019 +0900

    Delay fsyncs of pg_basebackup until the end of backup

[1] https://www.postgresql.org/message-id/20190127125857.GB4672%40paquier.xyz

ashwinstar avatar Dec 28 '19 02:12 ashwinstar

I've pushed the fix for pg_basebackup in the isolation2 tests in https://github.com/greenplum-db/gpdb/pull/11342. For the enhancement in demo cluster creation, we need to modify the Python utility code. That fix needs more effort, and I do not think it is a high priority for me or perhaps for others.

paul-guo- avatar Jan 12 '21 01:01 paul-guo-

the -N arg of pg_basebackup can disable the fsync now on master. closing this issue.

lij55 avatar Aug 04 '22 23:08 lij55

> the -N arg of pg_basebackup can disable the fsync now on master. closing this issue.

Does the code in /gpMgmt/bin/gppylib/commands/gp.py:PgBaseBackup take advantage of this new option?

> For the enhancement in demo cluster creation we need to modify the python utility code. That fix needs more efforts and I do not think it is high priority for me or maybe others.

I cannot find code that implements the improvement mentioned above by @paul-guo- in https://github.com/greenplum-db/gpdb/issues/8913#issuecomment-758337181
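
As a rough sketch of what such a change might look like, a command builder could append the flag only when the caller asks for it. The helper name `build_basebackup_cmd` and its parameters are hypothetical and do not reflect the actual gppylib PgBaseBackup class; only the pg_basebackup flags themselves (-D, -h, -p, --no-sync) are real.

```python
def build_basebackup_cmd(target_dir, host, port, no_sync=False):
    """Build a pg_basebackup argument list.

    Hypothetical helper for illustration; the real gppylib wrapper
    takes different arguments.
    """
    cmd = ["pg_basebackup", "-D", target_dir, "-h", host, "-p", str(port)]
    if no_sync:
        # Available since PG10: do not wait for changes to be
        # written safely to disk.
        cmd.append("--no-sync")
    return cmd
```

Keeping the default at `no_sync=False` would leave production behavior unchanged, so only test and demo cluster callers would opt in.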

kainwen avatar Aug 04 '22 23:08 kainwen