gpdb
Use the option to skip fsync when running pg_basebackup, where needed, after we merge with PG 10.
https://github.com/greenplum-db/gpdb/pull/8910 inspired another enhancement for testing.
Since PG10, pg_basebackup supports an option to skip fsync. Once we have merged with PG10, we could modify the related code (e.g. the demo cluster creation script, the isolation2 test script, and the gpMgmt utilities) to use it and speed up installation and testing.
We generally run with fsync off during testing, since the demo cluster is created with fsync off. Do we need to do more?
To the best of my knowledge, that refers to the fsync GUC, which affects only the kernel code, not pg_basebackup.
I agree with @paul-guo-: the fsync GUC has no effect on the fsync operations that pg_basebackup performs after it finishes extracting the tar file at the destination. One code path is: BaseBackup() --> fsync_pgdata() --> file_utils.c:fsync_fname. And the option is:
-N, --no-sync do not wait for changes to be written safely to disk
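As a rough illustration of how a test script could pass this option, here is a minimal Python sketch. The helper name build_basebackup_cmd and its parameters are hypothetical and are not taken from gpdb; only the pg_basebackup flags themselves (-D, -h, -p, -X, --no-sync) are real.

```python
import shlex


def build_basebackup_cmd(target_dir, host, port, no_sync=False):
    """Return the argv list for a pg_basebackup run (hypothetical helper)."""
    cmd = [
        "pg_basebackup",
        "-D", target_dir,   # destination data directory
        "-h", host,         # source server host
        "-p", str(port),    # source server port
        "-X", "stream",     # stream WAL while the backup runs
    ]
    if no_sync:
        # --no-sync (short form -N) is available in pg_basebackup since PG10.
        cmd.append("--no-sync")
    return cmd


if __name__ == "__main__":
    # Print the command line instead of running it, so no server is needed.
    print(shlex.join(build_basebackup_cmd("/tmp/standby", "localhost", 5432, no_sync=True)))
```

Keeping the flag behind a boolean means test and CI callers can opt in while production callers keep the default durable behavior.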
Currently, in our tests, pg_basebackup invokes open(), fsync() and close() for each file within PGDATA and the user-defined tablespace directories, if any. Avoiding all of that work should be a good performance benefit in CI.
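To make the per-file cost concrete, here is a simplified Python sketch of that open()/fsync()/close() pattern. It is not the C code in file_utils.c: the function name fsync_tree is hypothetical, and the sketch only handles regular files, without syncing directories or following tablespace links.

```python
import os


def fsync_tree(root):
    """Open, fsync, and close every file under root; return how many were synced."""
    synced = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            fd = os.open(path, os.O_RDWR)   # one open() per file
            try:
                os.fsync(fd)                # one fsync() per file
            finally:
                os.close(fd)                # one close() per file
            synced += 1
    return synced
```

The number of system calls grows with the number of files in the data directory, and that is the work --no-sync skips.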
This message [1] states that the per-file fsync() is not needed even in production mode and should be removed. For reference, I found that this was implemented via:
commit 522baf14847a7e4cc97c49c7b1c28d21bc33921f
Author: Michael Paquier <[email protected]>
Date: Wed Sep 4 13:21:11 2019 +0900
Delay fsyncs of pg_basebackup until the end of backup
[1] https://www.postgresql.org/message-id/20190127125857.GB4672%40paquier.xyz
I've pushed the fix for pg_basebackup in the isolation2 tests in https://github.com/greenplum-db/gpdb/pull/11342. For the enhancement to demo cluster creation, we need to modify the Python utility code. That fix needs more effort, and I do not think it is a high priority for me or, perhaps, for others.
The -N argument of pg_basebackup can now disable the fsync on master. Closing this issue.
Does the code in /gpMgmt/bin/gppylib/commands/gp.py:PgBaseBackup take advantage of this new option?
> For the enhancement in demo cluster creation we need to modify the python utility code. That fix needs more efforts and I do not think it is high priority for me or maybe others.
I cannot find the code that implements the improvement mentioned above by @paul-guo- in https://github.com/greenplum-db/gpdb/issues/8913#issuecomment-758337181