pgbackrest icon indicating copy to clipboard operation
pgbackrest copied to clipboard

Pgbackrest poor performance after upgrading from 2.35 to 2.38

Open jlentzsesco opened this issue 3 years ago • 17 comments
trafficstars

Please provide the following information when submitting an issue (feature requests or general comments can skip this):

  1. pgBackRest version: 2.38

  2. PostgreSQL version: 12.7-2

  3. Operating system/version - if you have more than one server (for example, a database server, a repository host server, one or more standbys), please specify each: RHEL 8.4

  4. Did you install pgBackRest from source or from a package? package

  5. Please attach the following as applicable:

    • pgbackrest.conf file(s) [root@LPGSG305D ~]# cat /etc/pgbackrest/pgbackrest.conf [global] repo1-path=/opt/postgres/pgbackup/pgbackrest repo1-retention-full=2 process-max=12 start-fast=y backup-standby=y log-level-console=info compress-type=lz4 compress-level=6

[dayzerprd] pg1-path=/opt/postgres/pgdata-sesco/data pg2-host= pg2-host-user=postgres pg2-path=/opt/postgres/pgdata-sesco/data

- `postgresql.conf` settings applicable to pgBackRest (`archive_command`, `archive_mode`, `listen_addresses`, `max_wal_senders`, `wal_level`, `port`)

archive_command = 'pgbackrest --stanza=dayzerprd archive-push %p' archive_mode = on listen_addresses = '*' port = 5432 max_wal_senders = 16 wal_level = replica

- errors in the postgresql log file before or during the time you experienced the issue

log contains no specific pgbackrest errors, frequent timeouts due to system being slow. 2022-05-10 07:37:54.274 EDT [1412097] [user=pgwatch2,db=dayzerprd,app=pgwatch2] ERROR: canceling statement due to statement timeout 2022-05-10 07:37:54.973 EDT [1412108] [user=pgwatch2,db=dayzerprd,app=pgwatch2] ERROR: canceling statement due to statement timeout 2022-05-10 07:37:59.354 EDT [1412113] [user=pgwatch2,db=dayzerprd,app=pgwatch2] ERROR: canceling statement due to statement timeout 2022-05-10 07:37:59.994 EDT [1412118] [user=pgwatch2,db=dayzerprd,app=pgwatch2] ERROR: canceling statement due to statement timeout 2022-05-10 07:38:04.359 EDT [1412119] [user=pgwatch2,db=dayzerprd,app=pgwatch2] ERROR: canceling statement due to statement timeout 2022-05-10 07:38:04.366 EDT [1412125] [user=pgwatch2,db=dayzerprd,app=pgwatch2] ERROR: permission denied for function pg_stat_file 2022-05-10 07:38:06.182 EDT [1412124] [user=pgwatch2,db=dayzerprd,app=pgwatch2] ERROR: canceling statement due to statement timeout

- log file in `/var/log/pgbackrest` for the commands run (e.g. `/var/log/pgbackrest/mystanza_backup.log`)

2022-05-08 15:36:34.863 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive 2022-05-08 15:36:35.074 P00 INFO: backup stop archive = 000000010000175B00000052, lsn = 175B/52908558 2022-05-08 15:36:35.075 P00 INFO: check archive for segment(s) 000000010000175900000011:000000010000175B00000052 2022-05-08 15:36:35.758 P00 INFO: new backup label = 20220430-010005F_20220508-010002D 2022-05-08 15:36:36.071 P00 INFO: diff backup size = 1377.9GB, file total = 16100 2022-05-08 15:36:36.073 P00 INFO: backup command end: completed successfully (52594948ms)

-------------------PROCESS START------------------- 2022-05-09 01:00:02.022 P00 INFO: backup command begin 2.38: --backup-standby --compress-level=6 --compress-type=lz4 --exec-id=915023-d949506c --log-level-console=info --pg2-host=lpgsg304d --pg2-host-user=postgres --pg1-path=/opt/postgres/pgdata-sesco/data --pg2-path=/opt/postgres/pgdata-sesco/data --process-max=12 --repo1-path=/opt/postgres/pgbackup/pgbackrest --repo1-retention-full=2 --stanza=dayzerprd --start-fast --type=diff 2022-05-09 01:00:04.068 P00 INFO: last backup label = 20220430-010005F, version = 2.38 2022-05-09 01:00:04.068 P00 INFO: execute non-exclusive pg_start_backup(): backup begins after the requested immediate checkpoint completes 2022-05-09 01:00:04.871 P00 INFO: backup start archive = 000000010000175B00000058, lsn = 175B/58000028 2022-05-09 01:00:04.871 P00 INFO: wait for replay on the standby to reach 175B/58000028 2022-05-09 01:00:05.272 P00 INFO: replay on the standby reached 175B/58000028 2022-05-09 01:00:05.272 P00 INFO: check archive for prior segment 000000010000175B00000057 2022-05-09 01:00:06.209 P00 WARN: file 'base/16399/39612_vm' has same timestamp as prior but different size, enabling delta checksum 2022-05-09 15:38:09.942 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive 2022-05-09 15:38:10.144 P00 INFO: backup stop archive = 000000010000175D000000C9, lsn = 175D/C9F7AE48 2022-05-09 15:38:10.145 P00 INFO: check archive for segment(s) 000000010000175B00000058:000000010000175D000000C9 2022-05-09 15:38:12.036 P00 INFO: new backup label = 20220430-010005F_20220509-010003D 2022-05-09 15:38:12.529 P00 INFO: diff backup size = 1393.9GB, file total = 16106 2022-05-09 15:38:12.531 P00 INFO: backup command end: completed successfully (52690510ms)

-------------------PROCESS START------------------- 2022-05-10 01:00:01.416 P00 INFO: backup command begin 2.38: --backup-standby --compress-level=6 --compress-type=lz4 --exec-id=1304428-0305853f --log-level-console=info --pg2-host=lpgsg304d --pg2-host-user=postgres --pg1-path=/opt/postgres/pgdata-sesco/data --pg2-path=/opt/postgres/pgdata-sesco/data --process-max=12 --repo1-path=/opt/postgres/pgbackup/pgbackrest --repo1-retention-full=2 --stanza=dayzerprd --start-fast --type=diff 2022-05-10 01:00:03.512 P00 INFO: last backup label = 20220430-010005F, version = 2.38 2022-05-10 01:00:03.512 P00 INFO: execute non-exclusive pg_start_backup(): backup begins after the requested immediate checkpoint completes 2022-05-10 01:00:04.114 P00 INFO: backup start archive = 000000010000175E00000081, lsn = 175E/81000028 2022-05-10 01:00:04.114 P00 INFO: wait for replay on the standby to reach 175E/81000028 2022-05-10 01:00:04.515 P00 INFO: replay on the standby reached 175E/81000028 2022-05-10 01:00:04.515 P00 INFO: check archive for prior segment 000000010000175E00000080 2022-05-10 01:00:05.208 P00 WARN: file 'base/16399/39612_vm' has same timestamp as prior but different size, enabling delta checksum 2022-05-10 09:20:38.483 P00 INFO: backup command end: terminated on signal [SIGTERM]

  1. Describe the issue: System performance becomes poor when pgbackrest is running. System ends up with what appears to be normal iowait and load average for a backup, however the system is slow to respond to commands of any type when the backup is running. This started after upgrading from 2.35 to 2.38

jlentzsesco avatar May 10 '22 14:05 jlentzsesco

Looks to me like you have a bad timestamp:

WARN: file 'base/16399/39612_vm' has same timestamp as prior but different size, enabling delta checksum

This is forcing pgBackRest into delta checksum mode (which is slower) for reliability since it looks like your system is not updating timestamps correctly. Any chance you running postgres on NFS? We've seen timestamp issues with NFS in the past.

To resolve this you'll need to make a new full backup.

dwsteele avatar May 11 '22 14:05 dwsteele

Its not NFS, it is a local ext4 filesystem.

@.*** Jasen M. Lentz, M.Ed Lead Systems Administrator Sesco Enterprises, LLC 4977 US Route 30 Greensburg, PA 15601 W: (724) 837-1991 x207 C: (412) 848-5612

From: David Steele @.> Sent: Wednesday, May 11, 2022 10:50 AM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

Looks to me like you have a bad timestamp:

WARN: file 'base/16399/39612_vm' has same timestamp as prior but different size, enabling delta checksum

This is forcing pgBackRest into delta checksum mode (which is slower) for reliability since it looks like your system is not updating timestamps correctly. Any chance you running postgres on NFS? We've seen timestamp issues with NFS in the past.

To resolve this you'll need to make a new full backup.

Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1123877842&data=05%7C01%7Cjlentz%40sescollc.com%7C23973ff5afdb4bfe46fa08da335d8758%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637878774044448093%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=UNwNi5KyRIr49l747o%2BfqCZJVHmVGW2bCjD53kedwh0%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WR7XSMBN6N3TPYRNA3VJPCJTANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7C23973ff5afdb4bfe46fa08da335d8758%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637878774044448093%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=yhrL8UIhEWZQDY97ZL4A%2BZg3jyovcuiHGfxYLzjiR%2Bg%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.@.>>

jlentzsesco avatar May 11 '22 15:05 jlentzsesco

Its not NFS, it is a local ext4 filesystem.

That's a bit worrisome then. The file size changed from the full backup but the timestamp did not. Could be a fluke, but could also indicate a problem with the host.

dwsteele avatar May 11 '22 15:05 dwsteele

David,

We contacted Redhat and they told us to change an option on the mounts, which we did. One file keeps showing up as changing and we have diffed the file and it actually is not changing but pgbackrest thinks it is and throwing it into deltas.

Mon May 16 08:52:09 EDT 2022 -rw------- 1 postgres postgres 8978432 Apr 3 10:56 /opt/postgres/pgdata/data/base/16399/39612_vm Tue May 17 07:47:14 EDT 2022 -rw------- 1 postgres postgres 8978432 Apr 3 10:56 /opt/postgres/pgdata/data/base/16399/39612_vm

@.*** Jasen M. Lentz, M.Ed Lead Systems Administrator Sesco Enterprises, LLC 4977 US Route 30 Greensburg, PA 15601 W: (724) 837-1991 x207 C: (412) 848-5612

From: David Steele @.> Sent: Wednesday, May 11, 2022 11:12 AM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

Its not NFS, it is a local ext4 filesystem.

That's a bit worrisome then. The file size changed from the full backup but the timestamp did not. Could be a fluke, but could also indicate a problem with the host.

Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1123910329&data=05%7C01%7Cjlentz%40sescollc.com%7C3d51c6a744ff4ff681e908da33609782%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637878787205841403%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=iISv4fIWZm8mIQeDz6vcbyxVLjf7EWo0ADqFCuTkwsw%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WQ6A2U72SBUA4G76RTVJPE3ZANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7C3d51c6a744ff4ff681e908da33609782%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637878787205841403%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=T3%2F0RuI4YmF3DvsqpTr6Z9pNUM24El64C2FD%2FvS4ILQ%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.@.>>

jlentzsesco avatar May 17 '22 12:05 jlentzsesco

What was the mount option that red hat had you change...?

What you pasted looks identical, at least to what I'm seeing. What exactly is pgbackrest reporting/logging?

sfrost avatar May 17 '22 13:05 sfrost

They had us add strictatime to the mount options because otherwise the timestams seems to be a bit lazy. This keeps them precise.

The log:

2022-05-15 15:56:28.718 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive 2022-05-15 15:56:28.976 P00 INFO: backup stop archive = 0000000100001783000000BC, lsn = 1783/BCDDF4D0 2022-05-15 15:56:28.977 P00 INFO: check archive for segment(s) 000000010000177B00000011:0000000100001783000000BC 2022-05-15 15:56:31.163 P00 INFO: new backup label = 20220514-010003F_20220515-010003D 2022-05-15 15:56:31.670 P00 INFO: diff backup size = 142.5GB, file total = 16186 2022-05-15 15:56:31.671 P00 INFO: backup command end: completed successfully (53790173ms) -------------------PROCESS START------------------- 2022-05-16 01:00:01.166 P00 INFO: backup command begin 2.38: --backup-standby --compress-level=6 --compress-type=lz4 --exec-id=3763917-1479d94f --log-level-console=info --pg2-host=lpgsg304d --pg2-host-user=postgres --pg1-path=/opt/postgres/pgdata-sesco/data --pg2-path=/opt/postgres/pgdata-sesco/data --process-max=12 --repo1-path=/opt/postgres/pgbackup/pgbackrest --repo1-retention-full=2 --stanza=dayzerprd --start-fast --type=diff 2022-05-16 01:00:03.502 P00 INFO: last backup label = 20220514-010003F, version = 2.38 2022-05-16 01:00:03.502 P00 INFO: execute non-exclusive pg_start_backup(): backup begins after the requested immediate checkpoint completes 2022-05-16 01:00:04.105 P00 INFO: backup start archive = 000000010000178400000055, lsn = 1784/55000028 2022-05-16 01:00:04.105 P00 INFO: wait for replay on the standby to reach 1784/55000028 2022-05-16 01:00:04.506 P00 INFO: replay on the standby reached 1784/55000028 2022-05-16 01:00:04.506 P00 INFO: check archive for prior segment 000000010000178400000054 2022-05-16 01:00:05.208 P00 WARN: file 'base/16399/39612_vm' has same timestamp as prior but different size, enabling delta checksum 2022-05-16 15:26:46.140 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive 2022-05-16 15:26:46.370 P00 INFO: backup stop archive = 00000001000017860000009D, lsn = 1786/9DAAF9C8 2022-05-16 15:26:46.371 P00 INFO: check archive for segment(s) 000000010000178400000055:00000001000017860000009D 2022-05-16 15:26:47.264 P00 INFO: new backup label = 20220514-010003F_20220516-010003D 2022-05-16 15:26:47.672 P00 INFO: diff backup size = 191.6GB, file total = 16215 2022-05-16 15:26:47.674 P00 INFO: backup command end: completed successfully (52006509ms) -------------------PROCESS START------------------- 2022-05-17 01:00:01.190 P00 INFO: backup command begin 2.38: --backup-standby --compress-level=6 --compress-type=lz4 --exec-id=4153370-b6866540 --log-level-console=info --pg2-host=lpgsg304d --pg2-host-user=postgres --pg1-path=/opt/postgres/pgdata-sesco/data --pg2-path=/opt/postgres/pgdata-sesco/data --process-max=12 --repo1-path=/opt/postgres/pgbackup/pgbackrest --repo1-retention-full=2 --stanza=dayzerprd --start-fast --type=diff 2022-05-17 01:00:03.386 P00 INFO: last backup label = 20220514-010003F, version = 2.38 2022-05-17 01:00:03.387 P00 INFO: execute non-exclusive pg_start_backup(): backup begins after the requested immediate checkpoint completes 2022-05-17 01:00:03.989 P00 INFO: backup start archive = 0000000100001786000000AB, lsn = 1786/AB000028 2022-05-17 01:00:03.989 P00 INFO: wait for replay on the standby to reach 1786/AB000028 2022-05-17 01:00:04.390 P00 INFO: replay on the standby reached 1786/AB000028 2022-05-17 01:00:04.390 P00 INFO: check archive for prior segment 0000000100001786000000AA 2022-05-17 01:00:05.209 P00 WARN: file 'base/16399/39612_vm' has same timestamp as prior but different size, enabling delta checksum

@.*** Jasen M. Lentz, M.Ed Lead Systems Administrator Sesco Enterprises, LLC 4977 US Route 30 Greensburg, PA 15601 W: (724) 837-1991 x207 C: (412) 848-5612

From: Stephen Frost @.> Sent: Tuesday, May 17, 2022 9:49 AM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

What was the mount option that red hat had you change...?

What you pasted looks identical, at least to what I'm seeing. What exactly is pgbackrest reporting/logging?

Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1128895417&data=05%7C01%7Cjlentz%40sescollc.com%7C75a5e2f1f54447f9693108da380c057a%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883921540351671%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=eOPYvGL%2B%2B%2BZxsKfe8f7YSg%2BrbhPepMmlh9oGzDbpCLM%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WSN57KX4VFWJX4R6ZLVKOPVLANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7C75a5e2f1f54447f9693108da380c057a%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883921540351671%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=YtXY6%2FDYnVYyKFdt4DLukxhypDoKvypnll5I3uJruoM%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.@.>>

jlentzsesco avatar May 17 '22 13:05 jlentzsesco

Is this a compressed filesystem somehow..? What does a 'du' on that file at both times return?

sfrost avatar May 17 '22 14:05 sfrost

(Also, pgbackrest doesn't use atime, so strictatime shouldn't be making any difference and I seriously doubt that it is)

sfrost avatar May 17 '22 14:05 sfrost

You'd need to look in /opt/postgres/pgbackup/pgbackrest/backup/dayzerprd/20220514-010003F/backup.manifest for base/16399/39612_vm and see what the size/timestamp was when that full backup was made.

dwsteele avatar May 17 '22 14:05 dwsteele

My DBA also noticed that the timestamp is the same within its own server but differs among replicas. He checked on another set of servers we have running 2.38 and they are not having this issue and their timestamps are a minute apart.

-------------------PROCESS START------------------- 2022-05-17 00:00:01.529 P00 INFO: backup command begin 2.38: --backup-standby --compress-level=9 --compress-type=lz4 --exec-id=108930-00f5e3d0 --log-level-console=info --pg2-host=lpgsg302d --pg2-host-user=postgres --pg1-path=/opt/postgres/pgdata-sesco/data --pg2-path=/opt/postgres/pgdata-sesco/data --process-max=16 --repo1-path=/opt/postgres/pgbackup/pgbackrest --repo1-retention-full=2 --stanza=sescoprd --start-fast --type=diff 2022-05-17 00:00:03.690 P00 INFO: last backup label = 20220514-000223F, version = 2.38 2022-05-17 00:00:03.690 P00 INFO: execute non-exclusive pg_start_backup(): backup begins after the requested immediate checkpoint completes 2022-05-17 00:01:42.816 P00 INFO: backup start archive = 000000020000C55A00000006, lsn = C55A/69C2AE50 2022-05-17 00:01:42.816 P00 INFO: wait for replay on the standby to reach C55A/69C2AE50 2022-05-17 00:02:43.983 P00 INFO: replay on the standby reached C55A/69C2AE50 2022-05-17 00:02:43.983 P00 INFO: check archive for prior segment 000000020000C55A00000005 2022-05-17 00:02:46.209 P00 WARN: file 'base/54195/131487_fsm' has same timestamp as prior but different size, enabling delta checksum 2022-05-17 08:00:03.180 P00 INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive 2022-05-17 08:00:04.491 P00 INFO: backup stop archive = 000000020000C5720000000A, lsn = C572/A70CDD58 2022-05-17 08:00:04.496 P00 INFO: check archive for segment(s) 000000020000C55A00000006:000000020000C5720000000A 2022-05-17 08:00:24.320 P00 INFO: new backup label = 20220514-000223F_20220517-000003D 2022-05-17 08:00:25.031 P00 INFO: diff backup size = 876.2GB, file total = 32096 2022-05-17 08:00:25.033 P00 INFO: backup command end: completed successfully (28823505ms)

@.*** Jasen M. Lentz, M.Ed Lead Systems Administrator Sesco Enterprises, LLC 4977 US Route 30 Greensburg, PA 15601 W: (724) 837-1991 x207 C: (412) 848-5612

From: David Steele @.> Sent: Tuesday, May 17, 2022 10:12 AM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

You'd need to look in /opt/postgres/pgbackup/pgbackrest/backup/dayzerprd/20220514-010003F/backup.manifest for base/16399/39612_vm and see what the size/timestamp was when that full backup was made.

Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1128922802&data=05%7C01%7Cjlentz%40sescollc.com%7Cafbb69a8a36645b993c208da380f3de3%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883935368108677%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CEbb1pIiD1Ggz2F2dNPe6kT8EagBxAbYspOr0MHBx94%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WVTU2D6MY37ITA7GULVKOSLZANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7Cafbb69a8a36645b993c208da380f3de3%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883935368108677%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ByZn2Q0q5yAjaIAWqoHHkrkAs1mLduqLIGr8R98ndCs%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.@.>>

jlentzsesco avatar May 17 '22 15:05 jlentzsesco

If possible, it would be good to get a copy of 20220514-010003F/backup.manifest and 20220514-010003F_20220516-010003D/backup.manifest. Then we can compare and see what is going on. It looks like stat might be returning a different size than the actual size as copied.

dwsteele avatar May 17 '22 15:05 dwsteele

I have them, too large to e-mail, how should I get them to you?

@.*** Jasen M. Lentz, M.Ed Lead Systems Administrator Sesco Enterprises, LLC 4977 US Route 30 Greensburg, PA 15601 W: (724) 837-1991 x207 C: (412) 848-5612

From: David Steele @.> Sent: Tuesday, May 17, 2022 11:17 AM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

If possible, it would be good to get a copy of 20220514-010003F/backup.manifest and 20220514-010003F_20220516-010003D/backup.manifest. Then we can compare and see what is going on. It looks like stat might be returning a different size than the actual size as copied.

Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1128999182&data=05%7C01%7Cjlentz%40sescollc.com%7Cd7570d4f5ccb452511a708da38183e52%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883974054700041%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=KAbnRRpyLY00lJx3ZYuf1MtkiMD2lcJ4xkScB0As%2F0E%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WTUHMBANFCLG5HTUZ3VKOZ5PANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7Cd7570d4f5ccb452511a708da38183e52%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883974054700041%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=Bmr5uAVtqkA6g%2BG8Lwa7piNbo4Ig3fnR7pAc%2FlTbgeo%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.@.>>

jlentzsesco avatar May 17 '22 15:05 jlentzsesco

Looks like the file did not change:

20220514-010003F:
pg_data/base/16399/39612_vm={"checksum":"958d083742eb0351a064906a76f5bb9bfebe39eb","repo-size":48050,"size":8978432,"timestamp":1648986341}

20220514-010003F_20220516-010003D:
pg_data/base/16399/39612_vm={"checksum":"958d083742eb0351a064906a76f5bb9bfebe39eb","repo-size":48050,"size":8978432,"timestamp":1648986341}

So it appears we are getting a different value from stat.

What kind of storage is this? Is there compression?

dwsteele avatar May 17 '22 18:05 dwsteele

Three 12TB HDD in a RAID5 hardware config (Dell) ext4 file system with no compression

Get Outlook for iOShttps://aka.ms/o0ukef


From: David Steele @.> Sent: Tuesday, May 17, 2022 2:26:27 PM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

Looks like the file did not change:

20220514-010003F: pg_data/base/16399/39612_vm={"checksum":"958d083742eb0351a064906a76f5bb9bfebe39eb","repo-size":48050,"size":8978432,"timestamp":1648986341}

20220514-010003F_20220516-010003D: pg_data/base/16399/39612_vm={"checksum":"958d083742eb0351a064906a76f5bb9bfebe39eb","repo-size":48050,"size":8978432,"timestamp":1648986341}

So it appears we are getting a different value from stat.

What kind of storage is this? Is there compression?

— Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1129185164&data=05%7C01%7Cjlentz%40sescollc.com%7Cff817567bc894fe80b2808da3832c26e%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637884087921638105%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CDGcGrAfVy5wYPSfUxArEYx39XRYvLhn%2Fcf9%2Bor%2BVBI%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WVVRVE63BOZWV44JS3VKPQFHANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7Cff817567bc894fe80b2808da3832c26e%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637884087921638105%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=zR1JsjgeXoqP2Pwrzgst%2BWQaZjmJjPkBvIz0WxVxlzQ%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.***>

jlentzsesco avatar May 17 '22 19:05 jlentzsesco

... what do you get when you run stat on that file yourself?

sfrost avatar May 17 '22 20:05 sfrost

Actual file:

File: ./39612_vm Size: 8978432 Blocks: 17544 IO Block: 4096 regular file Device: fd09h/64777d Inode: 367276618 Links: 1 Access: (0600/-rw-------) Uid: ( 26/postgres) Gid: ( 26/postgres) Access: 2022-05-17 15:23:17.801224224 -0400 Modify: 2022-04-03 10:56:50.791539668 -0400 Change: 2022-04-03 10:56:50.791539668 -0400 Birth: -

@.*** Jasen M. Lentz, M.Ed Lead Systems Administrator Sesco Enterprises, LLC 4977 US Route 30 Greensburg, PA 15601 W: (724) 837-1991 x207 C: (412) 848-5612

From: David Steele @.> Sent: Tuesday, May 17, 2022 2:26 PM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

Looks like the file did not change:

20220514-010003F:

pg_data/base/16399/39612_vm={"checksum":"958d083742eb0351a064906a76f5bb9bfebe39eb","repo-size":48050,"size":8978432,"timestamp":1648986341}

20220514-010003F_20220516-010003D:

pg_data/base/16399/39612_vm={"checksum":"958d083742eb0351a064906a76f5bb9bfebe39eb","repo-size":48050,"size":8978432,"timestamp":1648986341}

So it appears we are getting a different value from stat.

What kind of storage is this? Is there compression?

Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1129185164&data=05%7C01%7Cjlentz%40sescollc.com%7Cff817567bc894fe80b2808da3832c26e%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637884087921638105%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CDGcGrAfVy5wYPSfUxArEYx39XRYvLhn%2Fcf9%2Bor%2BVBI%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WVVRVE63BOZWV44JS3VKPQFHANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7Cff817567bc894fe80b2808da3832c26e%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637884087921638105%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=zR1JsjgeXoqP2Pwrzgst%2BWQaZjmJjPkBvIz0WxVxlzQ%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.@.>>

jlentzsesco avatar May 17 '22 20:05 jlentzsesco

@.*** pgbackrest]$ grep base/16399/39612_vm /opt/postgres/pgbackup/pgbackrest/backup/dayzerprd/20220514-010003F/backup.manifest pg_data/base/16399/39612_vm={"checksum":"958d083742eb0351a064906a76f5bb9bfebe39eb","repo-size":48050,"size":8978432,"timestamp":1648986341}

looks the same to me

@.*** Jasen M. Lentz, M.Ed Lead Systems Administrator Sesco Enterprises, LLC 4977 US Route 30 Greensburg, PA 15601 W: (724) 837-1991 x207 C: (412) 848-5612

From: David Steele @.> Sent: Tuesday, May 17, 2022 10:12 AM To: pgbackrest/pgbackrest @.> Cc: Jasen Lentz @.>; Author @.> Subject: Re: [pgbackrest/pgbackrest] Pgbackrest poor performance after upgrading from 2.35 to 2.38 (Issue #1746)

CAUTION: This email originated from outside of the organization. Do not click links or open attachments or links unless you recognize the sender and know the content is safe.

You'd need to look in /opt/postgres/pgbackup/pgbackrest/backup/dayzerprd/20220514-010003F/backup.manifest for base/16399/39612_vm and see what the size/timestamp was when that full backup was made.

Reply to this email directly, view it on GitHubhttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpgbackrest%2Fpgbackrest%2Fissues%2F1746%23issuecomment-1128922802&data=05%7C01%7Cjlentz%40sescollc.com%7Cafbb69a8a36645b993c208da380f3de3%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883935368108677%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CEbb1pIiD1Ggz2F2dNPe6kT8EagBxAbYspOr0MHBx94%3D&reserved=0, or unsubscribehttps://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAX4R5WVTU2D6MY37ITA7GULVKOSLZANCNFSM5VRXGRAQ&data=05%7C01%7Cjlentz%40sescollc.com%7Cafbb69a8a36645b993c208da380f3de3%7C56441138158c44b3a53acfe2455b5a86%7C0%7C0%7C637883935368108677%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ByZn2Q0q5yAjaIAWqoHHkrkAs1mLduqLIGr8R98ndCs%3D&reserved=0. You are receiving this because you authored the thread.Message ID: @.@.>>

jlentzsesco avatar Oct 11 '22 08:10 jlentzsesco

It still seems the the C stat() function is returned unexpected results. Since we have not had any other reports of issues since this change I'm inclined to think it is something environmental.

One thing to try would be to put some debug code in and capture the file size that is being returned from stat(). If that value doesn't match what is in the manifest then I don't think there is anything we can do. If it does match then we'll have a look.

dwsteele avatar Dec 26 '22 04:12 dwsteele