wal-g icon indicating copy to clipboard operation
wal-g copied to clipboard

EOF Error During wal-g copy

Open benubois opened this issue 11 months ago • 2 comments

Database name

Postgres 11

Issue description

Describe your problem

When running wal-g copy, from R2 (Cloudflare's S3 compatible storage) to the local filesystem, wal-g reports an unexpected EOF error, then exits:

ERROR: 2025/01/10 18:30:17.143288 unable to copy data to file "/mnt/backup/wal-g/basebackups_005/base_0000000200031258000000FD/tar_partitions/part_115.tar.br": unexpected EOF       

It looks like this comes from folder.go:114:

https://github.com/wal-g/wal-g/blob/f20428f56559d2cb0eb18bca8d846b73273a20d5/pkg/storages/fs/folder.go#L108-L115

I've tried this three times and it failed each time, but on a different segment, so maybe it's retry related?

I'm able to successfully run backup-fetch on this backup.

Thanks for wal-g!

Please provide steps to reproduce

Using wal-g 3.0.4 on Ubuntu 20.04:

WALG_LOG_LEVEL=DEVEL wal-g copy --turbo --backup-name LATEST --from=copy_from.json --to=copy_to.json &> /tmp/walg.log

cat copy_from.json
{
  "AWS_ACCESS_KEY_ID": "***",
  "AWS_ENDPOINT": "https://r2.cloudflarestorage.com",
  "AWS_REGION": "auto",
  "AWS_SECRET_ACCESS_KEY": "***",
  "WALG_S3_PREFIX": "s3://***"
}

cat copy_to.json
{
  "WALG_FILE_PREFIX": "/mnt/backup/wal-g"
}

Logs

DEBUG: 2025/01/10 18:29:22.813426 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_115.tar.br reader
DEBUG: 2025/01/10 18:29:22.813439 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_115.tar.br into 
DEBUG: 2025/01/10 18:29:22.961664 HTTP response code: 200
DEBUG: 2025/01/10 18:29:22.961693 request GET response: 735356676 request: 0
DEBUG: 2025/01/10 18:29:22.961812 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1150.tar.br reader
DEBUG: 2025/01/10 18:29:22.961833 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1150.tar.br into 
INFO: 2025/01/10 18:29:39.852987 Copied 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1138.tar.br' from folder 'wal-g/' to 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1138.tar.br' in fodler ''.
INFO: 2025/01/10 18:29:40.533724 Copied 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1148.tar.br' from folder 'wal-g/' to 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1148.tar.br' in fodler ''.
DEBUG: 2025/01/10 18:29:40.727340 HTTP response code: 200
DEBUG: 2025/01/10 18:29:40.727367 request GET response: 739756879 request: 0
DEBUG: 2025/01/10 18:29:40.727486 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1152.tar.br reader
DEBUG: 2025/01/10 18:29:40.727513 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1152.tar.br into 
DEBUG: 2025/01/10 18:29:40.997574 HTTP response code: 200
DEBUG: 2025/01/10 18:29:40.997606 request GET response: 748661721 request: 0
DEBUG: 2025/01/10 18:29:40.997762 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1151.tar.br reader
DEBUG: 2025/01/10 18:29:40.997786 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1151.tar.br into 
INFO: 2025/01/10 18:29:50.488169 Copied 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1144.tar.br' from folder 'wal-g/' to 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1144.tar.br' in fodler ''.
DEBUG: 2025/01/10 18:29:50.753996 HTTP response code: 200
DEBUG: 2025/01/10 18:29:50.754030 request GET response: 731013926 request: 0
DEBUG: 2025/01/10 18:29:50.754144 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1153.tar.br reader
DEBUG: 2025/01/10 18:29:50.754169 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1153.tar.br into 
INFO: 2025/01/10 18:29:55.411925 Copied 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1147.tar.br' from folder 'wal-g/' to 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1147.tar.br' in fodler ''.
INFO: 2025/01/10 18:29:55.481981 Copied 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1140.tar.br' from folder 'wal-g/' to 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1140.tar.br' in fodler ''.
DEBUG: 2025/01/10 18:29:55.671837 HTTP response code: 200
DEBUG: 2025/01/10 18:29:55.671868 request GET response: 759369840 request: 0
DEBUG: 2025/01/10 18:29:55.671990 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1155.tar.br reader
DEBUG: 2025/01/10 18:29:55.672013 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1155.tar.br into 
DEBUG: 2025/01/10 18:29:55.688429 HTTP response code: 200
DEBUG: 2025/01/10 18:29:55.688456 request GET response: 734843317 request: 0
DEBUG: 2025/01/10 18:29:55.688580 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1154.tar.br reader
DEBUG: 2025/01/10 18:29:55.688608 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1154.tar.br into 
INFO: 2025/01/10 18:30:11.082279 Copied 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1149.tar.br' from folder 'wal-g/' to 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1149.tar.br' in fodler ''.
DEBUG: 2025/01/10 18:30:11.278822 HTTP response code: 200
DEBUG: 2025/01/10 18:30:11.278861 request GET response: 729428233 request: 0
DEBUG: 2025/01/10 18:30:11.279074 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1156.tar.br reader
DEBUG: 2025/01/10 18:30:11.279103 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1156.tar.br into 
INFO: 2025/01/10 18:30:12.208797 Copied 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1152.tar.br' from folder 'wal-g/' to 'basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1152.tar.br' in fodler ''.
DEBUG: 2025/01/10 18:30:12.387338 HTTP response code: 200
DEBUG: 2025/01/10 18:30:12.387364 request GET response: 742703073 request: 0
DEBUG: 2025/01/10 18:30:12.387455 fetched object basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1157.tar.br reader
DEBUG: 2025/01/10 18:30:12.387472 Put basebackups_005/base_0000000200031258000000FD/tar_partitions/part_1157.tar.br into 
ERROR: 2025/01/10 18:30:17.143193 unable to copy data to file "/mnt/backup/wal-g/basebackups_005/base_0000000200031258000000FD/tar_partitions/part_115.tar.br": unexpected EOF
DEBUG: 2025/01/10 18:30:17.143244 unable to copy data to file "/mnt/backup/wal-g/basebackups_005/base_0000000200031258000000FD/tar_partitions/part_115.tar.br": unexpected EOF
ERROR: 2025/01/10 18:30:17.143288 unable to copy data to file "/mnt/backup/wal-g/basebackups_005/base_0000000200031258000000FD/tar_partitions/part_115.tar.br": unexpected EOF

benubois avatar Jan 10 '25 19:01 benubois

Hi! Ugh... we do not have retries for filesystem... do you have enough disk quota?

x4m avatar Jan 11 '25 16:01 x4m

Yup! That partition is 2TB, the backup is around 800GB.

benubois avatar Jan 12 '25 14:01 benubois