Performance

Open pdirksen opened this issue 6 years ago • 41 comments

Can we do something about the performance? We currently use a CIFS/SMB volume over a 1 Gbit/s interface. A full backup of a medium-sized VM takes about 2 minutes, whereas a differential backup with xdelta3 takes about 30 minutes.

INFO: status: 70% (15040643072/21474836480), sparse 25% (5543481344), duration 1478, read/write 6/5 MB/s

Since xdelta3 can only use one thread, and it runs with medium compression, it is probably the bottleneck.

pdirksen avatar Mar 19 '18 15:03 pdirksen

Agree

jadsolucions avatar Jul 10 '18 07:07 jadsolucions

Is xdelta3 not threaded at all? That's where I see the bottleneck as well. I have VMs that are 200+GB and can do full backups very quickly, but a differential takes nearly all day because it's stuck in xdelta3 maxing out a single CPU core and not spreading the work out.

sienar1 avatar Jul 23 '18 14:07 sienar1

It's not a multithreading problem but a "network problem". The real problem is vzdump: when your backup starts, it uses your CIFS/SMB share to read the old backup while doing a double write at the same time: the VM snapshot goes into a dat/tmp file to be compared with the old backup, and the delta goes into the vcdiff file...

Kalimeiro avatar Jul 25 '18 05:07 Kalimeiro

I have to disagree. I can run the same differential backups of large VMs on local storage (storage that can handle over 1GB/s of throughput) and the differential backup runs at about 4MB/s. Checking CPU usage, you can see xdelta3 running on only 1 core for hours (18 hours for a 200GB VM specifically). It appears that xdelta3 is very poorly threaded. I've also found multiple other support threads where xdelta3 has been used in commercial products and they suffer from the same issue. If you have a server with many slower cores (such as my Xeon E5-2650L based server), these differential backups are nearly useless for any large source VMs, which is exactly where you need them to perform well.

sienar1 avatar Jul 25 '18 13:07 sienar1

> I have to disagree. I can run the same differential backups of large VMs on local storage.

You say on local storage, but the problem occurs when using CIFS/SMB storage: it reads the old backup from CIFS/SMB, writes the current snapshot to a dat/tmp file on CIFS/SMB, and then compares the old backup with the current snapshot to write the vcdiff file (a double write to CIFS/SMB plus a read from CIFS/SMB). It's a design problem that comes from vzdump, in addition to the ayufan patch (which is not reactive at all).

With local storage I agree, no problem...

Kalimeiro avatar Dec 12 '18 09:12 Kalimeiro

Hi, I have issues with this too... Before applying this patch, the backup took 7 hours... After the patch, it took more than 18 hours! And all of the servers have 16 cores or more... I am using NFS as storage... Something is definitely very wrong.

gilbertoferreira avatar Aug 12 '19 13:08 gilbertoferreira

> Hi, I have issues with this too... Before applying this patch, the backup took 7 hours... After the patch, it took more than 18 hours! And all of the servers have 16 cores or more... I am using NFS as storage... Something is definitely very wrong.

Hi, I have the same issue.

jhusarek avatar Aug 23 '19 09:08 jhusarek

Hi there! I have the same issue here too! The file is indeed smaller, but it slows down the whole vzdump process... What can we do to improve this?

gilbertoferreira avatar Aug 23 '19 11:08 gilbertoferreira

Hi all,

I'm having the same problem here when a differential backup runs.

I'm using the patch for 5.4-5.

Normal backups run at normal speed, but differential backup speed drops drastically.

All backups go to the same NFS storage.

alebeta90 avatar Aug 27 '19 11:08 alebeta90

I ran a few tests on 5GB .vma files. If you add the -3 -B 2147483648 options to xdelta3, you should get noticeably smaller diff copies. The disadvantage is higher memory consumption. As for speed: xdelta3 is single-threaded, and it uses gzip, which is also single-threaded. If you find a way to replace gzip with pigz, your backup should finish a little faster.

marcin-github avatar Aug 30 '19 23:08 marcin-github
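
For readers who want to try the suggested options outside of vzdump, here is a minimal sketch that only builds the xdelta3 command line with the -3 and -B 2147483648 options from the comment above. The file names are hypothetical, and this is a standalone encode invocation, not necessarily how the patch itself calls xdelta3.

```python
from typing import List


def build_xdelta3_encode_cmd(base: str, current: str, delta: str,
                             level: int = 3,
                             source_window: int = 2 * 1024**3) -> List[str]:
    """Return an argv list for a standalone xdelta3 encode run.

    The defaults mirror the options suggested in the thread:
    -3 and -B 2147483648 (2 GiB source window).
    """
    if not 0 <= level <= 9:
        raise ValueError("xdelta3 compression level must be between 0 and 9")
    return [
        "xdelta3",
        "-e",                      # encode, i.e. create a delta
        f"-{level}",               # compression level
        "-B", str(source_window),  # source window size in bytes
        "-s", base,                # source file: the old full backup
        current,                   # input file: the new backup
        delta,                     # output file: the VCDIFF delta
    ]


# Hypothetical file names, for illustration only.
cmd = build_xdelta3_encode_cmd("full.vma", "new.vma", "new.vcdiff")
print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where xdelta3 is installed. As noted in the comment above, a larger source window is expected to cost more memory.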

Perhaps we should compile a newer xdelta3 version... The one on GitHub is a bit older, 3.0.6 I think... There's a new version here: https://github.com/jmacd/xdelta which I think is 3.1 or something.

And here too:

http://xdelta.org/

gilbertoferreira avatar Aug 30 '19 23:08 gilbertoferreira

xdelta3 3.0.11 exists but is not downloaded by the patch installer. See: #34

KlugFR avatar Dec 01 '19 13:12 KlugFR

I have the same performance issue here: a really small machine takes way too long for a differential backup. I'm already running xdelta3 3.0.11, and during the backup routine it only uses one CPU core. The backup target is a CIFS storage service (Storage Box from hetzner.com); full backups take just 1 or 2 minutes and work properly.

I like the differential solution, but with performance this bad I can't let it run on my other Proxmox servers with larger VMs. Is there any advice on how to optimize it?

Backup log:

INFO: starting new backup job: vzdump 101 --all 0 --compress lzo --mailnotification failure --maxfiles 30 --mode snapshot --quiet 1 --mailto [email protected] --fullbackup 30 --node pm103 --storage backup
INFO: doing differential backup against '/mnt/pve/backup/dump/vzdump-qemu-101-2019_12_20-08_45_02.vma.lzo'
INFO: Starting Backup of VM 101 (qemu)
INFO: Backup started at 2019-12-20 08:47:17
INFO: status = running
INFO: update VM 101: -lock backup
INFO: VM Name: fw03.xx.xx.xx
INFO: include disk 'scsi0' 'data:vm-101-disk-0' 60G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/backup/dump/vzdump-qemu-101-2019_12_20-08_45_02.vma.lzo--differential-2019_12_20-08_47_17.vcdiff'
INFO: started backup task '3033d581-542d-4f46-9ce6-d939874c7524'
INFO: status: 10% (6830620672/64424509440), sparse 10% (6820663296), duration 4, read/write 1707/2 MB/s
INFO: status: 11% (7492009984/64424509440), sparse 11% (7237709824), duration 38, read/write 19/7 MB/s
INFO: status: 14% (9460973568/64424509440), sparse 14% (9090572288), duration 57, read/write 103/6 MB/s
INFO: status: 20% (13451329536/64424509440), sparse 20% (13062885376), duration 61, read/write 997/4 MB/s
INFO: status: 23% (15379267584/64424509440), sparse 23% (14967386112), duration 65, read/write 481/5 MB/s
INFO: status: 24% (15470493696/64424509440), sparse 23% (14968123392), duration 77, read/write 7/7 MB/s
INFO: status: 25% (16109076480/64424509440), sparse 23% (15011840000), duration 168, read/write 7/6 MB/s
INFO: status: 26% (16755261440/64424509440), sparse 23% (15303970816), duration 208, read/write 16/8 MB/s
INFO: status: 27% (17412849664/64424509440), sparse 24% (15700697088), duration 236, read/write 23/9 MB/s
INFO: status: 28% (18059034624/64424509440), sparse 24% (16054276096), duration 264, read/write 23/10 MB/s
INFO: status: 29% (18686214144/64424509440), sparse 25% (16454565888), duration 282, read/write 34/12 MB/s
INFO: status: 30% (19336200192/64424509440), sparse 26% (16785715200), duration 307, read/write 25/12 MB/s
INFO: status: 31% (19982385152/64424509440), sparse 26% (17112711168), duration 385, read/write 8/4 MB/s
INFO: status: 32% (20620967936/64424509440), sparse 26% (17158856704), duration 566, read/write 3/3 MB/s
INFO: status: 33% (21270953984/64424509440), sparse 27% (17622482944), duration 612, read/write 14/4 MB/s
INFO: status: 37% (23984472064/64424509440), sparse 31% (20185636864), duration 650, read/write 71/3 MB/s
INFO: status: 51% (33088536576/64424509440), sparse 45% (29283241984), duration 653, read/write 3034/2 MB/s
INFO: status: 53% (34527707136/64424509440), sparse 47% (30698168320), duration 656, read/write 479/8 MB/s
INFO: status: 66% (43034148864/64424509440), sparse 60% (39193022464), duration 659, read/write 2835/3 MB/s
INFO: status: 75% (48851648512/64424509440), sparse 69% (44998873088), duration 662, read/write 1939/3 MB/s
INFO: status: 84% (54398222336/64424509440), sparse 78% (50537148416), duration 665, read/write 1848/2 MB/s
INFO: status: 94% (61080600576/64424509440), sparse 88% (57219432448), duration 668, read/write 2227/0 MB/s
INFO: status: 100% (64424509440/64424509440), sparse 94% (60563333120), duration 670, read/write 1671/0 MB/s
INFO: transferred 64424 MB in 670 seconds (96 MB/s)
INFO: archive file size: 1.64GB
INFO: Finished Backup of VM 101 (00:11:11)
INFO: Backup finished at 2019-12-20 08:58:28
INFO: Backup job finished successfully
TASK OK

ScIT-Raphael avatar Dec 20 '19 07:12 ScIT-Raphael
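
The summary line of the log above reports 96 MB/s, but most of the disk is sparse. As a rough sketch, the status lines can be parsed to see how fast the non-sparse data was actually processed. The regular expression is written against the line format shown in this log only, and the helper names are made up for this example.

```python
import re

# Matches the vzdump status lines as they appear in the log above.
STATUS_RE = re.compile(
    r"status:\s+(?P<pct>\d+)%\s+\((?P<done>\d+)/(?P<total>\d+)\),\s+"
    r"sparse\s+\d+%\s+\((?P<sparse>\d+)\),\s+duration\s+(?P<secs>\d+)"
)


def parse_status(line):
    """Return the numeric fields of a status line, or None if it is not one."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return {key: int(value) for key, value in m.groupdict().items()}


def non_sparse_rate_mb_s(status):
    """Average rate over non-sparse data, in MB/s (1 MB = 10^6 bytes)."""
    if status["secs"] == 0:
        return 0.0
    real_bytes = status["done"] - status["sparse"]
    return real_bytes / status["secs"] / 1e6


# Last status line of the log above.
final = parse_status(
    "INFO: status: 100% (64424509440/64424509440), sparse 94% (60563333120), "
    "duration 670, read/write 1671/0 MB/s"
)
print(f"{non_sparse_rate_mb_s(final):.2f} MB/s over non-sparse data")
```

For this log, about 3.86 GB of non-sparse data in 670 seconds works out to roughly 5.8 MB/s, which is in line with the single-digit write rates shown in the individual status lines.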

My guess is that if you want multi-core, you need to use LZMA and a multi-core compiled version of LZMA.

KlugFR avatar Dec 20 '19 08:12 KlugFR

Thanks for the answer @KlugFR, are there any docs on how this can be done?

ScIT-Raphael avatar Dec 27 '19 11:12 ScIT-Raphael

Hello, confirmed on a 100GB VM: 10 minutes for a full backup, a few hours for a differential. Thanks

vdeville avatar Jun 02 '20 12:06 vdeville

Anything new on how to resolve the issue?

ScIT-Raphael avatar Jun 02 '20 13:06 ScIT-Raphael

@MyTheValentinus @ScIT-Raphael Maybe try it with the recently released zstd?

ayufan avatar Jun 02 '20 13:06 ayufan

Hello, I use zstd on this server. Maybe the problem is with zstd? I recently did a dist-upgrade of this Proxmox (15 days ago). Thanks

vdeville avatar Jun 02 '20 14:06 vdeville

Standard backup, zstd, snapshot mode: 110-130 MB/s

Differential backup on the same target: 6-10 MB/s

vdeville avatar Jun 02 '20 15:06 vdeville
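
To put these two rates in perspective, here is a small back-of-the-envelope sketch. It assumes the 100GB VM mentioned earlier in the thread and uses the midpoints of the two reported ranges (120 MB/s and 8 MB/s); both rates are assumptions made for this calculation, not measurements.

```python
def backup_seconds(size_bytes: float, rate_mb_s: float) -> float:
    """Time needed to process size_bytes at rate_mb_s (1 MB = 10^6 bytes)."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / (rate_mb_s * 1e6)


size = 100e9  # 100 GB VM, as reported earlier in the thread

# Midpoints of the reported ranges, used as assumed average rates.
for label, rate in [("full (zstd)", 120.0), ("differential", 8.0)]:
    secs = backup_seconds(size, rate)
    print(f"{label}: {secs / 60:.0f} min ({secs / 3600:.1f} h)")
```

Under these assumptions the full backup takes roughly 14 minutes and the differential about 3.5 hours, which is consistent with the "10 minutes" versus "a few hours" reported above.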

Completely the same situation. In this current state, it's unusable. I'm using the latest patch with the latest pve-xdelta3 3.0.11

JoeApo108 avatar Jun 04 '20 12:06 JoeApo108

Same here, latest version, still slow as hell :(.

ScIT-Raphael avatar Jun 04 '20 13:06 ScIT-Raphael

Hmm. Is xdelta3 really single-threaded? Can you show the CPU usage of individual processes?

ayufan avatar Jun 04 '20 13:06 ayufan
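
One way to answer this on a Linux host is to read /proc directly. The following is a minimal sketch, assuming the process is named xdelta3 and that /proc is available; it prints the thread count and the accumulated CPU time of each matching process. Tools such as top or htop show the same information interactively.

```python
import os


def parse_proc_stat(stat_text: str) -> dict:
    """Extract a few fields from the contents of /proc/<pid>/stat.

    The command name may contain spaces or parentheses, so the text is
    split on the LAST ')' before the remaining fields are read.
    """
    head, _, tail = stat_text.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = tail.split()  # fields[0] is field 3 (state) in proc(5)
    return {
        "pid": int(pid_str),
        "comm": comm,
        "utime_ticks": int(fields[11]),  # field 14: user-mode CPU time
        "stime_ticks": int(fields[12]),  # field 15: kernel-mode CPU time
        "num_threads": int(fields[17]),  # field 20: number of threads
    }


def find_processes(name: str = "xdelta3") -> list:
    """Return parsed stat entries for all running processes called name."""
    if not os.path.isdir("/proc"):
        return []
    results = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as fh:
                info = parse_proc_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry could not be read
        if info["comm"] == name:
            results.append(info)
    return results


if __name__ == "__main__":
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    for proc in find_processes():
        cpu_s = (proc["utime_ticks"] + proc["stime_ticks"]) / ticks_per_second
        print(f"pid {proc['pid']}: {proc['num_threads']} thread(s), "
              f"{cpu_s:.1f} s of CPU time")
```

If the CPU time of a process with a single thread grows by about one second per second of wall-clock time, that process is saturating one core, which would support the single-thread explanation. If it grows much more slowly, the process is probably waiting on I/O instead.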

When I look at the CPU usage, no core is at 100%, so I'm not sure this is linked to xdelta being single-threaded. Previously, older versions worked fine on a single core.

vdeville avatar Jun 04 '20 13:06 vdeville

Testing with an edited /etc/vzdump.conf: I added zstd: 0 (which uses half of all available cores).

After this, I can see 28 cores in use out of 56. Before, it was just 1. The backup is still in progress at the moment. I will keep you informed.

JoeApo108 avatar Jun 04 '20 14:06 JoeApo108
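
The 28-out-of-56 observation matches the "half of all available cores" behaviour described for zstd: 0. A tiny sketch of that rule follows; the function name is made up for this example, and the exact rounding for odd core counts is an assumption.

```python
def zstd_thread_count(setting: int, cores: int) -> int:
    """Thread count implied by a 'zstd: N' value in /etc/vzdump.conf.

    Assumption: N = 0 means half of the available cores (rounded down
    here, with a minimum of 1); N > 0 is used as the thread count as-is.
    """
    if setting < 0 or cores < 1:
        raise ValueError("setting must be >= 0 and cores must be >= 1")
    if setting == 0:
        return max(1, cores // 2)
    return setting


print(zstd_thread_count(0, 56))
```

Note that this setting only affects the zstd compression stage. It does not change how many threads xdelta3 uses.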

@JoeApo108 Do you already have some feedback? Did the backup go through properly and fast?

ScIT-Raphael avatar Jun 04 '20 18:06 ScIT-Raphael

> @JoeApo108 Do you already have some feedback? Did the backup go through properly and fast?

No, it didn't help at all... still slow. The full backup of 320GB took 1h40m. The diff has been running for 6h and is still processing.

OK, it might have something to do with the source window size (-B switch). When I tested it with a tiny LXC, the full backup took (1.4GiB, 29MiB/s, archive size 410MB) and the diff backup (1.4GiB, 23MiB/s, archive size 700kB).

The huge LXC full backup took (320GiB, 52MiB/s, archive size 160GB) and the diff (???? still in progress).

JoeApo108 avatar Jun 04 '20 19:06 JoeApo108

> Hmm. Is xdelta3 really single-threaded? Can you show the CPU usage of individual processes?

https://imgur.com/a/bgB88ix

JoeApo108 avatar Jun 04 '20 19:06 JoeApo108

Hello, any news? Thanks

vdeville avatar Jun 10 '20 10:06 vdeville

We're having the same problem, using Proxmox v6.2-4 and the latest xdelta3.

ZSTD full backup is quick, differential against it takes hours and hours until it basically stalls. Any solutions?

umm0n avatar Jun 17 '20 23:06 umm0n