
Improve performance for download or restore remote backup

Open malcolm061990 opened this issue 3 years ago • 10 comments

Maybe it's related to #142, but in any case the download/restore-from-remote speed is slower than the upload/create_remote speed.

Here is the configuration:

general:
  remote_storage: s3
  max_file_size: 1099511627776
  disable_progress_bar: false
  backups_to_keep_local: 2
  backups_to_keep_remote: 30
  log_level: info
  allow_empty_backups: false
  download_concurrency: 255
  upload_concurrency: 255
clickhouse:
  username: default
  password: ""
  host: ip-10-40-2-21
  port: 9000
  disk_mapping: {}
  skip_tables:
  - system.*
  timeout: 5m
  freeze_by_part: false
  secure: false
  skip_verify: false
  sync_replicated_tables: true
  skip_sync_replica_timeouts: true
  log_sql_queries: false
s3:
  access_key: access_key
  secret_key: secret_key
  bucket: bucket
  path: path
  endpoint: ""
  region: us-east-1
  acl: private
  force_path_style: false
  disable_ssl: false
  part_size: 0
  compression_level: 1
  compression_format: tar
  sse: ""
  disable_cert_verification: false
  storage_class: STANDARD
  concurrency: 255
api:
  listen: localhost:7171
  enable_metrics: true
  enable_pprof: false
  username: ""
  password: ""
  secure: false
  certificate_file: ""
  private_key_file: ""
  create_integration_tables: false

The test dataset is almost 14 GB. create_remote works in parallel, cool and fast:

time clickhouse-backup --config /opt/clickhouse/clickhouse-backup/config.yml create_remote full_ch_backup_2021-11-04-TEST
...
2021/11/04 06:10:59  info done                      backup=full_ch_backup_2021-11-04-TEST duration=17.511s operation=upload size=13.91GiB

real	0m18.537s
user	1m54.922s
sys	0m28.236s

But download/restore from remote takes much more time than uploading to remote. As you can see, it's almost 6 times longer:

time clickhouse-backup --config /opt/clickhouse/clickhouse-backup/config.yml download full_ch_backup_2021-11-04-TEST
...
2021/11/04 06:16:00  info done                      backup=full_ch_backup_2021-11-04-TEST duration=1m53.003s operation=download size=13.89GiB

real	1m53.025s
user	0m20.756s
sys	0m38.276s

I changed download_concurrency and part_size to different values with no effect. How can the download speed be increased?

malcolm061990 avatar Nov 04 '21 10:11 malcolm061990

I really appreciate your detailed report and feedback.

Could you provide more context? Which clickhouse-backup version do you use for the benchmark? Which environment do you use for the benchmark? Do you use AWS S3 or maybe another implementation like MinIO?

Be careful with big values of S3_CONCURRENCY in combination with DOWNLOAD_CONCURRENCY / UPLOAD_CONCURRENCY. It can allocate a lot of memory for buffers.

The difference between the UPLOAD and DOWNLOAD processes, from a parallelization point of view:

during upload

  • for each table, a parallel pool -> for each data part, a parallel pool -> upload to one db/table/disk_name_{num}.tar file + parallel upload of db/table/metadata.json

during download

  • sequentially download metadata.json for each table (a point for optimization in the near future!!!)
  • for each table, a parallel pool -> for each disk_name_X.tar file, pool download and unpack on stream (looks like it could be optimized with github.com/aws/aws-sdk-go/service/s3/s3manager NewDownloader instead of GetObjectRequest); a rough sketch of this two-level pool is below
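
Roughly, the two-level download pool looks like the sketch below. This is only an illustration, not the actual clickhouse-backup code; tableMeta, fetchAndUnpack and the use of golang.org/x/sync/errgroup are assumptions for the example.

package sketch

import (
	"golang.org/x/sync/errgroup"
)

// Illustrative shapes only, not real clickhouse-backup types.
type tableMeta struct {
	Database, Table string
	Files           []string // e.g. ["default_1.tar", "default_2.tar"]
}

// downloadBackup mirrors the structure described above: a bounded pool over
// tables, and inside each table another bounded pool over its archive files,
// each of which is streamed and unpacked as it arrives.
func downloadBackup(tables []tableMeta, concurrency int,
	fetchAndUnpack func(t tableMeta, file string) error) error {

	outer := new(errgroup.Group)
	outer.SetLimit(concurrency) // roughly download_concurrency tables at once

	for _, t := range tables {
		t := t
		outer.Go(func() error {
			inner := new(errgroup.Group)
			inner.SetLimit(concurrency) // concurrent archive files per table
			for _, f := range t.Files {
				f := f
				inner.Go(func() error { return fetchAndUnpack(t, f) })
			}
			return inner.Wait()
		})
	}
	return outer.Wait()
}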

Slach avatar Nov 04 '21 12:11 Slach

Which clickhouse-backup version do you use for the benchmark?

clickhouse-backup -v
Version:	 1.2.1
Git Commit:	 38cac6b647f46c3e076650d574eb1f2fb8c3ecf0
Build Date:	 2021-10-30

Which environment do you use for the benchmark? Do you use AWS S3 or maybe another implementation like MinIO?

I use AWS S3.

Be careful with big values of S3_CONCURRENCY in combination with DOWNLOAD_CONCURRENCY / UPLOAD_CONCURRENCY. It can allocate a lot of memory for buffers.

Thanks for that. For example, I have machines with 16 CPUs and 128 GB of RAM. What are the recommended values for S3_CONCURRENCY, DOWNLOAD_CONCURRENCY and UPLOAD_CONCURRENCY?

during download

  • sequentially download metadata.json for each table (a point for optimization in the near future!!!)
  • for each table, a parallel pool -> for each disk_name_X.tar file, pool download and unpack on stream (looks like it could be optimized with github.com/aws/aws-sdk-go/service/s3/s3manager NewDownloader instead of GetObjectRequest)

metadata.json files are pretty small, so I don't think downloading them is a problem. What I see here is that the file for a big table is downloaded slowly and not in parallel. For example: we have a "big" 7 GB table. When uploading to remote, the progress bar shows this happening in parallel, so it's very fast, but when downloading, the progress bar shows other files being downloaded in parallel while this big file is downloaded sequentially. Hope my explanation is clear for you :)

malcolm061990 avatar Nov 04 '21 12:11 malcolm061990

I have machines with 16 CPUs and 128 GB of RAM. What are the recommended values for S3_CONCURRENCY, DOWNLOAD_CONCURRENCY and UPLOAD_CONCURRENCY?

Check how much memory is allocated, try DOWNLOAD_CONCURRENCY=8 UPLOAD_CONCURRENCY=8 S3_CONCURRENCY=4, and compare the results.
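
For reference, assuming those environment variables map to the YAML keys from the config above in the usual way, that would be:

general:
  download_concurrency: 8
  upload_concurrency: 8
s3:
  concurrency: 4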

Slach avatar Nov 04 '21 12:11 Slach

Check how much memory is allocated, try DOWNLOAD_CONCURRENCY=8 UPLOAD_CONCURRENCY=8 S3_CONCURRENCY=4, and compare the results.

Ok, will check it.

Sorry, I edited my last answer to add: metadata.json files are pretty small, so I don't think downloading them is a problem. What I see here is that the file for a big table is downloaded slowly and not in parallel. For example: we have a "big" 7 GB table. When uploading to remote, the progress bar shows this happening in parallel, so it's very fast, but when downloading, the progress bar shows other files being downloaded in parallel while this big file is downloaded sequentially. Hope my explanation is clear for you :)

malcolm061990 avatar Nov 04 '21 12:11 malcolm061990

@malcolm061990 any results with lower concurrency numbers?

I tried to apply a multipart concurrent download implementation; unfortunately, it requires allocating additional disk space during download, and we can't apply in-memory streaming decompression.

You can try to combine clickhouse-backup create with rclone sync (see https://rclone.org for details), or use https://github.com/restic/restic to make incremental backups of the /var/lib/clickhouse/backup/ folder.
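
For example, roughly like this (myS3 is a placeholder rclone remote you would configure first; the bucket, path and backup name are taken from the config and example above):

clickhouse-backup --config /opt/clickhouse/clickhouse-backup/config.yml create full_ch_backup_2021-11-04-TEST
rclone sync /var/lib/clickhouse/backup/full_ch_backup_2021-11-04-TEST myS3:bucket/path/full_ch_backup_2021-11-04-TEST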

Slach avatar Nov 08 '21 10:11 Slach

@malcolm061990 any results with lower concurrency numbers?

I tried to apply a multipart concurrent download implementation; unfortunately, it requires allocating additional disk space during download, and we can't apply in-memory streaming decompression.

You can try to combine clickhouse-backup create with rclone sync (see https://rclone.org for details), or use https://github.com/restic/restic to make incremental backups of the /var/lib/clickhouse/backup/ folder.

Sorry, for now I can't test the speed because our CH is under a load test. Will get back to that soon, thanks. But why does it require additional disk space during download?

malcolm061990 avatar Nov 08 '21 10:11 malcolm061990

But why does it require additional disk space during download?

Currently, we use a pool of parallel goroutines; each goroutine downloads one s3://bucket-name/path/backup_name/db/table/disk_name.tar file.

S3 allows us to use the s3manager.NewDownloader helper for concurrent multipart download, but it needs a "writer" that provides a WriteAt() method, which in practice can only be implemented properly with an os.File, which allocates disk space; otherwise we would have to allocate enough memory to hold the entire archive file.
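
As an illustration (a simplified sketch, not the real implementation; streamAndUnpack and multipartToDisk are made-up names, and GetObject stands in here for GetObjectRequest):

package sketch

import (
	"archive/tar"
	"io"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// Current approach: one GET per archive returns an io.ReadCloser, so the .tar
// can be unpacked while the bytes arrive, without storing the archive locally.
func streamAndUnpack(client *s3.S3, bucket, key string) error {
	out, err := client.GetObject(&s3.GetObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		return err
	}
	defer out.Body.Close()

	tr := tar.NewReader(out.Body)
	for {
		_, err := tr.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		// ... extract each entry into the local backup folder here ...
	}
}

// Multipart approach: s3manager.Downloader writes parts out of order via
// WriteAt, so it needs an io.WriterAt -- in practice an *os.File -- which is
// why the whole archive has to land on disk (or in memory) before unpacking.
func multipartToDisk(d *s3manager.Downloader, bucket, key, dst string) error {
	f, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = d.Download(f, &s3.GetObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	return err
}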

The only approach I see here is to change the remote storage file format and either upload files directly or create archives for each data part instead of for each table.

Slach avatar Nov 08 '21 11:11 Slach

But why does it require additional disk space during download?

Currently, we use a pool of parallel goroutines; each goroutine downloads one s3://bucket-name/path/backup_name/db/table/disk_name.tar file.

S3 allows us to use the s3manager.NewDownloader helper for concurrent multipart download, but it needs a "writer" that provides a WriteAt() method, which in practice can only be implemented properly with an os.File, which allocates disk space; otherwise we would have to allocate enough memory to hold the entire archive file.

The only approach I see here is to change the remote storage file format and either upload files directly or create archives for each data part instead of for each table.

Thanks for the explanation. For sure it's not a good idea to allocate additional disk space during download.

we need to change the remote storage file format and upload files directly

Good idea, but it's not clear :) What do you mean?

malcolm061990 avatar Nov 08 '21 13:11 malcolm061990

we need to change the remote storage file format and upload files directly

Good idea, but it's not clear :) What do you mean?

Currently, each table's data creates one archive at s3://backup-bucket/path/backup_name/db/table/disk_name.archive.extension

We can try to create an archive for each data part (each system.parts element) instead of for each table; see the illustration below.
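
For example (the part names are hypothetical, just to illustrate the layout change):

today:    s3://backup-bucket/path/backup_name/db/table/default_1.tar            (one archive per table/disk)
per part: s3://backup-bucket/path/backup_name/db/table/default_all_1_1_0.tar
          s3://backup-bucket/path/backup_name/db/table/default_all_2_2_0.tar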

Slach avatar Nov 08 '21 13:11 Slach

we need to change the remote storage file format and upload files directly

Good idea, but it's not clear :) What do you mean?

Currently, each table's data creates one archive at s3://backup-bucket/path/backup_name/db/table/disk_name.archive.extension

We can try to create an archive for each data part (each system.parts element) instead of for each table

If it doesn't break anything it will be cool

malcolm061990 avatar Nov 08 '21 13:11 malcolm061990

@malcolm061990 now the 1.5.x version is released, which has upload_by_parts: true and download_by_parts: true in the general section
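
In the config format from the first message that would look roughly like this (other settings left at their defaults):

general:
  upload_by_parts: true
  download_by_parts: true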

Could you try it and compare the download benchmark? For now, I'm closing the issue due to inactivity,

but please comment on the issue if you have any information about the performance comparison.

Slach avatar Aug 26 '22 05:08 Slach