sgkit icon indicating copy to clipboard operation
sgkit copied to clipboard

Some results of Allele Frequency Distributed Benchmark

Open LiangdeLI opened this issue 3 years ago • 1 comments

Here is some results of Allele Frequency's distributed benchmark on 1000 Genome data:

AF_1kg_8x5 AF_1kg_40x1

I set up the experiment with [1,2,4,8] machines, data are stored in HDFS running on the nodes respectively, block size is 64 MB.

Data chunk settings: The default chunk size of data is 10000x1000(Variants x Samples), so basically sharding in tiles. I also do experiments on row sharding: 4000x2504, as it has the same size of each chunk with default sharding, 10000x2504 to explore how sizes of chunks affect runtime. I tried 1000x2504 as well, but it seems that workers die silently in this case. I believe Hail also do row shardings, so genotype data of a specific variant from all samples appear in one chunk/partition.

Dask scheduler: Only the gray lines in the plots, "Single Node Average", uses single machine scheduler, this is data from single node experiment to be used as a baseline in distributed experiments. I think the single machine scheduler was created in the same process as Sgkit program running, also the worker is created in that same process as well. I believe it has 1 worker 40 threads. The other experiments and runtime are from distributed scheduler.

Dask workers settings: Because each machine I'm using have 40 vCores, I primarily tested two settings:

  1. 8 workers on each machine, each worker has 5 threads. This is the auto mode configuration (dask-worker --nprocs auto). Note this is not the default mode, the default is 1 worker 40 threads, but will run endlessly on 1 or 2 nodes.
  2. 40 workers on each machine, each worker has 1 threads. Because this setting seems perform well in 1/2 nodes

The following plot is performance with maximum computation resource I get (8 nodes, 320 vCores, 160 physical cores): AF_8nodes_diff_shard

LiangdeLI avatar Feb 16 '22 22:02 LiangdeLI

Very nice @LiangdeLI, this is an informative analysis!

jeromekelleher avatar Feb 17 '22 20:02 jeromekelleher