aws: add i8g instance type

Open syuu1228 opened this issue 8 months ago • 2 comments

Add preset I/O parameters for i8g to scylla_cloud_io_setup, and add i8g to the supported instance types in the aws_instance class.

All preset values were measured with iotune on the target instances.

Measurement environment details:

  • Measured on i8g.* instances with the latest Ubuntu 24.04 LTS AMI (we cannot use the Scylla AMI because we want to measure single-drive performance)
  • Measured a single local SSD without RAID0, because the scylla_cloud_io_setup script simulates RAID0 performance from single-drive performance
  • Used iotune for the measurement, running it 3 times for each instance size and averaging the results
  • Automated the measurement with a script: https://github.com/syuu1228/ec2_run_script
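
The averaging step described in the list above can be sketched as follows. This is only an illustration, not the code from ec2_run_script: the dictionary keys and the `average_runs` helper are hypothetical names, and rounding to an integer is an assumption. The input values are the three i8g.large runs from the raw output in this description.

```python
from statistics import mean

# Three iotune runs for i8g.large, copied from the raw output in this PR.
# The key names are hypothetical; they are not taken from the PR's code.
runs = [
    {"write_bandwidth_mb": 399, "read_bandwidth_mb": 577, "write_iops": 45598, "read_iops": 82269},
    {"write_bandwidth_mb": 399, "read_bandwidth_mb": 577, "write_iops": 45601, "read_iops": 82266},
    {"write_bandwidth_mb": 399, "read_bandwidth_mb": 577, "write_iops": 45596, "read_iops": 82269},
]


def average_runs(runs):
    """Return the per-metric mean across runs, rounded to an integer (assumption)."""
    return {key: round(mean(run[key] for run in runs)) for key in runs[0]}


averaged = average_runs(runs)
print(averaged)
```

Whether the actual presets were rounded, truncated, or kept as floats is not stated in this PR.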

Raw iotune output:

  • i8g.large (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 399 MB/s (deviation 14%) Measuring sequential read bandwidth: 577 MB/s (deviation 43%) Measuring random write IOPS: 45598 IOPS (deviation 31%) Measuring random read IOPS: 82269 IOPS (deviation 29%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.large (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 399 MB/s (deviation 14%) Measuring sequential read bandwidth: 577 MB/s (deviation 43%) Measuring random write IOPS: 45601 IOPS (deviation 31%) Measuring random read IOPS: 82266 IOPS (deviation 29%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.large (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 399 MB/s (deviation 14%) Measuring sequential read bandwidth: 577 MB/s (deviation 43%) Measuring random write IOPS: 45596 IOPS (deviation 31%) Measuring random read IOPS: 82269 IOPS (deviation 29%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.xlarge (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 799 MB/s (deviation 14%) Measuring sequential read bandwidth: 1160 MB/s (deviation 43%) Measuring random write IOPS: 90444 IOPS (deviation 19%) Measuring random read IOPS: 164432 IOPS (deviation 29%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.xlarge (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 799 MB/s (deviation 14%) Measuring sequential read bandwidth: 1160 MB/s (deviation 43%) Measuring random write IOPS: 90490 IOPS (deviation 20%) Measuring random read IOPS: 164456 IOPS (deviation 29%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.xlarge (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 799 MB/s (deviation 14%) Measuring sequential read bandwidth: 1160 MB/s (deviation 43%) Measuring random write IOPS: 90405 IOPS (deviation 19%) Measuring random read IOPS: 164439 IOPS (deviation 29%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.2xlarge (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 1609 MB/s (deviation 14%) Measuring sequential read bandwidth: 2325 MB/s (deviation 41%) Measuring random write IOPS: 135841 IOPS Measuring random read IOPS: 328308 IOPS (deviation 23%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.2xlarge (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 1609 MB/s (deviation 14%) Measuring sequential read bandwidth: 2325 MB/s (deviation 41%) Measuring random write IOPS: 133803 IOPS (deviation 3%) Measuring random read IOPS: 328216 IOPS (deviation 22%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.2xlarge (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 1609 MB/s (deviation 14%) Measuring sequential read bandwidth: 2325 MB/s (deviation 41%) Measuring random write IOPS: 133992 IOPS Measuring random read IOPS: 328302 IOPS (deviation 23%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.4xlarge (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3260 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 141234 IOPS Measuring random read IOPS: 554281 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.4xlarge (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 140699 IOPS Measuring random read IOPS: 552544 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.4xlarge (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 140510 IOPS Measuring random read IOPS: 553998 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.8xlarge (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 8%) Measuring sequential read bandwidth: 4574 MB/s (deviation 26%) Measuring random write IOPS: 139456 IOPS Measuring random read IOPS: 528086 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.8xlarge (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 8%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 140425 IOPS Measuring random read IOPS: 527157 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.8xlarge (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 9%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 139128 IOPS Measuring random read IOPS: 527045 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.12xlarge (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3263 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 139198 IOPS Measuring random read IOPS: 512264 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.12xlarge (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 11%) Measuring sequential read bandwidth: 4566 MB/s (deviation 25%) Measuring random write IOPS: 139031 IOPS Measuring random read IOPS: 509359 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.12xlarge (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 139333 IOPS Measuring random read IOPS: 512320 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.16xlarge (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 140104 IOPS Measuring random read IOPS: 506129 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.16xlarge (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 139681 IOPS Measuring random read IOPS: 504401 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.16xlarge (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3261 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 140213 IOPS Measuring random read IOPS: 506485 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.24xlarge (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3265 MB/s (deviation 14%) Measuring sequential read bandwidth: 4574 MB/s (deviation 26%) Measuring random write IOPS: 139638 IOPS Measuring random read IOPS: 517821 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.24xlarge (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3262 MB/s (deviation 10%) Measuring sequential read bandwidth: 4574 MB/s (deviation 26%) Measuring random write IOPS: 140312 IOPS Measuring random read IOPS: 517178 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.24xlarge (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3265 MB/s (deviation 14%) Measuring sequential read bandwidth: 4574 MB/s (deviation 26%) Measuring random write IOPS: 140032 IOPS Measuring random read IOPS: 519382 IOPS (deviation 3%) Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.metal-24xl (0/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3260 MB/s (deviation 10%) Measuring sequential read bandwidth: 4569 MB/s (deviation 25%) Measuring random write IOPS: 138682 IOPS Measuring random read IOPS: 513462 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.metal-24xl (1/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3263 MB/s (deviation 10%) Measuring sequential read bandwidth: 4575 MB/s (deviation 26%) Measuring random write IOPS: 139715 IOPS Measuring random read IOPS: 510749 IOPS Writing result to /etc/scylla.d/io_properties.yaml

  • i8g.metal-24xl (2/3) Starting Evaluation. This may take a while... Measuring sequential write bandwidth: 3260 MB/s (deviation 9%) Measuring sequential read bandwidth: 4555 MB/s (deviation 24%) Measuring random write IOPS: 139076 IOPS Measuring random read IOPS: 511639 IOPS Writing result to /etc/scylla.d/io_properties.yaml
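
For anyone who wants to process the output above, here is a minimal parsing sketch. It assumes the log lines have exactly the shape shown in this description (the deviation suffix is optional, since some runs omit it). The `MEASUREMENT` pattern and `parse_iotune` helper are hypothetical names and are not part of iotune or ec2_run_script.

```python
import re

# One iotune measurement, e.g.
# "Measuring random write IOPS: 45598 IOPS (deviation 31%)".
# The "(deviation N%)" suffix is optional because some runs omit it.
MEASUREMENT = re.compile(
    r"Measuring ([\w ]+?): (\d+) (MB/s|IOPS)(?: \(deviation (\d+)%\))?"
)


def parse_iotune(text):
    """Map each measurement name to its value, unit, and deviation (or None)."""
    results = {}
    for match in MEASUREMENT.finditer(text):
        name, value, unit, deviation = match.groups()
        results[name] = {
            "value": int(value),
            "unit": unit,
            "deviation_pct": int(deviation) if deviation else None,
        }
    return results


# The i8g.2xlarge (0/3) run from the list above.
sample = (
    "Starting Evaluation. This may take a while... "
    "Measuring sequential write bandwidth: 1609 MB/s (deviation 14%) "
    "Measuring sequential read bandwidth: 2325 MB/s (deviation 41%) "
    "Measuring random write IOPS: 135841 IOPS "
    "Measuring random read IOPS: 328308 IOPS (deviation 23%) "
    "Writing result to /etc/scylla.d/io_properties.yaml"
)
parsed = parse_iotune(sample)
print(parsed["random write IOPS"])
```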

Fixes #560

syuu1228 avatar Mar 19 '25 20:03 syuu1228

@roydahan according to @syuu1228 this one can be merged, is that ok from your POV?

yaronkaikov avatar Apr 07 '25 10:04 yaronkaikov

These are the specs for i8g published by AWS:

Instance type | Instance store volumes | Instance store type | 100% random read IOPS / write IOPS | Needs initialization | TRIM support
i8g.large | 1 x 468 GB | NVMe SSD | 75,000 / 41,250 | | ✓
i8g.xlarge | 1 x 937 GB | NVMe SSD | 150,000 / 82,500 | | ✓
i8g.2xlarge | 1 x 1875 GB | NVMe SSD | 300,000 / 165,000 | | ✓
i8g.4xlarge | 1 x 3750 GB | NVMe SSD | 600,000 / 330,000 | | ✓
i8g.8xlarge | 2 x 3750 GB | NVMe SSD | 1,200,000 / 660,000 | | ✓
i8g.12xlarge | 3 x 3750 GB | NVMe SSD | 1,800,000 / 990,000 | | ✓
i8g.16xlarge | 4 x 3750 GB | NVMe SSD | 2,400,000 / 1,320,000 | | ✓
i8g.24xlarge | 6 x 3750 GB | NVMe SSD | 3,600,000 / 1,980,000 | | ✓
i8g.metal-24xl | 6 x 3750 GB | NVMe SSD | 3,600,000 / 1,980,000 | | ✓

It seems our numbers diverge from these once we reach 4xlarge, for read IOPS but especially for write IOPS, which appears to hit a limit. Please open an issue about it in seastar. I don't think we can merge it like this.

roydahan avatar Apr 07 '25 16:04 roydahan

Reviving this issue. Where are we with this item?

mykaul avatar May 20 '25 09:05 mykaul

@syuu1228 please handle this, let's try to get it in by the end of this week

yaronkaikov avatar May 28 '25 06:05 yaronkaikov

@roydahan These numbers are the result of running iotune on a RAID0 volume using the Scylla 2025.1 AMI (the current presets are measured per single disk, not on RAID0). They are lower than the AWS numbers, but they scale with the number of disks. Do you think these numbers are good, or are they still too low?

This also relates to the discussion in #608 about whether the AWS numbers are the sum of all drives.

instance_type read_iops read_bandwidth write_iops write_bandwidth
i8g.large 82269 605741802 45694 419297536
i8g.xlarge 164459 1216837674 91239 835287338
i8g.2xlarge 328373 2438733568 180457 1688050645
i8g.4xlarge 561816 4797149696 247491 3423312981
i8g.8xlarge 981711 8585749333 450101 6816110762
i8g.12xlarge 1427084 8585812650 630933 8586023082
i8g.16xlarge 1845487 8585807360 813391 8586123605
i8g.24xlarge 2689074 8585754965 1145914 8586258090
i8g.metal-24xl 2710970 8585782784 1142926 8575638698
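
One way to check how the RAID0 numbers above scale with the number of disks is to divide each measurement by the instance's disk count. The sketch below does that for the multi-disk sizes plus i8g.4xlarge as the single-disk baseline. Disk counts are taken from the AWS table quoted earlier in this thread; the `per_disk_iops` helper is a hypothetical name, and integer division is used only to keep the output readable.

```python
# instance type -> (disk count from the AWS table, RAID0 read_iops, RAID0 write_iops)
raid0 = {
    "i8g.4xlarge": (1, 561816, 247491),
    "i8g.8xlarge": (2, 981711, 450101),
    "i8g.12xlarge": (3, 1427084, 630933),
    "i8g.16xlarge": (4, 1845487, 813391),
    "i8g.24xlarge": (6, 2689074, 1145914),
}


def per_disk_iops(table):
    """Divide the RAID0 read and write IOPS by the number of disks."""
    return {
        name: (read_iops // disks, write_iops // disks)
        for name, (disks, read_iops, write_iops) in table.items()
    }


per_disk = per_disk_iops(raid0)
for name, (read_iops, write_iops) in per_disk.items():
    print(f"{name}: read {read_iops} IOPS/disk, write {write_iops} IOPS/disk")
```

With these inputs the per-disk figures decline gradually as disks are added (for example, read IOPS per disk goes from 561816 on one disk to 448179 on six), so the scaling is close to linear but not exactly linear.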

syuu1228 avatar May 29 '25 08:05 syuu1228

@roydahan BTW, measurements on RAID0 are considered closer to actual performance. Should we switch all preset values to RAID0-based measurements instead of single-drive ones? That would of course require re-measuring everything.

syuu1228 avatar May 29 '25 09:05 syuu1228

The RAID0 measurements make sense to me. I didn't review every single number, but they appear to scale linearly as intended.

Correct me if I'm wrong, but I think we used to measure single-disk performance only to later multiply it by the number of disks, and that is how we got linear scalability. Otherwise, I don't understand where the expectation that each disk scales linearly with instance size comes from.

So, in short, it makes sense to me either to measure the RAID0 volume or to measure a single disk and multiply by the number of disks, but the former should be more accurate and would tell us if there is an issue with a specific instance type.

roydahan avatar May 29 '25 09:05 roydahan

@avikivity @roydahan Can we merge this now?

syuu1228 avatar Jun 02 '25 13:06 syuu1228

Created a side-by-side comparison (generated with ChatGPT):

i8g instance performance comparison

instance_type read_iops vendor_read_iops read_delta write_iops vendor_write_iops write_delta
i8g.large 82K 75K +10% 46K 41K +11%
i8g.xlarge 164K 150K +10% 91K 82K +11%
i8g.2xlarge 328K 300K +9% 180K 165K +9%
i8g.4xlarge 562K 600K -6% 247K 330K -25%
i8g.8xlarge 982K 1200K -18% 450K 660K -32%
i8g.12xlarge 1427K 1800K -21% 631K 990K -36%
i8g.16xlarge 1845K 2400K -23% 813K 1320K -38%
i8g.24xlarge 2689K 3600K -25% 1146K 1980K -42%
i8g.metal-24xl 2711K 3600K -25% 1143K 1980K -42%
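
The delta columns in the table above appear to be the relative difference between the RAID0 measurement and the AWS figure. A minimal sketch that reproduces a few rows from the unrounded numbers posted earlier in this thread follows; the `delta_pct` helper is a hypothetical name, and rounding to a whole percent is an assumption about how the table was produced.

```python
def delta_pct(measured, vendor):
    """Relative difference of measured vs. vendor IOPS, as a whole percent."""
    return round((measured - vendor) / vendor * 100)


# (measured read, vendor read, measured write, vendor write) from this thread
rows = {
    "i8g.large": (82269, 75_000, 45694, 41_250),
    "i8g.4xlarge": (561816, 600_000, 247491, 330_000),
    "i8g.8xlarge": (981711, 1_200_000, 450101, 660_000),
    "i8g.24xlarge": (2689074, 3_600_000, 1145914, 1_980_000),
}

deltas = {
    name: (delta_pct(read, vendor_read), delta_pct(write, vendor_write))
    for name, (read, vendor_read, write, vendor_write) in rows.items()
}
for name, (read_delta, write_delta) in deltas.items():
    print(f"{name}: read {read_delta:+d}%, write {write_delta:+d}%")
```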

And this is with the data from above where calc_iops is single_disk_iops * nr_disks:

instance_type read_iops vendor_read_iops read_delta calc_read_iops write_iops vendor_write_iops write_delta calc_write_iops
i8g.large 82K 75K +10% 78K 46K 41K +11% 44K
i8g.xlarge 164K 150K +10% 157K 91K 82K +11% 88K
i8g.2xlarge 328K 300K +9% 313K 180K 165K +9% 175K
i8g.4xlarge 562K 600K -6% 622K 247K 330K -25% 348K
i8g.8xlarge 982K 1200K -18% 1242K 450K 660K -32% 695K
i8g.12xlarge 1427K 1800K -21% 1863K 631K 990K -36% 1043K
i8g.16xlarge 1845K 2400K -23% 2484K 813K 1320K -38% 1390K
i8g.24xlarge 2689K 3600K -25% 3733K 1146K 1980K -42% 2086K
i8g.metal-24xl 2711K 3600K -25% 3735K 1143K 1980K -42% 2086K

roydahan avatar Jun 04 '25 00:06 roydahan

Now we need to decide whether we're happy with the numbers iotune measures on RAID0 versus the theoretical numbers that come from the single-disk measurement * nr_disks. We can see that as the instance scales from 4x to 8x the numbers become less and less aligned with the published numbers.

@avikivity any suggestion what to do with it? Could it be a problem of iotune itself?

roydahan avatar Jun 04 '25 00:06 roydahan

For that we need to compare with a long running fio. I wouldn't be surprised if they use some hero numbers, but we need to see.

mykaul avatar Jun 08 '25 15:06 mykaul

Closing in favor of https://github.com/scylladb/scylla-machine-image/pull/796

yaronkaikov avatar Sep 15 '25 08:09 yaronkaikov