elastic-ci-stack-for-aws

Support specifying IOPs for instance root disk

Open nijave opened this issue 4 years ago • 7 comments

Thoughts on reverting https://github.com/buildkite/elastic-ci-stack-for-aws/pull/118 (or similar) now that gp3 has made IOPs cheaper/more accessible?

I'm interested to see if adding some IOPs for big Docker builds/large repos makes a difference.

Related (interest in higher-IOPS setups): https://github.com/buildkite/elastic-ci-stack-for-aws/pull/557

nijave avatar Feb 02 '21 14:02 nijave

Hi @nijave! Would that mean parameterising the volume type as well? We're trying to avoid adding too many parameters to the stack.

Perhaps you could try it in a fork for a bit to see how it goes?

sj26 avatar Feb 03 '21 02:02 sj26

> Hi @nijave! Would that mean parameterising the volume type as well? We're trying to avoid adding too many parameters to the stack.
>
> Perhaps you could try it in a fork for a bit to see how it goes?

It looks like volume type is already parameterized here https://github.com/buildkite/elastic-ci-stack-for-aws/blob/master/templates/aws-stack.yml#L272

nijave avatar Feb 03 '21 02:02 nijave

Ah, right.

@chloeruka @pda what do you think?

sj26 avatar Feb 03 '21 02:02 sj26

Also worth noting: gp3 provisioned IOPS are significantly cheaper than the classic provisioned-IOPS offerings (io1/io2).

In us-east-1, they're $0.005/IOP/month vs $0.125-0.032 for io1/2 (so they're much more accessible than io* variants, price-wise)

nijave avatar Feb 03 '21 02:02 nijave

I wonder if we could pack the IOPS into RootVolumeType like gp3:6000. Although I think Fn::Split & Fn::Select would struggle with the optionality of the :6000 suffix :(
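One way to sidestep the optional-suffix problem would be a separate, optional parameter instead of packing both values into `RootVolumeType`. The fragment below is only a sketch of that idea: `RootVolumeIops`, `HasRootVolumeIops`, the resource name, and the `/dev/xvda` device name are assumptions for illustration, not names taken from the actual stack template. It relies on the `AWS::NoValue` pseudo parameter to omit `Iops` when nothing is requested.

```yaml
# Sketch only: RootVolumeIops, HasRootVolumeIops and AgentLaunchTemplate are
# hypothetical names, not part of the real aws-stack.yml.
Parameters:
  RootVolumeType:
    Type: String
    Default: gp3
  RootVolumeIops:
    Type: Number
    Default: 0
    Description: Provisioned IOPS for the root volume. 0 leaves Iops unset.

Conditions:
  # True only when a non-zero IOPS value was supplied.
  HasRootVolumeIops: !Not [!Equals [!Ref RootVolumeIops, 0]]

Resources:
  AgentLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        BlockDeviceMappings:
          - DeviceName: /dev/xvda  # assumed root device name
            Ebs:
              VolumeType: !Ref RootVolumeType
              # AWS::NoValue removes the property entirely, so the volume
              # type's default IOPS applies when no value is given.
              Iops: !If [HasRootVolumeIops, !Ref RootVolumeIops, !Ref "AWS::NoValue"]
```

The trade-off is the extra stack parameter, which is the thing we were hoping to avoid, but the conditional keeps existing gp2/gp3 setups unchanged when the value is left at 0.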

I would be fascinated to hear if gp3 with extra IOPS gives you a big performance boost.

pda avatar Feb 08 '21 08:02 pda

I would be interested to see the results as well. We've seen a big performance boost using local NVMe storage (#557) for Docker and Node builds.

ouranos avatar Feb 08 '21 23:02 ouranos

We'd like to try with faster I/O as well and noticed that we cannot use io1 and io2 because:

> You must use a valid fully-formed launch template. The parameter iops must be specified for io1 volumes. (Service: AmazonAutoScaling; Status Code: 400; Error Code: ValidationError; Request ID: f5b05e62-9a5a-4196-99c4-4a2a9de183f1; Proxy: null)

So, while for gp2/3 the IOPS parameterization might be a nice-to-have, it seems like for io1/2 it is a must-have.

Maybe https://github.com/buildkite/elastic-ci-stack-for-aws/pull/557 makes this ask irrelevant though, will have to see.

jgehrcke avatar Sep 09 '21 14:09 jgehrcke