[ECS] [Volumes]: Persistent volumes in EBS

Open nunofernandes opened this issue 5 years ago • 24 comments

Target:

Allow an existing EBS volume to be attached to a container running on the AWS ECS-optimized AMI, without third-party components.

Automatic migration of the EBS volume when the container starts in another AZ would be a nice feature.

nunofernandes avatar Dec 14 '18 09:12 nunofernandes

It seems you guys are working on a CSI driver for EBS[1]. If ECS added support for CSI as well that would be awesome.

The other options that I am aware of are the REX-Ray and Cloudstor Docker volume plugins, but both of those have issues with the latest-generation Nitro instances. Future development of those plugins also seems uncertain.

  1. https://github.com/kubernetes-sigs/aws-ebs-csi-driver

talawahtech avatar Dec 30 '18 16:12 talawahtech

Thanks everyone for this request. It would be really awesome if you could give us a little more detail about your need for this feature. For example, which workloads / applications that require EBS would you want to deploy on ECS? How would you imagine this working on ECS in an ideal scenario?

Akramio avatar Dec 31 '18 20:12 Akramio

Some workloads do require filesystem persistence (for example WordPress, Django sites, some web apps). It is clear that EFS could be a solution for that, but EFS is not, as of yet, available in all regions and EBS is.

Supporting this on ECS would allow a broader set of applications to run in ECS containers, since persistence could be achieved using EBS volumes.

nunofernandes avatar Dec 31 '18 23:12 nunofernandes

My two main use cases are:

  1. Applications where block-level storage is strongly recommended, e.g. Postgres, MongoDB, 3rd party apps

  2. Being able to use EBS's snapshot feature to create a consistent backup of the filesystem, as well as being able to initialize the filesystem from an existing snapshot

talawahtech avatar Jan 03 '19 05:01 talawahtech

We're currently running workloads that need to perform calculations on data in the local filesystem. Since we can't mount EBS/EFS, we're forced to explicitly download all of the data from S3 to the filesystem at boot time. The amount of data we have to download increases linearly with the scale our customers require, which is significantly increasing our boot time. By switching to EBS, we would "skip" the data download step and take advantage of lazy data loading from S3 and perhaps do some upfront cache warming.

recursivefunk avatar Jan 04 '19 03:01 recursivefunk

@recursivefunk that one can be easily fixed by having a storage container whose volume you mount in the other containers
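
One way to read this suggestion is the `volumesFrom` setting of an ECS task definition. The following is only a sketch under that assumption: the container names, image names, memory sizes, and the `/data` path are hypothetical, and the function merely builds the dictionary that would be passed to a task definition registration call.

```python
def build_task_definition(family: str, data_image: str, app_image: str) -> dict:
    """Build a task definition payload with a storage container (hypothetical names)."""
    return {
        "family": family,
        # A named volume shared by the containers in this task.
        "volumes": [{"name": "dataset"}],
        "containerDefinitions": [
            {
                # The storage container owns the volume and populates it.
                "name": "storage",
                "image": data_image,
                "essential": False,
                "memory": 128,
                "mountPoints": [
                    {"sourceVolume": "dataset", "containerPath": "/data"}
                ],
            },
            {
                # The application container mounts everything the storage
                # container mounts, read-only.
                "name": "app",
                "image": app_image,
                "essential": True,
                "memory": 512,
                "volumesFrom": [
                    {"sourceContainer": "storage", "readOnly": True}
                ],
            },
        ],
    }
```

Note that the data still has to be loaded into the volume once, so this reduces repeated downloads rather than removing the download step.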

FernandoMiguel avatar Jan 04 '19 06:01 FernandoMiguel

@FernandoMiguel interesting. If I understand correctly, there would still need to be a download step to load the data in a volume, but it would only happen once, and other ECS tasks could point to it. Not quite the instant S3 <-> Filesystem sync, but certainly a huge improvement. I didn't see anything about caching, either. I don't want to derail this thread 😅, happy to continue this convo elsewhere if you're up to it!

recursivefunk avatar Jan 04 '19 16:01 recursivefunk

@recursivefunk ping me on one of the many slacks about aws (i'm in most) as Fernando (case sensitive :) ) i'll show you examples.

FernandoMiguel avatar Jan 04 '19 16:01 FernandoMiguel

We need this as well. Our use case: we're running Hybris in Docker containers and want to speed up the startup time of the dev and test environments. During the build process, we want to spin up a temporary stack that runs the Hybris URS and the Solr index jobs so that all data is available for the current build. We will then take a snapshot of both the Solr and MySQL EBS disks and tag them with the build number.

When we start up the dev or test environment, we want to be able to provide these snapshots based on the required build number; they will be used to create a persistent EBS volume for the MySQL container and the Solr container. This should drastically speed up the startup times of our environments. We used to run Solr with an EFS-backed mount for the cores, but unfortunately we ran into issues with the maximum number of file locks. We have too many cores in Solr to be able to host it on EFS.

lotjuh avatar Jan 20 '19 19:01 lotjuh

Thanks everyone for your feedback and use-cases.

@nunofernandes thanks for your use-case. Can you provide a little more detail on why you would like to use "an existing EBS volume" (vs having EBS dynamically create a new volume and attach it, potentially based on a snapshot)? For example, is this because you would like to migrate a pre-existing workload from EC2?

@recursivefunk , understood. In this case, the EBS volume lifecycle is tied to that specific Task: it is created specifically for that Task, and deleted when the Task dies.

@lotjuh Am I right in saying that this sounds like a read-only use-case, where like in @recursivefunk's use-case, the EBS volume lifecycle is tied to a specific Task (it is created specifically for that Task, and deleted when the Task dies.)? Do you currently deploy all your builds with the same Task Definition? Or do you create a separate Task Definition for each build number?

Akramio avatar Jan 21 '19 21:01 Akramio

I have created issue #127 to collect feedback for stateful services. Please provide +1s and use-cases in there if you believe your workload will require each task to have a unique identifier.

Akramio avatar Jan 21 '19 21:01 Akramio

@Akramio it can be a dynamically created EBS volume. I don't need it to be an existing one.

nunofernandes avatar Jan 22 '19 09:01 nunofernandes

@Akramio This is not a read-only use-case. When starting the ECS task, it will need a persistent volume to store the data. Both Solr and MySQL will need to be able to write to it as well. It will also need to persist if the task dies, so that it can be attached again to the new task that comes in its place and no data will be lost. We do use separate task definitions per build, and these need to point to separate snapshots to be used for the persistent EBS volume.

So what we need is the option to define a snapshot ID as part of the ECS service/task, which will then be used to create a persistent EBS volume that is mounted in our Docker container.
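
To make the request concrete, here is a minimal sketch of the EC2 side of such a flow: finding the snapshot tagged with a build number and creating a volume from it. The tag key `build`, the `gp3` volume type, and the function names are assumptions for illustration; the functions only build the parameters for the EC2 `describe_snapshots` and `create_volume` calls and do not contact AWS.

```python
def snapshot_filters(build_number: str) -> list:
    """Filters for describe_snapshots: completed snapshots tagged with the build."""
    return [
        {"Name": "tag:build", "Values": [build_number]},  # hypothetical tag key
        {"Name": "status", "Values": ["completed"]},
    ]


def volume_request(snapshot_id: str, availability_zone: str, build_number: str) -> dict:
    """Parameters for create_volume: a new volume initialized from the snapshot."""
    return {
        "SnapshotId": snapshot_id,
        # The volume must be in the same AZ as the instance it attaches to.
        "AvailabilityZone": availability_zone,
        "VolumeType": "gp3",  # assumed default for the sketch
        "TagSpecifications": [
            {
                "ResourceType": "volume",
                "Tags": [{"Key": "build", "Value": build_number}],
            }
        ],
    }
```

With boto3, these would be used roughly as `ec2.describe_snapshots(Filters=snapshot_filters("1234"))` and `ec2.create_volume(**volume_request(...))`. Attaching and mounting the volume for the task is the part this issue asks ECS to handle.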

I hope this clarifies it

lotjuh avatar Jan 22 '19 09:01 lotjuh

@recursivefunk Just for me to understand better: In your scenario above are you assuming that ECS is using a pre-provisioned EBS volume that you have already created for each Task? Or that ECS is creating an EBS volume 'on the fly' when the Task is launched? If the latter, the download time is replaced with the time it takes to create and attach an EBS volume based on a snapshot.

Akramio avatar Jan 23 '19 01:01 Akramio

@Akramio Ideally, the ECS tasks would use a pre-provisioned EBS volume. We'd introduce a data processing step which creates the volume and subsequently-launched tasks would "discover" it. Hope that makes sense!

recursivefunk avatar Jan 23 '19 02:01 recursivefunk

@lotjuh and @nunofernandes would you be interested in running some of these stateful workloads you mentioned (Solr, MySQL, CMS) on Fargate?

On Fargate, you would likely not be able to run privileged containers or tune kernel-level parameters.

Akramio avatar Feb 04 '19 18:02 Akramio

@Akramio I would love to also have that feature on Fargate, but unfortunately Fargate is not yet available in the eu-west-3 region (my main region). For now I would be happy with regular ECS support.

nunofernandes avatar Feb 04 '19 19:02 nunofernandes

Having it available on Fargate is not the main priority for us either. But if it is possible on Fargate as well, we'll most certainly use it.

lotjuh avatar Feb 04 '19 19:02 lotjuh

EBS (and EFS) on Fargate are definitely priorities in our organization. We use both via EC2 instances today to handle containerized versions of various COTS apps such as Gitlab and Nexus. The automation needed to support this is clumsy and we're moving all of our stateless containers to Fargate - definitely looking forward to the day when we can do that for stateful containers as well.

jtatum avatar Feb 21 '19 22:02 jtatum

Bump for 2019!

dreamingbinary avatar Dec 11 '19 17:12 dreamingbinary

And bump for 2020 :(

ztane avatar Jan 01 '20 23:01 ztane

Another use-case I don't see mentioned here is using Fargate for isolated builds.

We use Fargate tasks to run CI builds each time a developer pushes to a branch. This gives us a consistent environment every time we run a build, and provides great isolation between builds (because the container is destroyed at the end of the build), without having to worry about a fleet of EC2s running the containers. However, some of our larger integration tests need more than 20 GB of disk space. EFS does not provide the bandwidth required for IO-intensive processes like Spark, so the only option I can see is to attach an EBS volume to the task in order to provide storage that is fast enough and large enough. This storage could die along with the task, though; there is no need for persistence.

tstibbs avatar Jun 25 '20 10:06 tstibbs

Another use-case that I have come across recently is also for Fargate (although it applies equally to tasks deployed on EC2 nodes): Kafka-related workloads that use RocksDB and need RocksDB to be able to flush from RAM to disk.

Given that EKS (non-Fargate) gets a whole driver for this, I just cannot understand why ECS is not getting as much love with similar features.

Yes, ECS and EKS aren't meant to be the same (otherwise why have both?), but it is odd to see such drive for the one and not for the other, especially as the latter is the one for which you (AWS) do not rely on a community to come up with patches, fixes and such.

Massive bump request for 2021!!!

JohnPreston avatar May 10 '21 12:05 JohnPreston

I would love to see this for my projects; it would simplify a lot of my deployments and avoid a lot of custom logic.

afretwell avatar Sep 02 '22 18:09 afretwell

This is launched now: https://aws.amazon.com/about-aws/whats-new/2024/01/amazon-ecs-fargate-integrate-ebs/

vibhav-ag avatar Jan 11 '24 23:01 vibhav-ag

> This is launched now: https://aws.amazon.com/about-aws/whats-new/2024/01/amazon-ecs-fargate-integrate-ebs/

That's awesome! Do we have, by any chance, the estimated date that this feature will be released to more regions (like sa-east-1)? Thanks!

Dudssource avatar Jan 11 '24 23:01 Dudssource

Is this feature request actually resolved? Since the volumes aren't, you know, persistent.

From the docs:

Volumes that are attached to tasks that are managed by a service are not preserved and are always deleted upon task termination.

I'd expect a persistent volume to outlive the duration of a task.
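
For context, a rough sketch of how the launched feature is configured when running a standalone task. The field names below follow my reading of the ECS `RunTask` API (`volumeConfigurations`, `managedEBSVolume`, `terminationPolicy`) and should be checked against the current documentation; the function name, the `gp3`/`ext4` defaults, and all argument values are hypothetical. The function only builds the request and does not call AWS.

```python
def run_task_request(cluster: str, task_definition: str, volume_name: str,
                     snapshot_id: str, role_arn: str, keep_volume: bool) -> dict:
    """Build RunTask parameters that attach an EBS volume created at launch."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "volumeConfigurations": [
            {
                # Must match a volume in the task definition that is
                # marked as configured at launch.
                "name": volume_name,
                "managedEBSVolume": {
                    "roleArn": role_arn,  # role ECS uses to manage the volume
                    "snapshotId": snapshot_id,
                    "volumeType": "gp3",
                    "filesystemType": "ext4",
                    # Standalone tasks can opt out of deletion; per the docs
                    # quoted above, service-managed tasks cannot.
                    "terminationPolicy": {
                        "deleteOnTermination": not keep_volume
                    },
                },
            }
        ],
    }
```

A real call would also need the launch type and network configuration, omitted here. Even with `keep_volume=True`, nothing in this request reattaches the surviving volume to a replacement task, which is the gap described in this comment.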

jtatum avatar Jan 12 '24 00:01 jtatum

> Is this feature request actually resolved? Since the volumes aren't, you know, persistent.
>
> From the docs:
>
> Volumes that are attached to tasks that are managed by a service are not preserved and are always deleted upon task termination.
>
> I'd expect a persistent volume to outlive the duration of a task.

Wow, this is utterly ridiculous. This issue is definitely not resolved.

gunzy83 avatar Jan 12 '24 00:01 gunzy83

@gunzy83 @jtatum We understand this release doesn't address all the use cases called out on this issue. We're using this issue to track use cases that require an EBS volume to be reattached to tasks managed by a service when the task is terminated or redeployed. Please add your use case to the issue if you haven't already.

vibhav-ag avatar Jan 12 '24 01:01 vibhav-ag

> This is launched now: https://aws.amazon.com/about-aws/whats-new/2024/01/amazon-ecs-fargate-integrate-ebs/
>
> That's awesome! Do we have, by any chance, the estimated date that this feature will be released to more regions (like sa-east-1)? Thanks!

This will be coming out soon in additional regions.

vibhav-ag avatar Jan 12 '24 01:01 vibhav-ag