community.aws
s3_lifecycle - add abort_incomplete_multipart_upload and expired_object_delete_marker parameters
SUMMARY
I would like to apply a bucket lifecycle configuration with ~s3_bucket~ s3_lifecycle.
My specific use case is configuring the bucket to abort abandoned multi-part uploads. (Otherwise you can end up paying for storage of data which is not visible in the bucket.)
ISSUE TYPE
- Feature Idea
COMPONENT NAME
~s3_bucket~ s3_lifecycle
ADDITIONAL INFORMATION
The bucket lifecycle policy itself is supposed to be XML in the underlying API. Something like this is what I have in mind:
- amazon.aws.s3_bucket:
    name: mys3bucket
    state: present
    lifecycle_configuration:
      rules:
        - id: sample-rule
          status: Enabled
          AbortIncompleteMultipartUpload:
            DaysAfterInitiation: 7
I'm unsure whether the module should use the same capitalisation as the API (AbortIncompleteMultipartUpload), or if we should rename to be consistent with Ansible (abort_incomplete_multipart_upload) and then convert it.
I suspect using CamelCase would be better.
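To put the two options side by side (purely illustrative, nothing is implemented yet):

# Option 1: keep the API's capitalisation
AbortIncompleteMultipartUpload:
  DaysAfterInitiation: 7

# Option 2: follow Ansible naming conventions and convert internally
abort_incomplete_multipart_upload:
  days_after_initiation: 7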
Files identified in the description:
- [plugins/modules/s3_bucket.py](https://github.com/['ansible-collections/amazon.aws', 'ansible-collections/community.aws', 'ansible-collections/community.vmware']/blob/main/plugins/modules/s3_bucket.py)
If these files are inaccurate, please update the component name section of the description or use the !component bot command.
cc @jillr @s-hertel @tremble @wimnat click here for bot help
Note: ansibullbot seems a bit broken. That list of links isn't rendering as expected. There's possibly some issue with the markdown. How do we get ansibullbot fixed?
Hi @mdavis-xyz,
This is a good idea, but we would suggest preparing a new dedicated module, s3_bucket_lifecycle_configuration. Since it's a new module, it should be created first in the community.aws collection.
Files identified in the description:
- [plugins/modules/s3_bucket_notification.py](https://github.com/['ansible-collections/amazon.aws', 'ansible-collections/community.aws', 'ansible-collections/community.vmware']/blob/main/plugins/modules/s3_bucket_notification.py)
If these files are inaccurate, please update the component name section of the description or use the !component bot command.
cc @aljazkosir @miha-plesko @xlab-si click here for bot help
Ok, I can try to make a new module.
What's the process long term? Will this remain a separate module forever? Or is that just a short-term location to ensure the code is correct, with a view to later moving it to the amazon.aws collection, either as a standalone module or as part of the bucket module?
@mdavis-xyz We follow the UNIX philosophy - "Make each program do one thing well." Since we are managing the lifecycle configuration of an S3 bucket, I feel it makes sense to have a separate module that is easy to maintain and future-proof.
As a short-term goal, this new module will stay with the "community" collection. Once that matures, we can promote it to the "amazon.aws" collection (which is supported content).
Hope this makes sense.
I'm a bit confused by this request, since there is already the community.aws.s3_lifecycle module doing this. The only thing missing in that module is the AbortIncompleteMultipartUpload and ExpiredObjectDeleteMarker settings. This was already mentioned in https://github.com/ansible/ansible/issues/43434 prior to the migration.
To be able to manage this in the meantime, I've created a couple of tasks that call the aws-cli. Something I've noticed and that I think is important to mention here: the AbortIncompleteMultipartUpload and ExpiredObjectDeleteMarker settings cannot be managed separately. There can only be one lifecycle configuration for a given prefix, and those two settings are part of a lifecycle configuration. My original idea was to create a separate lifecycle for those two settings, but I was not able to for this reason.
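A rough sketch of that kind of workaround (the bucket name, rule ID, path, and day count here are illustrative placeholders, not my exact tasks):

- name: Write a lifecycle configuration that the modules cannot express yet
  ansible.builtin.copy:
    dest: /tmp/lifecycle.json
    content: |
      {
        "Rules": [
          {
            "ID": "cleanup-rule",
            "Filter": {"Prefix": ""},
            "Status": "Enabled",
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            "Expiration": {"ExpiredObjectDeleteMarker": true}
          }
        ]
      }

- name: Apply it with the aws-cli (not idempotent, unlike a proper module)
  ansible.builtin.command: >-
    aws s3api put-bucket-lifecycle-configuration
    --bucket {{ bucket_name }}
    --lifecycle-configuration file:///tmp/lifecycle.json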
Should we adjust this issue title and description to reflect the situation, or should I create a new issue that better describes this? Let me know what you think is better.
cc @markuman click here for bot help
I fully agree @pgrenaud, thx for summarizing and clarifying the situation.
- https://docs.ansible.com/ansible/latest/collections/community/aws/s3_lifecycle_module.html
- https://github.com/ansible-collections/community.aws/blob/main/plugins/modules/s3_lifecycle.py
- https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.put_bucket_lifecycle
I guess expanding the existing module with two more parameters is not a big task.
Is either of you (@pgrenaud @mdavis-xyz) interested in providing a PR?
How should we structure the interface?
Like this?
- name: Create a lifecycle policy, with abort_incomplete_multipart_upload (idempotency)
  s3_lifecycle:
    name: '{{ bucket_name }}'
    abort_incomplete_multipart_upload:
      days_after_initiation: 1
    prefix: /something
Is it too verbose to have:
abort_incomplete_multipart_upload:
  days_after_initiation: 1
(That matches the underlying API.)
Or should I make it like:
abort_incomplete_multipart_upload_days: 1
i.e. do I try to simplify the API in an opinionated way, which is less verbose, or do I make it match the underlying API as closely as possible, which makes adapting to new features in the API easier?
For expiry, the existing expiration_days and expiration_date args are top level, whereas the API nests them, along with the expired object delete marker.
So should I promote the object delete marker to the top level? What do I call it? expiration_expired_object_delete_marker matches the naming of the existing arguments, but sounds a bit verbose. expire_object_delete_marker sounds more sensible, although that uses the word expire when the underlying API is about expiration and expired. Is that ok?
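To make it concrete, the flat option would look roughly like this in a playbook (none of these parameters exist yet, the names are just my proposal):

- community.aws.s3_lifecycle:
    name: '{{ bucket_name }}'
    prefix: /logs/
    status: enabled
    state: present
    abort_incomplete_multipart_upload_days: 7
    expire_object_delete_marker: true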
The API doesn't let you set the expired object delete marker together with expiration days. Should I try to implement that exclusion behavior when validating input arguments, or just pass it through and rely on AWS to reject the bad input?
Also, when submitting a PR, do I need to update a changelog file somewhere?
How should we structure the interface?
Imo, keeping the API structure makes it easier to maintain the code alongside boto3 and to add more features.
Maybe more features will come under the abort_incomplete_multipart_upload key, but maybe not. We don't know :)
expiration_expired_object_delete_marker matches the naming of the existing arguments, but sounds a bit verbose. expire_object_delete_marker sounds more sensible
I agree. Optionally you can keep the original key as an alias parameter.
The API doesn't let you do expired object delete marker, as well as expire days. Should I try to implement that exclusion behavior when validating input arguments? Or just pass it through and rely on AWS to reject the bad input?
If you already see this use case / these errors, then maybe yes. But it must also be documented.
Also, when submitting a PR, do I need to update a changelog file somewhere?
Yes, when adding new features, you also need to add a changelogs/fragments/ file.
See: https://docs.ansible.com/ansible/latest/community/development_process.html#creating-a-changelog-fragment
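A fragment for this kind of change could look roughly like this (the filename and wording are just an example):

# changelogs/fragments/s3_lifecycle_add_lifecycle_options.yml
minor_changes:
  - s3_lifecycle - add support for aborting incomplete multipart uploads and expiring object delete markers.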
If we want the multipart abort to match the API, should we also change expiration to match the API?
Instead of:
expiration_days:
expiration_date:
expire_object_delete_marker:
It would match more closely if we did:
expiration:
  date:
  days:
  expired_object_delete_marker:
That requires care to make the change backwards compatible. Currently you can only specify one of these 3 anyway. So should I keep the existing pattern for expiration of not matching the API, and add abort in a way that does match the API?
That requires care to make the change backwards compatible.
hm, what did you mean? Atm s3_bucket does not support any expire_ parameters.
And I noticed that the module has moved to the amazon.aws collection, so the issue will be moved to https://github.com/ansible-collections/amazon.aws
@markuman The slightly confusing thing here is that s3_lifecycle (which manages the lifecycle configuration of s3 buckets) lives in community.aws, whereas s3_bucket (which creates the buckets) lives in amazon.aws.
In my opinion it's better for us to split out some of this configuration: massive monolithic modules become very difficult to properly test, and as such tend to be less reliable than modules with the functionality split out. This becomes even more pronounced when you can apply multiple lifecycle rules based on the folders within a bucket.
At some point Amazon will probably replace boto3 with something incompatible; migrating small modules with a well-defined scope is massively simpler than migrating a monolith that does "all things s3" (see also iam, ec2, etc.).
Ah oh boy. Blame me. 🙈
I was confused because the initial post still included s3_bucket. But it's right: as mentioned above, the functionality must be included in s3_lifecycle, so the issue must be moved back to community.aws. Sorry :D
If we want the multipart abort to match the API, should we also change expiration to match the API?
Instead of:
expiration_days:
expiration_date:
expire_object_delete_marker:
It would match more closely if we did:
expiration:
  date:
  days:
  expired_object_delete_marker:
That requires care to make the change backwards compatible. Currently you can only specify one of these 3 anyway. So should I keep the existing pattern for expiration of not matching the API, and add abort in a way that does match the API?
Imo, to not break backwards compatibility, it's better to keep the existing structure.
There are two options:
- Keep the current structure
- Deprecate the current structure and use something closer to the API
I would actually lean towards 2 as it feels slightly cleaner, but I would be tempted to do something like
expiration:
  date:
  days:
  delete_marker: (with an alias of expired_object_delete_marker)
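As a usage sketch of that structure (nothing implemented yet, names are just the proposal above):

- community.aws.s3_lifecycle:
    name: '{{ bucket_name }}'
    prefix: /logs/
    status: enabled
    state: present
    expiration:
      delete_marker: true   # the API does not allow combining this with days/date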
I've submitted a pull request with two new top-level arguments. I'm not sure how a decision like this is usually made. It sounds like there are two maintainers with differing opinions. Just tell me which approach you want and I'll change my PR if needed.
If you want to change the existing expiration fields, please name another module that uses an alias with a nested parameter, so I can go see how that can be done.
I'm fine with both :)