feat: Add s3_object resource to the aws module
Summary
This PR adds s3_objects support to the aws module. I will be making it more complete like using all the model classes I have contructed, I also feel I can make the file naming and structure better so will be also working on that.
Related issues or links
Include links to relevant issues or other pages.
- https://github.com/cartography-cncf/cartography/issues/1552
Checklist
Provide proof that this works (this makes reviews move faster). Please perform one or more of the following:
- [x] Update/add unit or integration tests.
- [ ] Include a screenshot showing what the graph looked like before and after your changes.
- [ ] Include console log trace showing what happened before and after your changes.
If you are changing a node or relationship:
If you are implementing a new intel module:
- [x] Use the NodeSchema data model.
Ooh this one is super interesting!
@kunaals @jychp @heryxpc
@achantavy interesting but can lead to very very long runs :)
@AdityaPandeyCN you did not choose the easiest one.
Adding the DEFAULT_MAX_OBJECTS_PER_BUCKET = 10000 is a good security, but I think we want to be able to configure that using the CLI.
You will have to work on following files:
- cli.py
- config.py
I think we want:
- a default value
0will disable object sync
Disabling object sync could be very important in a restricted environment (if their is any risk that object name contains PII/ePHI etc ...)
(I have a doubt with 0, setting a limit to 0 is usually for unlimited, but want to try to avoid adding a --enable-object-sync)
(@achantavy feel free to comment/object those requirements)
Yeah I was gonna say, this is one where we want to make it very configurable and include documentation on how to turn it off because of how much data it can process and how much time it can add to a sync
But it's crazy useful
Yeah, that will allow so many new analyses use cases
Thankyou for the comments, I am thinking of adding a s3-object-limit flag to control S3 object syncing. It'll default to 10000 objects per bucket, but users can set their own limit or disable it completely (with 0) for sensitive data environments. I'll start with the CLI and config changes. What do you think?
I feel this is ready for review.
Hi @AdityaPandeyCN can your handle the merge conflict ? Sorry for the delay in PR review, was very busy recently.
@AdityaPandeyCN let me actually have a try at this, I'll push to this branch
Thankyou @achantavy I have checked edits from maintainers box, Do I have to do anything more?
I have addressed your suggestions.