cartography feat: Add s3_object resource to the aws module

Summary

This PR adds s3_objects support to the aws module. I will be making it more complete like using all the model classes I have contructed, I also feel I can make the file naming and structure better so will be also working on that.

Checklist

Provide proof that this works (this makes reviews move faster). Please perform one or more of the following:

[x] Update/add unit or integration tests.
[ ] Include a screenshot showing what the graph looked like before and after your changes.
[ ] Include console log trace showing what happened before and after your changes.

If you are changing a node or relationship:

[x] Update the schema and readme.

If you are implementing a new intel module:

[x] Use the NodeSchema data model.

May 23 '25 12:05 AdityaPandeyCN

Ooh this one is super interesting!

@kunaals @jychp @heryxpc

May 23 '25 13:05 achantavy

@achantavy interesting but can lead to very very long runs :)

@AdityaPandeyCN you did not choose the easiest one.

Adding the DEFAULT_MAX_OBJECTS_PER_BUCKET = 10000 is a good security, but I think we want to be able to configure that using the CLI.

You will have to work on following files:

cli.py
config.py

I think we want:

a default value
0 will disable object sync

Disabling object sync could be very important in a restricted environment (if their is any risk that object name contains PII/ePHI etc ...)

(I have a doubt with 0, setting a limit to 0 is usually for unlimited, but want to try to avoid adding a --enable-object-sync)

(@achantavy feel free to comment/object those requirements)

May 23 '25 13:05 jychp

Yeah I was gonna say, this is one where we want to make it very configurable and include documentation on how to turn it off because of how much data it can process and how much time it can add to a sync

May 23 '25 13:05 achantavy

But it's crazy useful

May 23 '25 13:05 achantavy

Yeah, that will allow so many new analyses use cases

May 23 '25 13:05 jychp

Thankyou for the comments, I am thinking of adding a s3-object-limit flag to control S3 object syncing. It'll default to 10000 objects per bucket, but users can set their own limit or disable it completely (with 0) for sensitive data environments. I'll start with the CLI and config changes. What do you think?

May 23 '25 15:05 AdityaPandeyCN

I feel this is ready for review.

May 26 '25 04:05 AdityaPandeyCN

Hi @AdityaPandeyCN can your handle the merge conflict ? Sorry for the delay in PR review, was very busy recently.

Jun 15 '25 08:06 jychp

@AdityaPandeyCN let me actually have a try at this, I'll push to this branch

Jun 17 '25 23:06 achantavy

Thankyou @achantavy I have checked edits from maintainers box, Do I have to do anything more?

Jun 18 '25 02:06 AdityaPandeyCN

I have addressed your suggestions.

Jun 27 '25 06:06 AdityaPandeyCN

feat: Add s3_object resource to the aws module

Summary

Related issues or links

Checklist