
Splitting memory dump before S3 upload

Open dfir-man131 opened this issue 3 years ago

Would there be a way for you all to split the resulting memory dump into parts smaller than 5 GB once acquired, and then ship those parts to S3? We're running into the 5 GB limit on AWS pre-signed URL uploads while trying to ship the full memory file. The dump is being compressed, but it still exceeds the 5 GB limit. Any insight would be greatly appreciated! Thanks!

dfir-man131 avatar Jul 01 '22 19:07 dfir-man131

You should be able to use a tool like GNU coreutils' `split` to split the file prior to uploading. If you're using `split`, try the `--bytes=SIZE` argument.
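For example, a minimal sketch (the filenames and the environment variable holding the pre-signed URL are illustrative):

```sh
# Split the compressed dump into 4 GiB pieces: memory.lime.00, memory.lime.01, ...
split --bytes=4G --numeric-suffixes memory.lime.compressed memory.lime.

# Upload each piece with its own pre-signed URL; `curl --upload-file` issues a
# PUT, which matches URLs pre-signed for put_object.
curl --upload-file memory.lime.00 "$PRESIGNED_URL_00"

# Reassemble on the receiving side.
cat memory.lime.?? > memory.lime.compressed
```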

We would be open to a contribution that adds support for multi-part uploads using the AWS S3 SDK.
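For reference, here is a sketch of the multipart flow using boto3 (avml itself is Rust, so a real contribution would use an AWS SDK from Rust; the bucket, key, and chunk size below are placeholders):

```python
import boto3

def multipart_upload(path, bucket, key, chunk_size=64 * 1024 * 1024):
    """Upload `path` to S3 in parts; every part except the last must be >= 5 MiB."""
    client = boto3.client("s3")
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        with open(path, "rb") as f:
            part_number = 1
            while chunk := f.read(chunk_size):
                resp = client.upload_part(
                    Bucket=bucket, Key=key, UploadId=upload_id,
                    PartNumber=part_number, Body=chunk,
                )
                parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
                part_number += 1
        client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so incomplete parts do not accrue storage charges.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

Multipart uploads lift the single-request ceiling from 5 GB to 5 TB, since each of up to 10,000 parts can itself be up to 5 GB.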

bmc-msft avatar Jul 05 '22 20:07 bmc-msft

How about this: I PR a Python script, https://github.com/microsoft/avml/tree/main/eng/URI_list_example.py?

It would split the file into 4 GB chunks and upload each chunk to a list of S3 pre-signed URIs kept in a config.toml (see the sketch below).

This would have zero dependency on AWS S3 credentials, and you could use it to upload to any server that accepts HTTP POST.
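A minimal sketch of what such a script could look like, assuming a config.toml with a `presigned_uris` array (that key name, and the use of PUT for S3-style pre-signed URLs, are assumptions; a custom HTTP server could take POST instead):

```python
import sys
import tomllib  # Python 3.11+; older versions can use the third-party `toml` package

import requests

CHUNK_SIZE = 4 * 1024 * 1024 * 1024  # 4 GiB

def upload_chunks(dump_path: str, config_path: str) -> None:
    """Split dump_path into 4 GiB chunks, sending one chunk per pre-signed URI."""
    with open(config_path, "rb") as f:
        uris = tomllib.load(f)["presigned_uris"]  # hypothetical config key
    with open(dump_path, "rb") as f:
        for uri in uris:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            # URLs pre-signed for S3 put_object expect PUT; swap in
            # requests.post for a server that only accepts POST.
            requests.put(uri, data=chunk, timeout=300).raise_for_status()

if __name__ == "__main__":
    upload_chunks(sys.argv[1], sys.argv[2])
```

Note that this reads each 4 GiB chunk into memory before sending it, which is exactly the RAM footprint question raised next.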

An open research question I am working on is optimizing the RAM/CPU footprint and the compressed chunk size for the upload.

chadbrewbaker avatar Aug 10 '23 16:08 chadbrewbaker

The difficulty with including such a script is maintaining it going forward.

We are not likely to implement this request at this time, though we would be receptive to a PR that implements this as a compile-time feature.
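Since avml is a Rust project, a compile-time feature here presumably means a Cargo feature flag. A sketch of the shape, with illustrative names that are not avml's actual manifest:

```toml
# Cargo.toml (illustrative names, not avml's real manifest)
[features]
s3-multipart = ["dep:aws-sdk-s3", "dep:tokio"]

[dependencies]
aws-sdk-s3 = { version = "1", optional = true }
tokio = { version = "1", features = ["rt"], optional = true }
```

The upload code path would then be gated behind `#[cfg(feature = "s3-multipart")]` so that default builds pay no cost for it.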

demoray avatar Jan 22 '24 16:01 demoray