onestop
Add support for handling delete events in s3
Summary
| As a | catalog |
| I want to be able to | record if a file is deleted from an s3 bucket |
| So that I can | keep IM up to date with file locations |
Tasks
In the onestop-clients/onestop-python-client codebase, review: S3MessageAdapter, SQSConsumer, launch-e2e
- [ ] Design discussion on how to implement
  - Decision to use ElasticSearch for the lookup
- [ ] Enable s3 delete event propagation (S3 event properties)
- [ ] Add function to handle delete events coming from s3
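For the "enable s3 delete event propagation" task, a minimal sketch of the bucket notification settings might look like the following. It assumes events are routed through an SNS topic, as the sample payload in this issue suggests. The function name `build_notification_config` and its arguments are hypothetical, and keeping `s3:ObjectCreated:*` alongside `s3:ObjectRemoved:*` is an assumption about what the bucket already publishes.

```python
from typing import Dict, List


def build_notification_config(topic_arn: str, config_id: str) -> Dict[str, List[dict]]:
    """Build an S3 notification configuration that includes delete events.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    return {
        "TopicConfigurations": [
            {
                "Id": config_id,
                "TopicArn": topic_arn,
                # ObjectRemoved:* covers ObjectRemoved:Delete, the event name
                # seen in the sample payload. ObjectCreated:* is assumed to be
                # wanted already and is kept so it is not dropped.
                "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder ARN and id, not real resources.
    print(build_notification_config(
        "arn:aws:sns:us-east-2:123456789012:example-topic", "example-event"))
```

With boto3, a dict like this would be passed as `NotificationConfiguration` to the S3 client's `put_bucket_notification_configuration(Bucket=...)`. That call replaces the bucket's entire existing notification configuration, so any rules already in place need to be included.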
Description/Reference
If an object is deleted from a bucket, it will trigger a message to SQS. We have access to the key of the object. We need to think about how to look up the object UUID from IM.
s3 delete sample message payload:
{ "Type": "Notification", "MessageId": "e12f0129-0236-529c-aeed-5978d181e92a", "TopicArn": "arn:aws:sns:us-east-2:798276211865:cloud-archive-client-sns", "Subject": "Amazon S3 Notification", "Message": "{\"Records\":[{\"eventVersion\":\"2.1\",\"eventSource\":\"aws:s3\",\"awsRegion\":\"us-east-2\",\"eventTime\":\"2020-12-14T20:56:08.725Z\",\"eventName\":\"ObjectRemoved:Delete\",\"userIdentity\":{\"principalId\":\"AX8TWPQYA8JEM\"},\"requestParameters\":{\"sourceIPAddress\":\"65.113.158.185\"},\"responseElements\":{\"x-amz-request-id\":\"D8059E6A1D53597A\",\"x-amz-id-2\":\"7DZF7MAaHztZqVMKlsK45Ogrto0945RzXSkMnmArxNCZ+4/jmXeUn9JM1NWOMeKK093vW8g5Cj5KMutID+4R3W1Rx3XDZOio\"},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"archive-testing-demo-event\",\"bucket\":{\"name\":\"archive-testing-demo\",\"ownerIdentity\":{\"principalId\":\"AX8TWPQYA8JEM\"},\"arn\":\"arn:aws:s3:::archive-testing-demo\"},\"object\":{\"key\":\"csv/file1.csv\",\"sequencer\":\"005FD7D1765F04D8BE\"}}}]}", "Timestamp": "2020-12-14T20:56:23.786Z", "SignatureVersion": "1", "Signature": "MB5P0H5R5q3zOFoo05lpL4YuZ5TJy+f2c026wBWBsQ7mbNQiVxAy4VbbK0U1N3YQwOslq5ImVjMpf26t1+zY1hoHoALfvHY9wPtc8RNlYqmupCaZgtwEl3MYQz2pHIXbcma4rt2oh+vp/n+viARCToupyysEWTvw9a9k9AZRuHhTt8NKe4gpphG0s3/C1FdvrpQUvxoSGVizkaX93clU+hAFsB7V+yTlbKP+SNAqP/PaLtai6aPY9Lb8reO2ZjucOl7EgF5IhBVT43HhjBBj4JqYBNbMPcId5vMfBX8qI8ANIVlGGCIjGo1fpU0ROxSHsltuRjkmErpxUEe3YJJM3Q==", "SigningCertURL": "https://sns.us-east-2.amazonaws.com/SimpleNotificationService-010a507c1833636cd94bdb98bd93083a.pem", "UnsubscribeURL": "https://sns.us-east-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:us-east-2:798276211865:cloud-archive-client-sns:461222e7-0abf-40c6-acf7-4825cef65cce" }
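As a rough sketch of how the delete handler could read a payload like the one above: the SQS body is an SNS envelope whose `Message` field is itself a JSON string containing the S3 `Records`. The function name `extract_deleted_objects` is hypothetical and is not taken from `S3MessageAdapter` or `SQSConsumer`. It assumes the SQS subscription does not use raw message delivery, so the envelope is present.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_deleted_objects(sqs_body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for every ObjectRemoved record in the body.

    Hypothetical helper: assumes an SNS envelope wrapping the S3 event.
    """
    envelope = json.loads(sqs_body)
    # The S3 event is a JSON document encoded as a string inside "Message".
    s3_event = json.loads(envelope["Message"])

    deleted = []
    for record in s3_event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectRemoved:"):
            continue  # skip create events and anything else
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        deleted.append((bucket, key))
    return deleted


if __name__ == "__main__":
    inner = {"Records": [{
        "eventName": "ObjectRemoved:Delete",
        "s3": {"bucket": {"name": "archive-testing-demo"},
               "object": {"key": "csv/file1.csv"}},
    }]}
    body = json.dumps({"Type": "Notification", "Message": json.dumps(inner)})
    print(extract_deleted_objects(body))
```

The bucket and key returned here are what the look-up of the IM UUID would be keyed on.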
To implement this, we need additional stories.
We need to capture the object key in a key/value store with the IM UUID as the value.
Then, when a delete event is triggered from s3, we can look up the UUID from the key/value store.
Use the search API to find the object and get the IM UUID.
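The key-to-UUID look-up described above could be prototyped roughly as follows. This is only an in-memory stand-in to illustrate the flow: the class `ObjectKeyIndex`, its method names, and the `s3://bucket/key` key format are all hypothetical, and a real implementation would sit on whatever store or search API is chosen in the design discussion.

```python
from typing import Dict, Optional


class ObjectKeyIndex:
    """In-memory stand-in for a key/value store mapping S3 objects to IM UUIDs.

    Hypothetical sketch: names and key format are illustrative only.
    """

    def __init__(self) -> None:
        self._uuid_by_object: Dict[str, str] = {}

    @staticmethod
    def _store_key(bucket: str, key: str) -> str:
        # Include the bucket so the same key in two buckets does not collide.
        return f"s3://{bucket}/{key}"

    def record_upload(self, bucket: str, key: str, im_uuid: str) -> None:
        """Remember the IM UUID assigned when the object was first published."""
        self._uuid_by_object[self._store_key(bucket, key)] = im_uuid

    def resolve_delete(self, bucket: str, key: str) -> Optional[str]:
        """Return the IM UUID for a deleted object and drop the mapping.

        Returns None when the object was never recorded.
        """
        return self._uuid_by_object.pop(self._store_key(bucket, key), None)


if __name__ == "__main__":
    index = ObjectKeyIndex()
    # Placeholder UUID for illustration, not a real IM record.
    index.record_upload("archive-testing-demo", "csv/file1.csv",
                        "00000000-0000-0000-0000-000000000001")
    print(index.resolve_delete("archive-testing-demo", "csv/file1.csv"))
```

Removing the mapping on delete is a design choice in this sketch. If delete events can be redelivered, keeping the entry (or marking it as deleted) may be safer.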