Add capture spec: `outputConfiguration.s3Upload`

Open parkjeongryul opened this issue 1 year ago • 3 comments

Is your feature request related to a problem? Please describe.

In some environments, blob storage can be difficult to use. It would be nice to support S3 upload for captures.

Describe the solution you'd like

Add an `s3Upload` spec to the existing `outputConfiguration`:

spec.outputConfiguration: Indicates where the captured data will be stored. It includes the following properties:

  • blobUpload: Specifies a secret containing the blob SAS URL for storing the capture data.
  • hostPath: Stores the capture files into the specified host filesystem.
  • persistentVolumeClaim: Mounts a PersistentVolumeClaim into the Pod to store capture files.

s3Upload: Specifies an S3 upload URL for storing the capture data.

The s3Upload spec might require the following additional fields:

spec:
  outputConfiguration:
    s3Upload:
      endpoint: {{ s3-url }}
      bucket: {{ bucket_name }}
      accessKey: {{ access_key }}
      secretKey: {{ secret_key }}

Describe alternatives you've considered

Additional context

If this feature makes sense and I can be assigned to it, I'd like to work on implementing it.

parkjeongryul avatar Apr 01 '24 13:04 parkjeongryul

Thanks @parkjeongryul, we are definitely interested in the feature! I've assigned this issue to you and @spencermckee from our team.

I think that there's a bit of rework we need to do for #203 that could be common with your needs for this feature. Let's make sure we are aligned on the design here before we duplicate any work. How would you implement this for S3?

rbtr avatar Apr 01 '24 16:04 rbtr

CRD

Are you considering managing a single integrated ConfigMap or Secret for multiple output locations (azblob, S3), or do you plan to create a separate ConfigMap for each configuration?

If you decide on the latter, how about this spec for s3Upload?

apiVersion: retina.sh/v1alpha1
kind: Capture
metadata:
  name: example-capture
spec:
  captureConfiguration:
    captureOption:
      duration: "30s"
      maxCaptureSize: 100
      packetSize: 1500
    captureTarget:
      namespaceSelector:
        matchLabels:
          app: target-app
  outputConfiguration:
    s3Upload:
      endpoint: ""
      bucket: ""
      credentialsSecret:
        name: s3-credentials
        key: credentials

---

apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials
type: Opaque
stringData:
  credentials: |-
    [default]
    aws_access_key_id = your_access_key_id
    aws_secret_access_key = your_secret_access_key
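
To illustrate how the `credentials` value in the Secret above could be consumed, here is a minimal Go sketch that reads the two keys from a named profile using only the standard library. The names `s3Credentials` and `parseCredentials` are hypothetical and not part of Retina. The sketch also assumes the value follows the INI-like layout shown in the example.

```go
package main

import (
	"bufio"
	"errors"
	"strings"
)

// s3Credentials is a hypothetical holder for the two values stored in the Secret.
type s3Credentials struct {
	AccessKeyID     string
	SecretAccessKey string
}

// parseCredentials extracts aws_access_key_id and aws_secret_access_key
// from the given profile section of INI-style credentials text.
func parseCredentials(data, profile string) (s3Credentials, error) {
	var creds s3Credentials
	inProfile := false

	scanner := bufio.NewScanner(strings.NewReader(data))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		// Skip blank lines and comments.
		if line == "" || strings.HasPrefix(line, "#") || strings.HasPrefix(line, ";") {
			continue
		}
		// A "[name]" line starts a new profile section.
		if strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]") {
			inProfile = strings.TrimSpace(line[1:len(line)-1]) == profile
			continue
		}
		if !inProfile {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue
		}
		switch strings.TrimSpace(key) {
		case "aws_access_key_id":
			creds.AccessKeyID = strings.TrimSpace(value)
		case "aws_secret_access_key":
			creds.SecretAccessKey = strings.TrimSpace(value)
		}
	}
	if err := scanner.Err(); err != nil {
		return s3Credentials{}, err
	}
	if creds.AccessKeyID == "" || creds.SecretAccessKey == "" {
		return s3Credentials{}, errors.New("profile is missing aws_access_key_id or aws_secret_access_key")
	}
	return creds, nil
}
```

Calling `parseCredentials(secretValue, "default")` on the example Secret would yield the two placeholder values. A real implementation might instead rely on the chosen SDK's own shared-credentials loading rather than hand-parsing.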

Many more S3 upload options could be added over time, as in more sophisticated configurations such as Thanos, but for an initial implementation it would be fine to specify only bucket, endpoint, access_key, and secret_key.
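
As a rough sketch of that initial field set, the Go types below mirror the `s3Upload` block proposed above and add a simple validation step. The type names, JSON tags, and validation rules (non-empty bucket, absolute endpoint URL, both Secret fields set) are assumptions for illustration, not the actual Retina CRD types.

```go
package main

import (
	"errors"
	"net/url"
)

// SecretKeyRef is a hypothetical reference to one key inside a Secret.
type SecretKeyRef struct {
	Name string `json:"name"`
	Key  string `json:"key"`
}

// S3Upload mirrors the fields proposed in this comment; it is not the final API.
type S3Upload struct {
	Endpoint          string       `json:"endpoint"`
	Bucket            string       `json:"bucket"`
	CredentialsSecret SecretKeyRef `json:"credentialsSecret"`
}

// Validate applies assumed minimal rules before any upload is attempted.
func (s S3Upload) Validate() error {
	if s.Bucket == "" {
		return errors.New("s3Upload.bucket must be set")
	}
	// Assumption: the endpoint is given as an absolute URL such as https://host.
	u, err := url.Parse(s.Endpoint)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return errors.New("s3Upload.endpoint must be an absolute URL")
	}
	if s.CredentialsSecret.Name == "" || s.CredentialsSecret.Key == "" {
		return errors.New("s3Upload.credentialsSecret needs both name and key")
	}
	return nil
}
```

Keeping the credentials behind a Secret reference, as in the example spec, means the access key and secret key never appear in the Capture object itself.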

Implementation

For the S3 upload SDK (Go), I think we have two choices:

  • https://github.com/aws/aws-sdk-go
  • https://github.com/minio/minio-go

I think both are great libraries, and either is good enough for uploading captures.

parkjeongryul avatar Apr 02 '24 16:04 parkjeongryul

@parkjeongryul, appreciate the help here!

Your proposal for a separate ConfigMap per output location and the S3 config example look reasonable. As for the S3 upload SDK, we would prefer aws-sdk-go v2.

Feel free to get started with the implementation and let us know if you have any questions!

spencermckee avatar Apr 05 '24 19:04 spencermckee