
Support for S3 compatible destinations (MinIO) in batch exports

Open tiina303 opened this issue 1 year ago • 0 comments

There are many S3-compatible destinations out there, MinIO in particular. The only thing required to configure the Python boto3 S3 client to work with MinIO is passing one additional configuration parameter (and potentially a couple more optional ones) at client initialization (for example, this is the first search result: https://gist.github.com/heitorlessa/5b709df96ea6ac5ddc600545c0683d3b).
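To make that concrete, here is a minimal sketch of a boto3 client pointed at MinIO. The endpoint URL, the helper names, and the credentials passed in are placeholders I made up for illustration, not values from our dev stack; the only boto3 call used is `boto3.client("s3", ...)` with its standard `endpoint_url`, `aws_access_key_id` and `aws_secret_access_key` arguments.

```python
# Assumed address of a local MinIO instance; adjust to your own setup.
MINIO_ENDPOINT_URL = "http://localhost:9000"


def minio_client_kwargs(access_key, secret_key, endpoint_url=MINIO_ENDPOINT_URL):
    """Build keyword arguments for boto3.client("s3", ...) targeting MinIO."""
    return {
        # The one addition compared to a plain AWS S3 client.
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_minio_client(access_key, secret_key, endpoint_url=MINIO_ENDPOINT_URL):
    """Create an S3 client that talks to MinIO instead of AWS."""
    import boto3  # imported lazily so the helper above works without boto3

    return boto3.client("s3", **minio_client_kwargs(access_key, secret_key, endpoint_url))
```

Everything else about the client (uploads, multipart, listing) should stay the same, which is why exposing this single parameter looks sufficient for MinIO.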

Some of these configuration parameters would have to be sourced from the user (such as endpoint_url, or whether TLS is used). But exposing just one (endpoint_url) would add support for MinIO destinations for free. Moreover, it would let us get rid of the mocking we do in tests that use MinIO (where we mock the S3 client to pass an endpoint_url pointing at our dev stack MinIO). Potentially, this could add support for other S3-compatible destinations too, but to keep the scope small let's focus on supporting MinIO by making local tests pass without mocking.

TODO:

  1. Expose new optional configuration parameters in S3BatchExportInputs (endpoint_url, plus verify or a similar option to indicate TLS support in MinIO).
  2. Add frontend fields for the new configuration parameters.
  3. Pass new configuration parameters to S3 client initialization if available.
  4. Remove mocking from S3 tests, instead passing endpoint_url directly to S3BatchExportInputs when running locally without S3 variables configured.
  5. ???
  6. Profit.
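A rough sketch of how steps 1, 3 and 4 could fit together. `S3InputsSketch` is a hypothetical stand-in for S3BatchExportInputs (the real class has more fields), and the `S3_TEST_BUCKET` variable name and the local MinIO URL are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass
class S3InputsSketch:
    """Hypothetical stand-in for S3BatchExportInputs; only the new fields matter."""

    bucket_name: str
    region: str
    # Step 1: new optional parameters, unset by default so AWS S3 is unaffected.
    endpoint_url: Optional[str] = None
    verify: Optional[bool] = None


def extra_client_kwargs(inputs: S3InputsSketch) -> Dict[str, Any]:
    """Step 3: forward the new parameters to the S3 client only if they are set."""
    extras: Dict[str, Any] = {}
    if inputs.endpoint_url is not None:
        extras["endpoint_url"] = inputs.endpoint_url
    if inputs.verify is not None:
        extras["verify"] = inputs.verify
    return extras


def endpoint_for_tests(
    env: Mapping[str, str],
    local_minio_url: str = "http://localhost:19000",  # assumed dev stack address
) -> Optional[str]:
    """Step 4: use local MinIO when no real S3 bucket is configured for tests."""
    return None if env.get("S3_TEST_BUCKET") else local_minio_url
```

In a test this would look something like `S3InputsSketch(bucket_name="test", region="us-east-1", endpoint_url=endpoint_for_tests(os.environ))`, with no client mocking needed.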

tiina303 · Feb 13 '24 16:02