
Support overriding gcp_cloud_storage API URL (for local dev/test)

Open movinfinex opened this issue 1 year ago • 2 comments

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

The GCP Cloud Storage sink currently only supports writing to a real bucket at https://storage.googleapis.com/. For local development and integration testing, it's useful to be able to point the sink at some kind of emulator instead (e.g., https://github.com/fullstorydev/emulators or https://github.com/fsouza/fake-gcs-server).

Specifically, I need some way to make BASE_URL a configurable value, not just a constant.

Attempted Solutions

I think the only alternative is to use something like mitmproxy and get Vector to use a custom cert for https://storage.googleapis.com. I could do this for myself if I had to, but it's difficult to deploy across a team.

Proposal

One option would be to support Google's own STORAGE_EMULATOR_HOST environment variable. Another option would be something like the endpoint parameter used in the datadog_logs sink.

Either way, I'm willing to write a PR.
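
To make the two options concrete, here is a minimal Python sketch of how a base URL could be resolved. It is an illustration only, not Vector's code (Vector is written in Rust). The function name `resolve_base_url`, the precedence order (explicit endpoint first, then STORAGE_EMULATOR_HOST, then the default), and the rule that a bare `host:port` value means plain HTTP are all assumptions made for the example.

```python
import os

# Default target named in the issue; everything else here is hypothetical.
DEFAULT_BASE_URL = "https://storage.googleapis.com/"


def resolve_base_url(endpoint=None, environ=None):
    """Return the base URL to use for storage requests.

    Assumed precedence: explicit endpoint, then the STORAGE_EMULATOR_HOST
    environment variable, then the public default.
    """
    environ = os.environ if environ is None else environ
    candidate = endpoint or environ.get("STORAGE_EMULATOR_HOST")
    if not candidate:
        return DEFAULT_BASE_URL
    # Assumption: a value without a scheme (e.g. "localhost:9000") is a
    # local emulator reachable over plain HTTP.
    if "://" not in candidate:
        candidate = "http://" + candidate
    # Normalize to exactly one trailing slash, matching the default.
    return candidate.rstrip("/") + "/"
```

With nothing set, the function falls back to the public URL; an explicit endpoint wins over the environment variable, so a config value is never silently overridden.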

References

No response

Version

vector 0.40.0 (aarch64-unknown-linux-gnu 1167aa9 2024-07-29 15:08:44.028365803)

movinfinex, Aug 23 '24 00:08

Thanks @movinfinex ! This request makes sense. We expose a similar parameter for the aws_s3 sink: https://vector.dev/docs/reference/configuration/sinks/aws_s3/#endpoint.

jszwedko, Aug 23 '24 22:08

An endpoint parameter looks like the most discoverable option. (Users can still reference the STORAGE_EMULATOR_HOST environment variable in their configs if they really want to.)

I've tried out the code in the PR. Vector uses the GCS XML API, which isn't well supported by emulators (yet), but a mock that returns a 200 response to any HTTP request is enough to keep the sink happy. (The sink does a HEAD request to /bucketname/ for the health check, and a PUT to /bucketname/object/key to upload objects.)
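
For anyone who wants to reproduce this, a mock of the kind described above can be sketched with Python's standard library. This is not the mock used in the test above, just a minimal example under stated assumptions: the names `AlwaysOK` and `make_server` are hypothetical, and the server answers every request with an empty 200 response without storing anything.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class AlwaysOK(BaseHTTPRequestHandler):
    """Replies 200 with an empty body to every request; stores nothing."""

    def _ok(self):
        # Read any request body (e.g. an uploaded object) before replying.
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # HEAD covers the health check and PUT covers uploads; the other
    # methods are included so that any request gets the same answer.
    do_HEAD = do_GET = do_PUT = do_POST = do_DELETE = _ok

    def log_message(self, fmt, *args):
        pass  # keep test output quiet


def make_server(host="127.0.0.1", port=0):
    """Create the mock server; port 0 lets the OS pick a free port."""
    return ThreadingHTTPServer((host, port), AlwaysOK)
```

Calling `make_server(port=4443).serve_forever()` (the port is an arbitrary choice) starts the mock, and the sink's endpoint can then be pointed at `http://127.0.0.1:4443`. Because nothing is stored, this only verifies that the sink issues its requests, not what it uploads.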

ghost, Aug 27 '24 05:08