
Compression between vector and ElasticSearch is not working for egress traffic

Open okorolov opened this issue 3 years ago • 11 comments

Vector Version

0.15.0

Vector Configuration File

    [sinks.es]
      type = "elasticsearch"
      inputs = ["es_timestamp"]
      healthcheck.enabled = true

      auth.strategy = "basic"
      auth.user = "---"
      auth.password = "---"
      endpoint = "---:9200"
      mode = "data_stream"
      data_stream.auto_routing = false
      data_stream.sync_fields = false
      data_stream.type = "---"
      data_stream.dataset = "logs"
      data_stream.namespace = "ds"

      compression = "gzip"

      batch.max_bytes = 50000000
      batch.timeout_secs = 1
      buffer.max_events = 10000000
      buffer.type = "memory"
      buffer.when_full = "block"

Expected Behavior

When we configure compression = "gzip" for the Elasticsearch sink, it is expected that compression is applied in both directions: ingress (requests from Vector to Elasticsearch) and egress (responses from Elasticsearch back to Vector).

Actual Behavior

Compression is applied only to ingress traffic; responses from Elasticsearch come back uncompressed.

Workaround

A custom request header applied on the Vector side fixes the situation, and the return traffic is then also compressed:

    [sinks.es]
      compression = "gzip"
      # response compression
      request.headers.Accept-Encoding = "gzip"
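
For anyone who wants to verify this behaviour outside of Vector, here is a minimal sketch (stdlib-only Python, assuming Elasticsearch is reachable at http://localhost:9200 with http.compression enabled and no authentication; adjust the URL and auth for your cluster). It issues the same request twice and prints the Content-Encoding of each response: Elasticsearch only gzips the response when the client advertises Accept-Encoding: gzip, which is exactly what the workaround header does for Vector.

    # Minimal sketch: check whether Elasticsearch compresses its responses
    # depending on the Accept-Encoding request header.
    # Assumes ES at http://localhost:9200 with http.compression enabled and no auth.
    import urllib.request

    URL = "http://localhost:9200/_cluster/health"  # any cheap endpoint works

    def response_encoding(accept_encoding=None):
        req = urllib.request.Request(URL)
        if accept_encoding:
            req.add_header("Accept-Encoding", accept_encoding)
        with urllib.request.urlopen(req) as resp:
            # urllib does not transparently decompress, so this header reflects
            # what actually came over the wire.
            return resp.headers.get("Content-Encoding", "identity")

    print("without Accept-Encoding:", response_encoding())
    print("with Accept-Encoding: gzip:", response_encoding("gzip"))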

Additional Context

ES version: 7.13.1
Also tested with Vector 0.12.2

Compression is clearly enabled on all ES nodes.

It might be that Vector is not setting its request headers correctly when compression is enabled. It seems that ES expects to receive a specific header (Accept-Encoding) instructing it to compress the reply traffic (egress): https://discuss.elastic.co/t/http-compression-enabled-but-response-not-compressed/103513/7

Ingress traffic rates when we apply compression: [screenshot: inbound_traffic_es]
Egress traffic rates when we apply compression: [screenshot: outbound_traffic_es]
Egress traffic rates when we apply custom headers and compression: [screenshot: outbound_traffic_es_headers]

okorolov avatar Jul 23 '21 15:07 okorolov

Thanks @okorolov. It does appear that we should be including that header when compression = "gzip".

jszwedko avatar Jul 26 '21 19:07 jszwedko

@jszwedko Sorry to be a nuisance, but when you remove an issue from a milestone, does it mean the fix was included? Or does it mean it gets postponed until further notice? Thank you very much.

smlgbl avatar Nov 17 '21 15:11 smlgbl

Hey @smlgbl! No worries. No, unfortunately in this case it meant that it was deprioritized. I could see us trying to get it into 0.19.0 though. I'll add it to that milestone.

jszwedko avatar Nov 17 '21 15:11 jszwedko

That would be great, even though the workaround mentioned above is rather trivial. Thanks for clarifying though.

smlgbl avatar Nov 17 '21 15:11 smlgbl

@jszwedko Just as a side note: even if you don't get around to fixing the bug, at least put it into the docs. We have our on-premise apps log to an OpenSearch cluster at AWS, and our outbound-traffic costs were almost $200 per day. After setting the response header field, it's down to $6 per day!

smlgbl avatar Nov 19 '21 09:11 smlgbl

Oh wow, that is substantial. Thanks for the note @smlgbl. We'll prioritize fixing this for the next release.

jszwedko avatar Nov 19 '21 14:11 jszwedko

Hi @jszwedko

The workaround with a custom header has a significant downside.

Since Vector doesn't expect responses to be gzipped, it doesn't try to read the response from Elasticsearch, so it stays silent even when the response contains an error (the ES bulk API can respond with HTTP 200 but still report errors in the body):

2022-04-16T19:18:09.949786Z DEBUG sink{component_kind="sink" component_id=es component_type=elasticsearch component_name=es}:request{request_id=0}:http: vector::internal_events::http_client: HTTP response. status=200 OK version=HTTP/1.1 headers={"content-type": "application/json; charset=UTF-8", "content-encoding": "gzip", "content-length": "286"} body=[286 bytes]

I spent hours trying to figure out why I had only half of the logs. Once I removed the custom headers, I saw that Elasticsearch had some problems with dynamic mapping:

2022-04-16T19:20:52.998750Z ERROR sink{component_kind="sink" component_id=es component_type=elasticsearch component_name=es}:request{request_id=1}: vector::internal_events::elasticsearch: Response containerd errors. error_code=http_response_200 error_type="request_failed" stage="sending" response=Response { status: 200, version: HTTP/1.1, headers: {"content-type": "application/json; charset=UTF-8", "content-length": "375612"}, body: b"{\"took\":54,\"ingest_took\":50,\"errors\":true,\"items\":[{\"index\":{\"_index\":\"<index>\",\"_type\":\"_doc\",\"_id\":\"bAzSM4ABck9U0YcrKYIr\",\"status\":400,\"error\":{\"type\":\"mapper_parsing_exception\",\"reason\":\"Could not dynamically add mapping for field [app.kubernetes.io/managed-by]. Existing mapping for [kubernetes.pod_labels.app] must be of type object but found [text].\"}}}, <many similar errors>]}" }
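
For reference, the failed items are visible in the bulk response body itself, whether or not it is compressed. Below is a minimal, stdlib-only Python sketch (check_bulk_response is a hypothetical helper, not part of Vector or any ES client) showing how the per-item errors could be surfaced from a possibly gzipped _bulk response like the one above:

    # Minimal sketch: decode a (possibly gzip-compressed) Elasticsearch _bulk
    # response and list the per-item errors. The bulk API returns HTTP 200 even
    # when individual items failed; only the "errors" flag and per-item "error"
    # objects reveal the failures.
    import gzip
    import json

    def check_bulk_response(body, content_encoding=""):
        if content_encoding == "gzip":
            body = gzip.decompress(body)
        parsed = json.loads(body)
        failures = []
        if parsed.get("errors"):
            for item in parsed.get("items", []):
                # Each item has a single key: "index", "create", "update", or "delete".
                action = next(iter(item.values()))
                if "error" in action:
                    failures.append(f'{action.get("status")}: {action["error"].get("reason")}')
        return failures

    sample = (b'{"took":5,"errors":true,"items":[{"index":{"status":400,'
              b'"error":{"type":"mapper_parsing_exception",'
              b'"reason":"Could not dynamically add mapping"}}}]}')
    for failure in check_bulk_response(sample):
        print(failure)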

vpedosyuk avatar Apr 16 '22 19:04 vpedosyuk

Thanks for bumping this @vpedosyuk. It fell off our radar. I've added this into our backlog.

jszwedko avatar Apr 19 '22 20:04 jszwedko

PR #13571 was incomplete; it has been reverted until we have support for reading compressed responses.

neuronull avatar Jul 18 '22 16:07 neuronull

@neuronull By the way, are there plans to fix it anywhere in the near future?

zamazan4ik avatar Sep 16 '22 17:09 zamazan4ik

@neuronull By the way, are there plans to fix it anywhere in the near future?

Q4 priorities are still being finalized, but decently high on the list is dedicating a number of weeks to addressing technical debt that we've been unable to get to, such as this issue. :pray:

neuronull avatar Sep 16 '22 22:09 neuronull