
OpenTelemetry input plugin not working with Envoy Proxy - throws HTTP 400 Error

Open dceravigupta opened this issue 2 years ago • 5 comments

Bug Report

Describe the bug

I'm trying to use the opentelemetry input plugin (HTTP/Protobuf) to capture traces from Envoy Proxy, but Fluent Bit responds with a 400 Bad Request error whenever Envoy sends a trace.


To Reproduce

  • Here is the Envoy configuration I'm using:
static_resources:
  listeners:
    - address:
        socket_address:
          address: "0.0.0.0"
          port_value: 30443
      filter_chains:
        - filters:
            - name: "envoy.filters.network.http_connection_manager"
              typed_config:
                "@type": "type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager"
                codec_type: "auto"
                stat_prefix: "ingress_https"
                tracing:
                  provider:
                    name: "envoy.tracers.opentelemetry"
                    typed_config:
                      "@type": "type.googleapis.com/envoy.config.trace.v3.OpenTelemetryConfig"
                      http_service:
                        http_uri:
                          uri: "http://127.0.0.1:5317/v1/traces"
                          cluster: "opentelemetry_collector"
                          timeout: "0.250s"
                route_config:
                  virtual_hosts:
                    - name: "backend"
                      domains:
                        - "*"
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: "Test"
                            auto_host_rewrite: true
                            timeout: "0s"
                http_filters:
                  - name: "envoy.filters.http.router"
                    typed_config:
                      "@type": "type.googleapis.com/envoy.extensions.filters.http.router.v3.Router"
          transport_socket:
            name: "envoy.transport_sockets.tls"
            typed_config:
              "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext"
              common_tls_context:
                tls_certificates:
                  - certificate_chain:
                      filename: "localhost.crt"
                    private_key:
                      filename: "localhost.key"
                tls_params:
                  tls_maximum_protocol_version: "TLSv1_3"
                  tls_minimum_protocol_version: "TLSv1_2"
  clusters:
    - name: "Test"
      connect_timeout: "30.0s"
      type: "STRICT_DNS"
      dns_lookup_family: "V4_ONLY"
      lb_policy: "ROUND_ROBIN"
      load_assignment:
        cluster_name: "Test"
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: "google.com"
                      port_value: 443
      transport_socket:
        name: "envoy.transport_sockets.tls"
        typed_config:
          "@type": "type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext"
    - name: "opentelemetry_collector"
      type: "STRICT_DNS"
      lb_policy: "ROUND_ROBIN"
      load_assignment:
        cluster_name: "opentelemetry_collector"
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: "127.0.0.1"
                      port_value: 5317
admin:
  address:
    socket_address:
      address: "0.0.0.0"
      port_value: 8001
layered_runtime:
  layers:
    - name: "admin_layer_0"
      admin_layer: {}


  • This configuration requires Envoy version 1.28 or above.

Your Environment

  • Version used: Fluent Bit 2.1.10
  • Environment name and version (e.g. Kubernetes? What version?): Virtual Machine
  • Server type and version: Envoy Proxy 1.28
  • Operating System and version: Ubuntu 20.04 LTS
  • Filters and plugins: in_opentelemetry

dceravigupta • Nov 16 '23 18:11

@dceravigupta would you please provide more steps and the Fluent Bit config needed to reproduce the problem?

Preparing the repro case is usually what takes us the most time, so it would be great if you could provide one.

edsiper • Jan 22 '24 23:01

@edsiper here is my fluent bit config:

[INPUT]
    name   opentelemetry
    listen 127.0.0.1
    port   5317
    tag    opentelemetry.3

[OUTPUT]
    name  stdout
    match *

Let me try to repro it again with the latest Fluent Bit, and I will share the repro steps with all the configs soon. In the meantime, you can also check out a similar issue reported by other folks: https://github.com/fluent/fluent-bit/issues/7678
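
One way to narrow this down without Envoy in the loop is to send a minimal OTLP/HTTP protobuf request straight to the input plugin and look at the status code. The following is only a sketch under stated assumptions: the URL mirrors the [INPUT] section above, the two-byte body is a hand-encoded `ExportTraceServiceRequest` holding a single empty `ResourceSpans`, and `build_probe` / `send_probe` are hypothetical helper names. How Fluent Bit responds to such an empty payload has not been verified here.

```python
import urllib.error
import urllib.request

# Assumption: matches the [INPUT] section above (listen 127.0.0.1, port 5317).
FLUENT_BIT_URL = "http://127.0.0.1:5317/v1/traces"

# Hand-encoded ExportTraceServiceRequest with one empty ResourceSpans:
# 0x0a = field 1 (resource_spans), wire type 2 (length-delimited); 0x00 = length 0.
MINIMAL_OTLP_BODY = b"\x0a\x00"


def build_probe(url: str = FLUENT_BIT_URL) -> urllib.request.Request:
    """Build (but do not send) a minimal OTLP/HTTP protobuf trace export."""
    return urllib.request.Request(
        url,
        data=MINIMAL_OTLP_BODY,
        headers={"Content-Type": "application/x-protobuf"},
        method="POST",
    )


def send_probe(req: urllib.request.Request, timeout: float = 2.0) -> int:
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code is still the value of interest.
        return err.code


# Usage (requires Fluent Bit to be running with the config above):
#   print(send_probe(build_probe()))
```

If this probe is accepted while Envoy's export is rejected, the difference is likely in how Envoy frames or encodes its request rather than in the listener setup.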

dceravigupta • Jan 23 '24 02:01

@edsiper please download the fluentbitenvoy.zip attachment and extract it into a directory on a Linux machine (I tried on Ubuntu).

Here are the repro steps:

  1. Open a bash terminal and navigate to the directory containing all the files.
  2. Install and launch Fluent Bit with the config in the attachment:

sudo apt-get install fluent-bit
/opt/fluent-bit/bin/fluent-bit -c fluentbit.conf

  3. In a second bash terminal, launch Envoy with the provided config:

./envoy_1_28 -l debug -c envoyconfig.yml

  4. Even without hitting the Envoy endpoint, you can see the following errors in its logs:

image

  5. If you don't see any errors, hit the following endpoint with curl from a third terminal:

curl -k https://localhost:30443/todos/1

fluentbitenvoy.zip
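
To see what Envoy actually sends on the wire, a throwaway HTTP server can stand in for Fluent Bit on the same port and record each request. This is a minimal sketch using only the Python standard library; `CaptureHandler`, `start_capture_server`, and `captured` are hypothetical names, and the port assumes the 5317 endpoint from the configs above. It always answers 200 and does not parse the protobuf body.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = []  # one dict per POST request received


class CaptureHandler(BaseHTTPRequestHandler):
    """Record each POST's path, key headers, and body size, then answer 200."""

    def do_POST(self):
        # Only bodies announced via Content-Length are read; for a chunked
        # request body_bytes stays 0 and transfer_encoding shows "chunked".
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        captured.append({
            "path": self.path,
            "content_type": self.headers.get("Content-Type"),
            "transfer_encoding": self.headers.get("Transfer-Encoding"),
            "body_bytes": len(body),
        })
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence the default per-request logging


def start_capture_server(host="127.0.0.1", port=5317):
    """Start the capture server in a background thread and return it."""
    server = HTTPServer((host, port), CaptureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Stop Fluent Bit first so the port is free, call `start_capture_server()` from an interactive Python session (the thread is a daemon, so the process must stay alive), launch Envoy as in the steps above, and then inspect `captured`. Comparing the recorded path and `Content-Type` with what the opentelemetry input expects may help explain the 400.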

dceravigupta • Jan 25 '24 02:01

@edsiper did you get a chance to look at the repro? Let me know if you need any more data points from my side. Thanks!

dceravigupta • Feb 09 '24 23:02

might be relevant: https://github.com/fluent/fluent-bit/issues/8742

edsiper • Apr 24 '24 17:04

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] • Jul 24 '24 01:07

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] • Jul 30 '24 01:07