apm-agent-rum-js

A TraceState HTTP header is being submitted with no value

Open jclusso opened this issue 3 years ago • 8 comments

We're having an issue where the TraceState header is being submitted with no value. This is breaking API requests with third-party integrations of ours. If I disable distributedTracing in the RUM agent, the problem goes away. I'm not sure how to provide more information on this.

jclusso avatar May 23 '22 15:05 jclusso

Hi @jclusso,

Thanks for using the RUM agent and investing your time in opening this issue!

TraceState header is being submitted with no value.

Is something like "tracestate:" the thing that you are seeing in the request headers of the network request to the apm intake api? Could you share a screenshot showing it?

This is breaking API requests with third party integrations of ours

How are you noticing the problems with the third-party API requests? What is the symptom?

Unless it is confidential, could you share with us the name of the third party that is failing? And by the way, is your website accessible?

I also have a few questions related to versioning and configuration:

  • What version of the RUM agent and elastic stack are you using?
  • Could you share the agent initialisation snippet?

And last but not least, by default, the RUM agent does not propagate the tracestate HTTP header to the configured origins. More info here: https://www.elastic.co/guide/en/apm/agent/rum-js/current/distributed-tracing-guide.html#enable-tracestate

Thanks, Alberto

devcorpio avatar May 23 '22 16:05 devcorpio

@devcorpio I actually have no idea what is being sent in the request. We were getting odd errors in our integration with AWeber. We make requests to get lists from AWeber in a controller, and after we reached out to them about the odd errors we were receiving, they informed us that the TraceState header was apparently being sent with no value.

Hi Jarrett, We were able to track down the cause of the issue.
On certain API requests, the client is setting a TraceState HTTP header but is omitting the value. The pattern seems to be that this happens for requests to the /accounts/<account ID>/lists endpoint specifically, but it's possible that other requests could be experiencing this as well.

The quickest fix would be to either omit the header entirely, or make sure it has a value when it's present in the request.

Since this only happens in some of your environments, it may be an artifact of testing that has escaped to production.

Let me know if you have any questions about this. I'll be happy to help!

As far as things go this is for https://app.emailable.com. You can sign up for a free account and connect an AWeber account. After you connect the account you'd want to try and import a list in our bulk tool from it. You can do all of this without actually spending any money in our platform.

RUM: 5.11.1 Elastic APM Ruby: 4.5.0 Stack: 8.2.0

jclusso avatar May 24 '22 01:05 jclusso

Hi @jclusso,

I'm assuming that you are setting propagateTracestate to true when initialising the agent, is that right?

I actually have no idea what is being sent in the request.

If you look for the request to the third-party API in your Chrome DevTools (Network tab), you will be able to see the request headers that are being sent. Would it be possible to get a screenshot? (ofc, hiding the confidential data that you might have)

On the other hand, the RUM agent doesn't include the header if there is no value, as you can see here

So I'm wondering if your issue is in the backend, and I would need more details related to the issue:

  • Is the issue happening on the browser side? (RUM agent realm)
  • Is the issue happening in any of your Ruby services? (APM Ruby Realm)

Please, if possible, also share with us the initialisation snippet of both agents.

All this information is essential to figure out the cause of your issue.

Many thanks, Alberto

devcorpio avatar May 30 '22 14:05 devcorpio

@devcorpio we are not setting that option. Currently this is how we initialize it.

initApm({
  serviceName: 'Application Name',
  environment: process.env.RAILS_ENV,
  serverUrl: @data.get('apm-server-url'),
  serviceVersion: @data.get('version'),
  active: (process.env.RAILS_ENV != 'development'),
  transactionSampleRate: sampleRate,
  distributedTracing: false
})

We added distributedTracing: false to stop the error from happening.

If you look for the request to the third-party API in your Chrome DevTools (Network tab), you will be able to see the request headers that are being sent. Would it be possible to get a screenshot? (ofc, hiding the confidential data that you might have)

You can't see the request to the third party API because it is a request we make in Ruby using Faraday. I don't believe this issue happens anywhere else. The issue only started when we added the RUM Agent.

As for the code for Ruby, we use Rails and it is auto initialized. We only set options for the following items.

  • transaction_sample_rate
  • stack_trace_limit
  • span_frames_min_duration
  • ignore_url_patterns

jclusso avatar May 31 '22 16:05 jclusso

Hi @jclusso,

Thanks for the additional details!

-- We have been able to reproduce the situation where the tracestate header is sent with an empty value, which means we can now think about possible solutions for the issue.

Let me add a bit of context:

There are two headers used for distributed tracing:

  • traceparent
  • tracestate (this one is optional even if you have the first one set)

The RUM agent sends traceparent by default for same-origin requests. It is also possible to add that header for cross-origin requests, but for that you need to use this config and configure your server

It is also important to mention that the tracestate header is not sent by default. The reason is that the logic for it was added long after traceparent, and sending it by default would have caused cross-origin requests to fail all of a sudden. That's why the propagateTracestate config is available.

With that context in mind, let me explain then why the Ruby agent is sending the tracestate header with an empty value:

Imagine having an architecture like the one you can see in the screenshot below:

[Image: architecture]

Important to keep in mind: APM Ruby agent also sends distributed tracing headers by default.

There are 3 scenarios, the one that is causing the issue is the 3rd one.

Scenario 1 (OK): RUM agent with tracing disabled:

Flow:

  1. Website sends a request to the Ruby API without the headers
  2. Ruby API sends a request to the Node.js API with both headers
  3. The Node.js API receives a request containing both headers: traceparent and tracestate, both of them with values.

Scenario 2 (OK): RUM agent with tracing enabled (including the tracestate propagation):

Flow:

  1. Website sends a request to the Ruby API with the headers
  2. Ruby API takes into account that those headers exist. Hence, it does some logic with them and sends a request to the Node.js API
  3. The Node.js API receives a request containing both headers: traceparent and tracestate, both of them with values.

Scenario 3 (KO): RUM agent with tracing enabled (BUT with the tracestate propagation DISABLED):

Flow:

  1. Website sends a request to the Ruby API with only the header traceparent
  2. Ruby API sees that the traceparent header exists. And because of that, it assumes that tracestate also is there (which is not true), and sends a request to the Node.js API including both headers, in this case tracestate value is empty
  3. The Node.js API receives a request containing both headers: traceparent and tracestate, the former with a value and the latter empty. We can see this in the screenshot below:
[Image: Screenshot 2022-06-01 at 18 28 09]
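To make Scenario 3 concrete, here is a toy model of the flow. It is only a sketch of the behaviour described above, not the Ruby agent's actual code: `naiveForward` is a hypothetical function, and the traceparent value is just an illustrative example.

```javascript
// Toy model of Scenario 3. This is NOT the Ruby agent's implementation;
// `naiveForward` is a hypothetical stand-in for a service that copies the
// tracing context from an incoming request onto an outgoing one.
function naiveForward(incomingHeaders) {
  const outgoing = {};
  if ('traceparent' in incomingHeaders) {
    outgoing.traceparent = incomingHeaders.traceparent;
    // Assumes tracestate always accompanies traceparent, so a missing
    // tracestate silently becomes an empty string.
    outgoing.tracestate = incomingHeaders.tracestate ?? '';
  }
  return outgoing;
}

// RUM agent with distributedTracing enabled but propagateTracestate disabled:
// only traceparent reaches the backend.
const fromBrowser = {
  traceparent: '00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01',
};

console.log(naiveForward(fromBrowser));
```

In this model the downstream request carries `tracestate` with an empty string as its value, which is the shape of request the third party rejected.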

A possible way to solve this is to add a check in the APM Ruby agent that says "if tracestate doesn't have a value, don't send it". That is the approach being followed in the Node.js APM agent: https://github.com/elastic/apm-agent-nodejs/issues/2405#issuecomment-969473613.

Extracted from that issue:

As to the HTTP spec, my read of the ABNF at https://datatracker.ietf.org/doc/html/rfc7230#section-3.2 is that an empty header value is explicitly allowed. However, this is about supporting picky proxies/servers that do not like empty header values.
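The "if tracestate doesn't have a value, don't send it" check could look roughly like the sketch below. This is an assumption about the shape of such a guard, not code from either agent; `withoutEmptyTracestate` is a hypothetical helper and `es=s:1` is only an example value.

```javascript
// Hypothetical guard: drop the tracestate header when it has no usable value,
// so picky proxies/servers never see an empty header.
function withoutEmptyTracestate(headers) {
  const cleaned = { ...headers }; // copy so the caller's object is not mutated
  const value = cleaned.tracestate;
  if (value === undefined || value === null || String(value).trim() === '') {
    delete cleaned.tracestate;
  }
  return cleaned;
}

// Empty value: the header is omitted entirely.
console.log(withoutEmptyTracestate({ traceparent: 'tp', tracestate: '' }));

// Non-empty value: the header is kept as-is.
console.log(withoutEmptyTracestate({ traceparent: 'tp', tracestate: 'es=s:1' }));
```

Since tracestate is optional even when traceparent is set, omitting it keeps the trace linked while avoiding the empty header.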

--

Hello @elastic/apm-agent-ruby,

Could you help us with this issue?

Is there anything else that you need from us before deciding to tackle this? (E.g. more details, a ticket to be created, etc.)

Thanks, Alberto

devcorpio avatar Jun 01 '22 16:06 devcorpio

@devcorpio thanks for that detailed explanation. It is pretty silly that picky servers will throw errors for this. I sort of understand what you've said, but I don't know which configuration options on the RUM and Ruby agents correspond to the "OK" scenarios.

jclusso avatar Jun 01 '22 21:06 jclusso

Hi @jclusso,

The default configuration causes the "KO" scenario, so from the RUM perspective you only have two ways of overcoming this:

  • Scenario 1: disable tracing by setting distributedTracing to false (as you are already doing)
  • Scenario 2: enable tracing again, making sure that you set propagateTracestate to true as well.

devcorpio avatar Jun 02 '22 09:06 devcorpio
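As a rough sketch, the two options map onto RUM agent configs like the ones below. The `serviceName` and `serverUrl` values are placeholders, and the example assumes same-origin requests to the Ruby API; the `initApm` call is left in a comment so the snippet runs without the agent installed.

```javascript
// Placeholder values; replace with your own service name and APM server URL.
const base = {
  serviceName: 'my-frontend',
  serverUrl: 'https://apm.example.com',
};

// Scenario 1: no tracing headers are sent from the browser at all.
const tracingDisabled = {
  ...base,
  distributedTracing: false,
};

// Scenario 2: traceparent AND tracestate are sent from the browser.
const tracingWithTracestate = {
  ...base,
  distributedTracing: true,
  propagateTracestate: true,
};

// With the real agent this would be passed to init, e.g.:
//   import { init as initApm } from '@elastic/apm-rum'
//   initApm(tracingWithTracestate)
console.log(tracingDisabled, tracingWithTracestate);
```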

FWIW I am working on a draft PR to fix this on the APM Agent Ruby end also 👍

jaggederest avatar Jun 02 '22 10:06 jaggederest