How can I cache spans and prevent exports?

Open yuanman0109 opened this issue 3 years ago • 8 comments
NB: Before opening a feature request against this repo, consider whether the feature should/could be implemented in the other OpenTelemetry client libraries. If so, please open an issue on opentelemetry-specification first.

Is your feature request related to a problem? Please describe.

In the browser, I use OTLPTraceExporter with BatchSpanProcessor. Once configured, spans are exported automatically. But sometimes I want to cache the spans first, wait for some asynchronous events to return, and then merge the asynchronous data into the spans before sending them to the server. I would expect an interface method that controls when the export happens. I don't know whether the current version provides one.

Additional context

image

yuanman0109 avatar Nov 03 '22 08:11 yuanman0109

Do you want to merge any data into ended spans?

How do you identify/search these spans?

Flarna avatar Nov 03 '22 08:11 Flarna

Yes, I would like all spans to be cached for the time being rather than sent immediately. For example, I want to add the user's userId to resource.attributes once the asynchronous user-information request returns, and only then send the spans. Otherwise, the userId will be missing from the data collected before that request responds. At present, we are trying to work around this by setting maxExportBatchSize to 0, but we found that this causes a memory overflow. I don't know what else to do.

yuanman0109 avatar Nov 03 '22 09:11 yuanman0109

According to the spec, a resource is immutable. A resource is also shared across all spans and potentially across other signals like metrics/logs.

To me this looks like a quite problematic use case. Providing general hooks to postprocess spans held by a span processor doesn't sound good to me. E.g. there might be other span processors installed that share the span instances but use no caching or a different sort of caching, so a single span might be exported more than once with different sets of attributes.

On the other hand, it's most likely valid in your setup. Therefore I think the easiest way is to implement your own SpanProcessor and tune it for your needs.
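A custom processor along these lines could be sketched as a gate placed in front of another processor. This is a minimal sketch, not SDK code: the four methods `onStart`, `onEnd`, `forceFlush` and `shutdown` mirror the SDK's SpanProcessor interface, while the class name `GatedSpanProcessor`, the `open()` method, the `enrich` callback and the `maxBuffered` limit are all hypothetical. Whether an ended span can still be modified inside `enrich` depends on the SDK version and is an assumption here.

```javascript
// Sketch only: holds ended spans until open() is called, then forwards
// them to a delegate processor (for example a BatchSpanProcessor).
class GatedSpanProcessor {
  constructor(delegate, maxBuffered = 1000) {
    this._delegate = delegate
    this._maxBuffered = maxBuffered // hypothetical cap to bound memory use
    this._buffer = []
    this._open = false
  }

  onStart(span, parentContext) {
    this._delegate.onStart(span, parentContext)
  }

  onEnd(span) {
    if (this._open) {
      this._delegate.onEnd(span)
      return
    }
    // Drop the oldest held span rather than grow without bound.
    if (this._buffer.length >= this._maxBuffered) this._buffer.shift()
    this._buffer.push(span)
  }

  // Hypothetical method: call once the asynchronous data has arrived.
  // `enrich` is an optional callback invoked with each held span.
  open(enrich) {
    this._open = true
    const held = this._buffer
    this._buffer = []
    for (const span of held) {
      if (enrich) enrich(span)
      this._delegate.onEnd(span)
    }
  }

  forceFlush() {
    return this._delegate.forceFlush()
  }

  shutdown() {
    this._buffer = [] // held spans are discarded on shutdown in this sketch
    return this._delegate.shutdown()
  }
}
```

The gate would be registered in place of the BatchSpanProcessor it wraps, so that no second processor sees the same spans earlier. The buffer cap is there because holding every span indefinitely is what leads to the memory growth described above.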

Flarna avatar Nov 03 '22 10:11 Flarna

The reason for this idea is that we have run into some problems. We usually initialize first and register the required plug-ins. At that point, collection starts immediately, and the collected data is sent over HTTP. However, the data contains no user information, because the user-information interface is asynchronous. How can we add user information to the resource as a common parameter? In addition, data sent earlier is treated as invalid by the server if it lacks this user information. I would prefer the BatchSpanProcessor to provide a switch to control when to send. If there is no plan for that, I can only implement a BatchSpanProcessor myself.

yuanman0109 avatar Nov 03 '22 12:11 yuanman0109

I understand your use case, but as of now the spec is quite clear that resources are immutable.

There were some attempts to change this, but as of now that has not happened.

As of now, the sequence is resource detection first and then instantiation of the TracerProvider. You can add any attributes you want to the resource at this time, as Resource is an optional parameter to the constructor of TracerProvider. But clearly this means no spans before resource detection is finished.

I prefer the BatchSpanProcessor to provide a switch to control when to send.

As I understand it, that's not enough. You additionally need access to the stored spans to mutate them. Currently BatchSpanProcessor holds them in a private field as ReadableSpan.

If there is no plan, I can only implement a BatchSpanProcessor myself

That's true. Your request goes against the OTel spec, so you should not expect the SDK to implement it.

You could propose an OTel spec change for this in the spec repo.

Flarna avatar Nov 03 '22 13:11 Flarna

If that is what the OTel specification says, we should follow it. But the main problem now is that we rely on the user ID for data analysis and queries. However, this ID is obtained asynchronously on the browser side and is not yet available when the TracerProvider is initialized, so the data sent has no ID. If I initialize the TracerProvider after getting the ID, some early data (such as document loading) cannot be collected.

yuanman0109 avatar Nov 03 '22 15:11 yuanman0109

This is for sure a gap in the spec that the project is aware of. I think right now the recommendation would be to use a span attribute.
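One way to apply the span-attribute recommendation is a small span processor that stamps the ID onto each span as it starts. This is a sketch under assumptions: `UserIdSpanProcessor` and `setUserId` are hypothetical names, and the attribute key `enduser.id` is an illustrative choice. The method names follow the SDK's SpanProcessor interface, and `span.setAttribute` is the standard Span API call.

```javascript
// Sketch only: adds the user ID as a span attribute once it is known.
class UserIdSpanProcessor {
  constructor() {
    this._userId = undefined
  }

  // Hypothetical setter: call when the asynchronous lookup returns.
  setUserId(userId) {
    this._userId = userId
  }

  onStart(span, _parentContext) {
    // Spans are still writable here, unlike in onEnd.
    if (this._userId !== undefined) {
      span.setAttribute('enduser.id', this._userId)
    }
  }

  onEnd(_span) {}

  forceFlush() {
    return Promise.resolve()
  }

  shutdown() {
    return Promise.resolve()
  }
}
```

Spans that start before the ID arrives still go out without the attribute, so this does not cover the early document-load spans mentioned above.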

dyladan avatar Nov 16 '22 16:11 dyladan

This is how I handle it for now, but I don't think it's a good approach, and I don't know what else to do:

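import { WebTracerProvider } from '@opentelemetry/sdk-trace-web'
import { Resource } from '@opentelemetry/resources'
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions'
import { ZoneContextManager } from '@opentelemetry/context-zone'
import { registerInstrumentations } from '@opentelemetry/instrumentation'
import { XMLHttpRequestInstrumentation } from '@opentelemetry/instrumentation-xml-http-request'
import { FetchInstrumentation } from '@opentelemetry/instrumentation-fetch'

// CustomIdGenerator and getUerId are defined elsewhere in the application.
async function init() {
  const provider = new WebTracerProvider({
    idGenerator: new CustomIdGenerator(),
    resource: new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: 'app',
    }),
  })
  provider.register({
    contextManager: new ZoneContextManager(),
  })
  registerInstrumentations({
    instrumentations: [
      new XMLHttpRequestInstrumentation({}),
      new FetchInstrumentation({})
    ]
  })
  // Fetch the user information asynchronously.
  const userId = await getUerId()
  // Workaround: overwrite the provider's resource, then push the merged
  // resource into every tracer that already exists. This reassigns the
  // read-only `resource` and reads the private `_tracers` map, so it is
  // not supported by the public API.
  provider.resource = provider.resource.merge(new Resource({ userId }))
  for (const t of provider._tracers.values()) {
    t.resource = provider.resource
  }
}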

yuanman0109 avatar Nov 18 '22 06:11 yuanman0109

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

github-actions[bot] avatar Jan 23 '23 06:01 github-actions[bot]

This issue was closed because it has been stale for 14 days with no activity.

github-actions[bot] avatar Feb 13 '23 06:02 github-actions[bot]