[Feature] Support traces and logs exporting.

Open wu-sheng opened this issue 3 years ago • 2 comments

Search before asking

  • [X] I had searched in the issues and found no similar feature requirement.

Description

This feature has been requested repeatedly, but we have held it back for a long time over concerns about the performance of the upstream (the exporting target). Now that the whole OAP backend is stable from an architectural perspective, I would like to open the discussion on this feature and talk about possible solutions.

Use case

Exporting SkyWalking's traces and logs to third-party systems.

We have had the metrics exporter, https://skywalking.apache.org/docs/main/next/en/setup/backend/metrics-exporter/, for years. Users can easily build a forwarder on top of it when they need most of the metrics and on-demand queries through GraphQL don't fit their requirements.

The same applies to traces and logs. Users may have a platform that needs to analyze logs and traces for business performance analysis or AIOps, like @Superskyyy's new project (https://github.com/SkyAPM/aiops-engine-for-skywalking).

Challenges

The major challenges of this exporting are building:

  1. A scalable channel to export data.
  2. A good limiter and breaker to make sure a low-performance upstream cannot break or slow down the OAP server.
  3. A way to measure the exporting performance and speed in self-observability.
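
To make challenges 2 and 3 concrete, here is a minimal sketch of a bounded buffer combined with a circuit breaker and a few counters that self-observability could scrape. It is only an illustration under stated assumptions: `ExportGuard`, `offer`, `flush`, and the `send` callback are hypothetical names, not SkyWalking APIs, and the sketch is single-threaded for brevity.

```python
import time
from collections import deque


class ExportGuard:
    """Bounded buffer plus circuit breaker in front of a slow export target.

    Hypothetical illustration only; not part of the OAP server.
    """

    def __init__(self, send, capacity=1000, failure_threshold=3,
                 cooldown_seconds=5.0, clock=time.monotonic):
        self._send = send                  # callable that pushes one batch upstream
        self._buffer = deque()
        self._capacity = capacity
        self._failure_threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None             # None means the breaker is closed
        # Counters that a self-observability layer could expose.
        self.stats = {"accepted": 0, "dropped": 0, "exported": 0, "failed": 0}

    def offer(self, record):
        """Limiter: never block the caller, drop the record when the buffer is full."""
        if len(self._buffer) >= self._capacity:
            self.stats["dropped"] += 1
            return False
        self._buffer.append(record)
        self.stats["accepted"] += 1
        return True

    def _breaker_open(self):
        if self._opened_at is None:
            return False
        # After the cooldown, allow one trial send (half-open state).
        return self._clock() - self._opened_at < self._cooldown

    def flush(self, batch_size=100):
        """Try to export one batch; return the number of records exported."""
        if not self._buffer or self._breaker_open():
            return 0
        count = min(batch_size, len(self._buffer))
        batch = [self._buffer.popleft() for _ in range(count)]
        try:
            self._send(batch)
        except Exception:
            # Restore the batch in its original order and record the failure.
            self._buffer.extendleft(reversed(batch))
            self.stats["failed"] += 1
            self._failures += 1
            if self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            return 0
        self._failures = 0
        self._opened_at = None
        self.stats["exported"] += len(batch)
        return len(batch)
```

The design choice in this sketch is to drop data rather than apply back-pressure, so a slow upstream costs exported records instead of OAP throughput. Whether dropping is acceptable depends on the consumer.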

Milestone

I am not sure whether we should set this as a milestone for 9.3.0. I will discuss it with @wankai123.

Related issues

No response

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

wu-sheng avatar Oct 09 '22 08:10 wu-sheng

One specific challenge learned from the AIOps engine (and similar analytics projects) is that multiple consumers are needed to ingest data in a streaming fashion, but streaming RPCs are quite difficult to scale. I propose we choose Kafka or a similar message queue for this (avoiding direct coupling with the consuming project) to export large data like traces and logs, so the channel is less likely to congest and easier to scale.
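
A minimal sketch of why a partitioned log scales across consumers, assuming records are keyed by trace ID. The function names are hypothetical, and CRC32 stands in for whatever hash a real partitioner would use, so this does not reproduce Kafka's actual partition assignment.

```python
import zlib


def partition_for(trace_id: str, num_partitions: int) -> int:
    """Stable hash, so every segment of one trace lands in the same partition."""
    return zlib.crc32(trace_id.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: list) -> dict:
    """Spread partitions round-robin over the consumers of one group."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

Adding a consumer only changes the assignment, and the exporter keeps writing to the same partitions, which is the decoupling argued for above.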

Superskyyy avatar Oct 09 '22 15:10 Superskyyy

Kafka should be fine, but in general, and in theory, it is no different from cluster-supported gRPC streaming. Of course, we may consider Kafka as the default exporter channel, as it is widely used.

wu-sheng avatar Oct 09 '22 15:10 wu-sheng

@Team this looks like an interesting feature. I have some knowledge on the Kafka side and would love to contribute in any way possible.

mohammedtabish0 avatar Oct 21 '22 19:10 mohammedtabish0

This is done in https://github.com/apache/skywalking/pull/9817

kezhenxu94 avatar Oct 21 '22 23:10 kezhenxu94