
[Draft] Add perfetto style trace generation

Open UnamedRus opened this issue 5 months ago • 3 comments

Perfetto is an open-source suite of SDKs, daemons and tools which use tracing to help developers understand the behaviour of complex systems and root-cause functional and performance issues on client / embedded systems.

It was suggested by good people that it's one of the very few decent trace viewers: https://ui.perfetto.dev/

2 commands added:

  1. Generate Perfetto .pb file
  2. Open in Perfetto (starts an HTTP server that serves a simple page and the trace.pb file; the page implements deep linking to ui.perfetto.dev)
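
A minimal Python sketch of the "Open in Perfetto" idea, not the code from this draft. It swaps in the simpler `url=` form of deep linking (the UI fetches the trace itself) instead of the intermediate page described above. It assumes that ui.perfetto.dev will only fetch traces from `http://127.0.0.1:9001` and that the local server must send a CORS header for the UI origin. The names `deep_link`, `TraceHandler`, `make_server` and `open_in_perfetto` are hypothetical.

```python
import webbrowser
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

PERFETTO_ORIGIN = "https://ui.perfetto.dev"
# Assumption: the Perfetto UI only allows fetching traces from this local port.
PORT = 9001


def deep_link(file_name: str, port: int = PORT) -> str:
    """Build a ui.perfetto.dev URL that asks the UI to fetch the trace over HTTP."""
    return f"{PERFETTO_ORIGIN}/#!/?url=http://127.0.0.1:{port}/{file_name}"


class TraceHandler(SimpleHTTPRequestHandler):
    """Serves files from a directory and lets the Perfetto UI origin read them."""

    def end_headers(self) -> None:
        # Without this header the browser blocks the cross-origin fetch.
        self.send_header("Access-Control-Allow-Origin", PERFETTO_ORIGIN)
        self.send_header("Cache-Control", "no-cache")
        super().end_headers()


def make_server(directory: str, port: int = PORT) -> ThreadingHTTPServer:
    """Create (but do not start) a localhost-only server for `directory`."""
    handler = partial(TraceHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def open_in_perfetto(directory: str, file_name: str) -> None:
    """Open the browser on the deep link and serve until interrupted."""
    server = make_server(directory)
    webbrowser.open(deep_link(file_name))
    server.serve_forever()
```

Calling `open_in_perfetto(".", "trace.pb")` would serve the current directory and open the UI. Binding to 127.0.0.1 only keeps the trace off the network.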

Doesn't work

  1. It seems events are synchronous, so no other events are processed during trace generation (and it takes a while).
  2. Big traces (>200MB-400MB) are better processed by a TraceProcessor server than by the WASM engine in ui.perfetto.dev: https://perfetto.dev/docs/visualization/large-traces
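
A rough Python sketch of how the second point could be handled: pick the native trace processor once the trace exceeds a size limit. The threshold is an assumption taken from the lower end of the range above, `plan_open` is a hypothetical helper, and the binary is assumed to be on `PATH` as `trace_processor` (its `--httpd` mode is what the linked large-traces page describes).

```python
from typing import List

# Assumption: lower end of the 200MB-400MB range mentioned above.
WASM_LIMIT_BYTES = 200 * 1024 * 1024


def plan_open(trace_path: str, size_bytes: int) -> List[str]:
    """Return the argv for a native trace processor server, or [] if WASM is enough."""
    if size_bytes > WASM_LIMIT_BYTES:
        # The UI can then connect to the local server instead of loading the file itself.
        return ["trace_processor", "--httpd", trace_path]
    return []
```

The returned list can be passed to `subprocess.Popen` as-is. An empty list means the trace is small enough to open in the browser directly.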

Problems with the generated trace/Perfetto:

  1. Can't filter counters by categories
  2. A particular remote processor event doesn't know which node it consumed data from https://github.com/ClickHouse/ClickHouse/issues/77375 https://github.com/ClickHouse/ClickHouse/issues/77395
  3. clock_sync_failure during import of stack_traces https://github.com/ClickHouse/ClickHouse/issues/78234
  4. No memory tracing
  5. No flows out of system.processors_profile_log, and flows will not work anyway with the number of spans produced when opentelemetry_trace_processors=1 is used
  6. Use of threads vs processors for query threads.
  7. Better deduplication for internedData
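
For the last point, a small Python sketch of the general interning technique: each distinct string gets a small integer id once, and only entries not yet written need to be emitted. This is an illustration, not the draft's implementation. `Interner` and its methods are hypothetical names, and starting ids at 1 assumes that 0 is treated as "unset".

```python
from typing import Dict, List, Tuple


class Interner:
    """Assigns a stable small integer id (iid) to each distinct string."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._pending: List[Tuple[int, str]] = []

    def intern(self, value: str) -> int:
        """Return the iid for `value`, allocating one on first sight."""
        iid = self._ids.get(value)
        if iid is None:
            iid = len(self._ids) + 1  # Assumption: iid 0 means "unset", so start at 1.
            self._ids[value] = iid
            self._pending.append((iid, value))
        return iid

    def take_new_entries(self) -> List[Tuple[int, str]]:
        """Return entries not yet written to the trace, then forget them."""
        new, self._pending = self._pending, []
        return new
```

A writer would call `intern()` for every name it references and flush `take_new_entries()` into the interned-data section before the packet that uses those ids.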

UnamedRus, Aug 03 '25 22:08

@UnamedRus This looks very promising! Please ping me once you finish, and I will start the review.

A couple of thoughts on the current draft after a brief look:

  • It is OK for now to execute the query again, but I am not sure I like it (no other action does this, though that is another story, since this one requires a lot of info); let's at least make this explicit in the action's name
  • I see the reason for having a separate action to store the profile data on disk; let's keep it for now, but I am not sure this is the way to go
  • You may rely on symbols/lines, as we do for system.trace_log, instead of using trace
  • Can we render stacktraces for ProfileEvents changes in UI?
  • I guess we will also need to tune the UI to make it even better!

It seems events are synchronous, so no other events are processed during trace generation (and it takes a while).

Right now it is true
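
One common way around this is to move the slow work to a worker thread and let the event loop poll for the result. Here is a generic Python sketch of that pattern, not chdig's actual event handling. `start_in_background` and `poll` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Optional


def start_in_background(job: Callable[[], bytes]) -> "queue.Queue[bytes]":
    """Run `job` on a worker thread; its result arrives on the returned queue."""
    results: "queue.Queue[bytes]" = queue.Queue(maxsize=1)
    threading.Thread(target=lambda: results.put(job()), daemon=True).start()
    return results


def poll(results: "queue.Queue[bytes]") -> Optional[bytes]:
    """Non-blocking check that an event loop can call on every tick."""
    try:
        return results.get_nowait()
    except queue.Empty:
        return None
```

The event loop keeps handling other events and calls `poll()` until it returns the generated trace bytes.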

azat, Aug 04 '25 09:08

pull_request / Spell Check with Typos (pull_request): Failing after 8s

typo in proto definition.

But it should kind of work now. Trace size is still an issue, as traces over 500MB don't work in the browser and need to be processed with an extra tool, trace_processor, which runs as a server (docs) that the UI connects to.

Can we render stacktraces for ProfileEvents changes in UI?

Does it have much value? StreamingStackTraces belong to a particular track, or even a thread_id, and I haven't yet figured out a nice way to show multiple types of them per thread, like Real/CPU.

UnamedRus, Aug 31 '25 22:08

typo in proto definition.

Let's add them to the ignore list

Does it have much value?

It depends on which events happened. I guess it can be useful, but I am not sure; I need to play with it. Once it is ready from your side, ping me and I will start looking into it.

azat, Sep 01 '25 09:09