martin icon indicating copy to clipboard operation
martin copied to clipboard

feat: migrate logging `log` -> `tracing`

Open CommanderStorm opened this issue 1 year ago • 10 comments

Lets split this up into two PRs

  • one changing the interface, tests and dependencys
  • one adding additional configuration options to martin AND documenting them

Alternatively, we can keep this PR open and I am going to get back to this sometime on the weekend

Resolves #836

we should continue using the RUST_LOG env var for all the configuration, with the possible addition of CLI and Config params

  • [x] RUST_LOG
  • [x] custom env variables MARTIN_LOG_FORMAT, MARTIN_LOG_LEVEL, ...
  • [x] deduplicating code
  • [x] CLI
  • [x] Config
  • [x] Writing docs (🦖)

CommanderStorm avatar Mar 04 '25 22:03 CommanderStorm

So I ran the performance benchmarks.. Seems like there is a supprising performance difference..

Will investigate where this is coming from.. I have some theories, such as my terminal not being the best at printing formatted output and such => will test them and report back and update the table below

experiment
baseline/log + with terminal output via ssh image image image
log + no terminal output image image image
tracing + with terminal output via ssh image image image
tracing + no terminal output image image image
tracing+ with terminal output via ssh + MARTIN_LOG_FORMAT=json image image image
tracing+ with terminal output via ssh + NO request logging via actix-tracing, but via actix-middleware + MARTIN_LOG_FORMAT=json image image image
tracing+ with terminal output via ssh + request logging via actix-tracing, but NOT via actix-middleware + MARTIN_LOG_FORMAT=json image image image
tracing + with terminal output via ssh + request logging via actix-tracing, but NOT via actix-middleware + MARTIN_LOG_FORMAT=json + interest cache in tracing_log image image image
tracing + no terminal output + request logging via actix-tracing, but NOT via actix-middleware + interest cache in tracing_log image image image

CommanderStorm avatar Mar 08 '25 13:03 CommanderStorm

Okay.. So I got performance up there again. Was a good call to do benchmarks 😅

I think there are severe performance issues with my ssh-terminal and rendering special characters => I am going to discount the initial and only consider MARTIN_LOG_FORMAT=json I don't think on a server, this is reasonably a problem.

The rqs went down on some of these, but 170k -> 130k is likely not something that users will see in reality. Will look further into what is causing the slowdown there

CommanderStorm avatar Mar 08 '25 15:03 CommanderStorm

Okay.. so apparently this gets logged..

image

https://docs.rs/tokio-postgres/latest/src/tokio_postgres/connection.rs.html#338

This is because we ask postgres do do so..

https://github.com/maplibre/martin/blob/31fc566d586a773d43a8ebb54bd3a0af89cf3519/tests/fixtures/functions/function_zxy_query.sql#L7

CommanderStorm avatar Mar 08 '25 17:03 CommanderStorm

So apparently some of our dependencys log via tracing (switch on the trace level to see it.. that is a lot). There is the expected change from not having so much semi-const stuff.. I think the ~relatively small change is fine..

=> I think performance is not a bottleneck anymore. Will investigate configurability next

CommanderStorm avatar Mar 08 '25 17:03 CommanderStorm

WOW! This is a very thorough job you did there! Thank you!

Ideally, we should continue using the RUST_LOG env var for all the configuration, with the possible addition of CLI and Config params. If we need more values like output format (vs output filtering), that may go into separate params too.

When doing performance, stdout + term + ssh is almost always a bad choice - there might be locking and syscall issues affecting it. Best to redirect it to /dev/null to avoid additional uncertainty, or perhaps dump to a file (but that might also be less than stable).

nyurik avatar Mar 08 '25 18:03 nyurik

No clue where CI and my local device is different. Will investigate further in the last part of the week.

image image

CommanderStorm avatar Mar 10 '25 11:03 CommanderStorm

@nyurik how would you want me to split this up? I don't think this is currently a reviewable amount.. How about martin-observability-utils and the rest? (or strictly on the crate boundary)

CommanderStorm avatar Mar 10 '25 11:03 CommanderStorm

Good question, I wish i knew the best way to proceed. Thinking out loud:

  • if perf is similar, and all existing functionality is preserved, we can introduce some "log format" param into the ecosystem
  • the monitoring utility is clearly a separate PR, perhaps martin-monitor / martin-watch / martin-log / martin-* -- my guess is that this should be a purely CLI tool for any kind of debugging, perhaps with a nice UI with Ratatui? The only uncertainty is what happens when multiple martin instances are running?

What do you think?

nyurik avatar Mar 11 '25 04:03 nyurik

  • the monitoring utility is clearly a separate PR

I think there is some misunderstanding. I am talking about the code pulled from the three crates to configure tracing.

There is no TUI involved and adding that would involve quite a bit more code than I think we should add for now (bigger issues and stuff ^^).

About naming: The name of the crate was to enable tracing/metrics to be outsourced there too and the utils part is because the other -utils crate exists.

Thinking about it: In the other crates in this repo, tracing/metrics are not relevant as they are CLI tools. This is only relevant for martin => extracting this to martin-observability-utils is likely a bad choice.. Likely martin-tracing is a better choice, or should I stick with martin-tracing-utils?

CommanderStorm avatar Mar 11 '25 11:03 CommanderStorm

Above I was talking about making this PR reviewable by splitting it up into smaller PRs. Currently this is a +900 -300 lines chonker of a PR (even though a large part of this is metadata and trivial changes..). There is still a lot of implementation.

I will split this up into:

  • one PR laying the basic groundwork (martin-tracing/martin-tracing-utils) => looking over if the API/naming I came up with is okay..
  • one PR integrating it on a basic level (rip out env-logger, change request logging, add CLI/cofig file option)
  • one PR doing the rest (rip out log, ..) <-- adding docs pushed to here as this is would be nice to release soon-ish after documenting

Alternative would be I will split this up into:

  • one PR laying the basic groundwork (martin-tracing/martin-tracing-utils)
  • one PR integrating this into martin/martin-cp
  • one PR integrating this into mbtiles

CommanderStorm avatar Mar 11 '25 12:03 CommanderStorm

there were some merge-issues and I think it is simplest to just redo the work I have done in this PR, if we get https://github.com/maplibre/martin/pull/1747 merged.

CommanderStorm avatar Jul 10 '25 20:07 CommanderStorm