martin feat: migrate logging `log` -> `tracing`

Lets split this up into two PRs

one changing the interface, tests and dependencys
one adding additional configuration options to martin AND documenting them

Alternatively, we can keep this PR open and I am going to get back to this sometime on the weekend

Resolves #836

we should continue using the RUST_LOG env var for all the configuration, with the possible addition of CLI and Config params

[x] RUST_LOG
[x] custom env variables MARTIN_LOG_FORMAT, MARTIN_LOG_LEVEL, ...
[x] deduplicating code
[x] CLI
[x] Config
[x] Writing docs (🦖)

Mar 04 '25 22:03 CommanderStorm

So I ran the performance benchmarks.. Seems like there is a supprising performance difference..

Will investigate where this is coming from.. I have some theories, such as my terminal not being the best at printing formatted output and such => will test them and report back and update the table below

experiment
baseline/`log` + with terminal output via ssh
`log` + no terminal output
`tracing` + with terminal output via ssh
`tracing` + no terminal output
`tracing`+ with terminal output via ssh + `MARTIN_LOG_FORMAT=json`
`tracing`+ with terminal output via ssh + NO request logging via actix-tracing, but via actix-middleware + `MARTIN_LOG_FORMAT=json`
`tracing`+ with terminal output via ssh + request logging via actix-tracing, but NOT via actix-middleware + `MARTIN_LOG_FORMAT=json`
`tracing` + with terminal output via ssh + request logging via actix-tracing, but NOT via actix-middleware + `MARTIN_LOG_FORMAT=json` + interest cache in `tracing_log`
`tracing` + no terminal output + request logging via actix-tracing, but NOT via actix-middleware + interest cache in `tracing_log`

Mar 08 '25 13:03 CommanderStorm

Okay.. So I got performance up there again. Was a good call to do benchmarks 😅

I think there are severe performance issues with my ssh-terminal and rendering special characters => I am going to discount the initial and only consider MARTIN_LOG_FORMAT=json I don't think on a server, this is reasonably a problem.

The rqs went down on some of these, but 170k -> 130k is likely not something that users will see in reality. Will look further into what is causing the slowdown there

Mar 08 '25 15:03 CommanderStorm

Okay.. so apparently this gets logged..

https://docs.rs/tokio-postgres/latest/src/tokio_postgres/connection.rs.html#338

This is because we ask postgres do do so..

https://github.com/maplibre/martin/blob/31fc566d586a773d43a8ebb54bd3a0af89cf3519/tests/fixtures/functions/function_zxy_query.sql#L7

Mar 08 '25 17:03 CommanderStorm

So apparently some of our dependencys log via tracing (switch on the trace level to see it.. that is a lot). There is the expected change from not having so much semi-const stuff.. I think the ~relatively small change is fine..

=> I think performance is not a bottleneck anymore. Will investigate configurability next

Mar 08 '25 17:03 CommanderStorm

WOW! This is a very thorough job you did there! Thank you!

Ideally, we should continue using the RUST_LOG env var for all the configuration, with the possible addition of CLI and Config params. If we need more values like output format (vs output filtering), that may go into separate params too.

When doing performance, stdout + term + ssh is almost always a bad choice - there might be locking and syscall issues affecting it. Best to redirect it to /dev/null to avoid additional uncertainty, or perhaps dump to a file (but that might also be less than stable).

Mar 08 '25 18:03 nyurik

No clue where CI and my local device is different. Will investigate further in the last part of the week.

Mar 10 '25 11:03 CommanderStorm

@nyurik how would you want me to split this up? I don't think this is currently a reviewable amount.. How about martin-observability-utils and the rest? (or strictly on the crate boundary)

Mar 10 '25 11:03 CommanderStorm

Good question, I wish i knew the best way to proceed. Thinking out loud:

if perf is similar, and all existing functionality is preserved, we can introduce some "log format" param into the ecosystem
the monitoring utility is clearly a separate PR, perhaps martin-monitor / martin-watch / martin-log / martin-* -- my guess is that this should be a purely CLI tool for any kind of debugging, perhaps with a nice UI with Ratatui? The only uncertainty is what happens when multiple martin instances are running?

What do you think?

Mar 11 '25 04:03 nyurik

the monitoring utility is clearly a separate PR

I think there is some misunderstanding. I am talking about the code pulled from the three crates to configure tracing.

There is no TUI involved and adding that would involve quite a bit more code than I think we should add for now (bigger issues and stuff ^^).

About naming: The name of the crate was to enable tracing/metrics to be outsourced there too and the utils part is because the other -utils crate exists.

Thinking about it: In the other crates in this repo, tracing/metrics are not relevant as they are CLI tools. This is only relevant for martin => extracting this to martin-observability-utils is likely a bad choice.. Likely martin-tracing is a better choice, or should I stick with martin-tracing-utils?

Mar 11 '25 11:03 CommanderStorm

Above I was talking about making this PR reviewable by splitting it up into smaller PRs. Currently this is a +900 -300 lines chonker of a PR (even though a large part of this is metadata and trivial changes..). There is still a lot of implementation.

I will split this up into:

one PR laying the basic groundwork (martin-tracing/martin-tracing-utils) => looking over if the API/naming I came up with is okay..
one PR integrating it on a basic level (rip out env-logger, change request logging, add CLI/cofig file option)
one PR doing the rest (rip out log, ..) <-- adding docs pushed to here as this is would be nice to release soon-ish after documenting

Alternative would be I will split this up into:

one PR laying the basic groundwork (martin-tracing/martin-tracing-utils)
one PR integrating this into martin/martin-cp
one PR integrating this into mbtiles

Mar 11 '25 12:03 CommanderStorm

there were some merge-issues and I think it is simplest to just redo the work I have done in this PR, if we get https://github.com/maplibre/martin/pull/1747 merged.

Jul 10 '25 20:07 CommanderStorm