tracing icon indicating copy to clipboard operation
tracing copied to clipboard

Level filtering is broken when depending on tracing-subscriber from crates.io

Open ConradIrwin opened this issue 10 months ago • 10 comments

Bug Report

Version

mixed

Platform

macOS

Description

  1. Depend on tracing-subscriber = { version = 0.3.18, features = ["env-filter", "json", "registry", "tracing-log"] } from crates.io

  2. Set an env filter to a string like warn,collab::rpc=info

  3. Notice that output from tracing::error! inside collab::rpc with level ERROR is not printed to the console.

  4. Replace tracing-subscriber = { git = "https://github.com/tokio-rs/tracing", rev = "tracing-subscriber-0.3.18", features = ["env-filter", "json", "registry", "tracing-log"] }

  5. Notice that output from tracing::error! inside collab::rpc with level ERROR is printed to the console

We dug into this a little bit and noticed that the published version of tracing-subscriber depends on tracing-core v0.1.30 but when we depend on the git repository directly we get a version of tracing-core that's just a bit newer than v0.1.32. It seems to be the case that when the versions mismatch like this the filtering logic is sometimes wrong.

ConradIrwin avatar Apr 04 '24 19:04 ConradIrwin

Can you replicate this if you set a dependency on tracing-core of 0.1.32 in your Cargo.toml? Is there a branch where I can try and replicate this?

(I'm guessing is this in Zed: is it https://github.com/zed-industries/zed/pull/10085?)

davidbarsky avatar Apr 04 '24 19:04 davidbarsky

That does not seem to fix the problem, so I guess the difference is in the commits between 0.1.32 and the tracing-subscriber release.

You can test here: https://github.com/zed-industries/zed/commit/d56730b82b88025eb7fe81736ec339c1494eafa4 vs https://github.com/zed-industries/zed/commit/d1c6b4acfe5d4a556eda38a9f20ac36ca78bfc9a

By running:

RUST_LOG=warn,collab::rpc=info cargo run --package=collab serve

(Though you'll need to set up the local collaboration dev dependencies: https://zed.dev/docs/local-collaboration)

ConradIrwin avatar Apr 04 '24 19:04 ConradIrwin

Well, this is extremely strange; the 0.3.18 release on crates.io should be exactly the same as the tagged commit for tracing-subscriber-0.3.18...the only difference I can think of is the path dep on tracing-core that you get with the repo version, but I think the explicit tracing dep will still pull the release version of tracing and tracing-core.

Can you run cargo tree -i -p tracing-core on the working and non-working commits?

hawkw avatar Apr 04 '24 20:04 hawkw

On the version I've marked as broken, the output is here: https://gist.github.com/ConradIrwin/dc0103d52e390eae9ebacc87c311cdb4.

On the commit I've tagged as working, I get:

    Updating git repository `https://github.com/tokio-rs/tracing`
error: There are multiple `tracing-core` packages in your project, and the specification `tracing-core` is ambiguous.
Please re-run this command with one of the following specifications:
  registry+https://github.com/rust-lang/crates.io-index#[email protected]
  git+https://github.com/tokio-rs/tracing?rev=tracing-subscriber-0.3.18#[email protected]

So I guess both are being included, but as there are commits in the git repo's tracing-core directory at tag tracing-subscriber-0.3.18 that are not released by tracing-core-0.1.32 this is not actually running the same code as the released crate.

I haven't yet tried bisecting these commits, but a quick scan of the messages reveals no obvious culprits. It's also worth noting that the tracing-subscriber v0.3.18 depends on tracing-core v0.1.30 which was not the latest release at the time it was made, though that doesn't seem to be the cause of this bug, it also seems like a potential issue.

This could also be a bug in any of the other tracing-* crates; because they are all released on different schedules, the versions on crates.io will rarely correspond to a state of the git repository that ever existed.


In terms of narrowing the bug down:

  • It seems like this only happens when you have two different levels in the env filter warn,rpc::collab=info is broken, but warn,rpc::collab=warn is not.
  • We also observed this before we upgraded to the latest releases (which we did to try and solve this problem). The same issue happens on current zed main (https://github.com/zed-industries/zed/commit/0f1c2e6f2b35f471b5f0079b0114db75d40c0069), which is requesting tracing = "0.1.34", tracing-subscriber = { version = "0.3.11", features = ["env-filter", "json", "registry", "tracing-log"] } and resolving to v0.1.31 and v0.3.17 respectively.

ConradIrwin avatar Apr 05 '24 01:04 ConradIrwin

Looking into this more, it seems like there's some kind of unexpected interaction with sqlx. I suspect that the reason it "works" when we include the git version is because sqlx is still depending on the non-git version and so shared state is not mutated.

From the broken commit on zed above, I narrowed down the test case, and it seems that if I don't make any SQL queries between init'ing the tracing subscriber and logging, then the log levels are preserved.

I also observe that I can control logging of collab::rpc's error logs by changing the filter on the sqlx package:

RUST_LOG=warn,sqlx=info,collab::rpc=info cargo run --package=collab serve # shows error logs in collab::rpc
RUST_LOG=warn,sqlx=warn,collab::rpc=info cargo run --package=collab serve # does not show error logs in collab::rpc

It also seems that if I build zed's collab using sqlite instead of Postgres, that fixes the problem.

ConradIrwin avatar Apr 05 '24 02:04 ConradIrwin

Ok, zooming in more; the problem is this call in the sqlx-postgres crate: https://github.com/launchbadge/sqlx/blob/635dba5b2682033101a1271e9fb4bf2516c0b840/sqlx-postgres/src/connection/stream.rs#L154-L156

A minimal reproduction of this bug is https://github.com/ConradIrwin/tracing-bug/commit/00a1b1492a6435ebea70eac405270aa5163118d1#diff-42cb6807ad74b3e201c5a7ca98b911c5fa08380e942be6e4ac5807f8377f87fc

ConradIrwin avatar Apr 05 '24 02:04 ConradIrwin

Oh, thank you for the minimal reproduction—that is a bit odd. I bet we're doing something silly when interacting with log.

davidbarsky avatar Apr 05 '24 15:04 davidbarsky

Hi ! We're also hitting a similar behavior. The bug doesn't seem to be due to sqlx, but rather to tracing-subscriber. Has it been reported upstream ? I didn't find an existing issue to follow.

vsiles avatar Sep 05 '24 06:09 vsiles

I didn't get much further in tracking it down. There's a reproduction here: https://github.com/tokio-rs/tracing/issues/2927#issuecomment-2038710272. I don't think the string sqlx is necessary, any string will do, but the next step is to try and figure out why it's broken.

ConradIrwin avatar Sep 05 '24 14:09 ConradIrwin

Hi ! We're also hitting a similar behavior. The bug doesn't seem to be due to sqlx, but rather to tracing-subscriber. Has it been reported upstream ? I didn't find an existing issue to follow.

This is the tracking issue for now.

I found this is because when using log::log_enabled! we directly call enabled on the current subscriber: https://github.com/tokio-rs/tracing/blob/6d00d7d9f72dc6797138a1062bc33073afbad5a1/tracing-log/src/log_tracer.rs#L193

The problem is enabled takes the supplied level and also stores the result of the filter in the FilterState, so it gets cached. For example this example:

let filter = tracing_subscriber::filter::EnvFilter::from_str("warn,collab=info").unwrap();

tracing_subscriber::registry()
    .with(Box::new(
        tracing_subscriber::fmt::layer().with_filter(filter),
    ))
    .init();

let _ = log::log_enabled!(
    target: "sqlx",
    log::Level::Info
);

tracing::warn!("1 WARN");
tracing::warn!("2 WARN");

outputs:

2024-09-09T12:45:10.411840Z  WARN log: 2 WARN

So basically log::log_enabled! (and tracing::enabled!) calls this function which caches the filter's result: https://github.com/tokio-rs/tracing/blob/6d00d7d9f72dc6797138a1062bc33073afbad5a1/tracing-subscriber/src/filter/layer_filters/mod.rs#L751-L776 Then the following events (warn! calls) call events_enabled, which check the cached value before (sometimes) calculating it themselves: https://github.com/tokio-rs/tracing/blob/6d00d7d9f72dc6797138a1062bc33073afbad5a1/tracing-subscriber/src/filter/layer_filters/mod.rs#L806-L820

Now I haven't found yet (and I don't understand correctly how) we clear the FilterMap, because after the first event, the cached values are gone and everything works as expected.

This does not happen if instead of log::log_enabled! we use tracing::enabled!, because there we build a callsite and check the interest of the current subscriber. The interest of the subscriber will always be never because Info is lower than our current minimum level Warn.

kaffarell avatar Sep 09 '24 13:09 kaffarell