inferno icon indicating copy to clipboard operation
inferno copied to clipboard

Single stack detection can be wrong if the event contains multiple colon

Open YangKeao opened this issue 1 year ago • 3 comments

I'm profiling runtime.newobject in a program written in golang, and the event created by uprobe is runtime:newobject. An example of the perf script content is:

tidb-server 123720 [000] 11736.070036: runtime:newobject: (1843c20)
                 1443c20 runtime.newobject+0x0 (/home/yangkeao/.tiup/components/tidb/v7.1.0-alpha-nightly-20230326/tidb-server)
                 19da5cc golang.org/x/net/http2.(*Framer).ReadFrame+0x1ac (/home/yangkeao/.tiup/components/tidb/v7.1.0-alpha-nightly-20230326/tidb-server)
                 1a16057 google.golang.org/grpc/internal/transport.(*http2Client).reader+0x257 (/home/yangkeao/.tiup/components/tidb/v7.1.0-alpha-nightly-20230326/tidb-server)
                 1a0b82a google.golang.org/grpc/internal/transport.newHTTP2Client.func11+0x2a (/home/yangkeao/.tiup/components/tidb/v7.1.0-alpha-nightly-20230326/tidb-server)
                 14a9661 runtime.goexit.abi0+0x1 (/home/yangkeao/.tiup/components/tidb/v7.1.0-alpha-nightly-20230326/tidb-server)

The inferno will treat the first line as a single stack event, and tell the (1843c20) and all following lines are weird event line.

YangKeao avatar Mar 27 '23 08:03 YangKeao

Oh interesting! I think this suggests that we actually need to split by whitespace, not :, to get the event in the logic here: https://github.com/jonhoo/inferno/blob/6326ced534f8dad96cf882c939171c101cca3e33/src/collapse/perf.rs#L377-L380

Would you be interested in submitting a PR for this?

jonhoo avatar Apr 02 '23 19:04 jonhoo

I took a look at this but didn't see a great way of distinguishing this case from the single stack cases in https://github.com/jonhoo/inferno/blob/6326ced534f8dad96cf882c939171c101cca3e33/src/collapse/perf.rs#L402-L403 without either

  • trying to parse the rest of the line as a stack line best-effort and allowing that to fail (which doesn't seem great if it is a legitimate single-stack line)
  • trying to lookahead to see if the next line is a regular stack line or an event line (which is maybe the best shot but would require some restructuring)

Any idea if I'm missing something, and if not, any opinion on which approach to take?

jacksonriley avatar May 01 '23 18:05 jacksonriley

Hmm, that's a tricky one. After some digging, how about something like this to replace the first part of on_event_line (immediately inside the first if let Some)?

// perf script does not produce easy-to-parse lines.
// In particular, lots of fields can contain characters we might consider delimiters.
// We know that every line has at least one : between the process/thread info and the event
// info. The process/thread info _usually_ doesn't have : in it (technically the program name
// _can_, though we ignore that possibility for now (TODO)). So here we extract the event info:
let after_first_colon = line[end..].split_once(':').map(|(_, rest)| rest);
// The event itself has many space-separated fields. We want the event type (e.g., "cycles" or
// function name), which always ends in a :, so find the first field for which that's the case:
let event_with_mods = after_first_colon.and_then(|s| {
    let s = s.trim();
    let event_with_mods = s.split_whitespace().find(|field| field.ends_with(':'))?;

    // Would be nice to use https://github.com/rust-lang/rust/issues/77998
    // This find is unlikely to have false positives since the event name is usually only
    // preceeded by a numeric field.
    let rest = s.find(event_with_mods).expect("we found it above") + event_with_mods.len();
    let rest = s[rest..].trim();

    let event_with_mods = event_with_mods
        .strip_suffix(':')
        .expect("known to end in :");

    Some((event_with_mods, rest))
});
// The event type can itself contain :, but it also has an optional "modifier" list at the end.
// That modifier list can only consist of [ukhIGHpPSDWe], where only p may appear more than
// once (see perf help list).
//
// We want to strip the modifiers at the end if they're there, and keep any other colons
// intact.
fn is_likely_event_modifier(s: &str) -> bool {
    if !s.is_ascii() {
        return false;
    }
    if s.len() > 11 + 3 {
        // Longer than all modifiers + 3 ps!
        return false;
    }

    // Make sure only valid modifiers appear, and only in allowed quantities.
    const MOD_u: usize = 1 << 0;
    const MOD_k: usize = 1 << 1;
    const MOD_h: usize = 1 << 2;
    const MOD_I: usize = 1 << 3;
    const MOD_G: usize = 1 << 4;
    const MOD_H: usize = 1 << 5;
    const MOD_P: usize = 1 << 6;
    const MOD_S: usize = 1 << 7;
    const MOD_D: usize = 1 << 8;
    const MOD_W: usize = 1 << 9;
    const MOD_e: usize = 1 << 10;
    let mut bitset = 0;
    let mut ps = 0;
    for c in s.bytes() {
        match c {
            b'u' if bitset & MOD_u == 0 => bitset |= MOD_u,
            b'k' if bitset & MOD_k == 0 => bitset |= MOD_k,
            b'h' if bitset & MOD_h == 0 => bitset |= MOD_h,
            b'I' if bitset & MOD_I == 0 => bitset |= MOD_I,
            b'G' if bitset & MOD_G == 0 => bitset |= MOD_G,
            b'H' if bitset & MOD_H == 0 => bitset |= MOD_H,
            b'P' if bitset & MOD_P == 0 => bitset |= MOD_P,
            b'S' if bitset & MOD_S == 0 => bitset |= MOD_S,
            b'D' if bitset & MOD_D == 0 => bitset |= MOD_D,
            b'W' if bitset & MOD_W == 0 => bitset |= MOD_W,
            b'e' if bitset & MOD_e == 0 => bitset |= MOD_e,
            b'p' if ps < 3 => ps += 1,
            _ => {
                return false;
            }
        }
    }
    true
}
let event = event_with_mods.map(|(s, rest)| match s.rsplit_once(':') {
    None => (s, rest),
    Some((pre, post)) if is_likely_event_modifier(post) => (pre, rest),
    Some(_) => (s, rest),
});
// if event.1 is non-empty, then we have post-event

jonhoo avatar Aug 06 '23 10:08 jonhoo