unable to collapse generated profile data: Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }

Open jjyr opened this issue 5 years ago • 6 comments

cargo flamegraph randomly panics on my Mac:

thread 'main' panicked at 'unable to collapse generated profile data: Custom { kind: InvalidData, error: StringError("stream did not contain valid UTF-8") }', src/libcore/result.rs:997:5
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
   1: std::sys_common::backtrace::_print
   2: std::panicking::default_hook::{{closure}}
   3: std::panicking::default_hook
   4: std::panicking::rust_panic_with_hook
   5: std::panicking::continue_panic_fmt
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::result::unwrap_failed
   9: flamegraph::generate_flamegraph_by_running_command
  10: cargo_flamegraph::main
  11: std::rt::lang_start::{{closure}}
  12: std::panicking::try::do_call
  13: __rust_maybe_catch_panic
  14: std::rt::lang_start_internal
  15: main

jjyr avatar Apr 25 '19 03:04 jjyr

This is happening on mine too, on every run:

thread 'main' panicked at 'unable to collapse generated profile data: Custom { kind: InvalidData, error: "stream did not contain valid UTF-8" }', /Users/logan/.cargo/registry/src/github.com-1ecc6299db9ec823/flamegraph-0.3.0/src/lib.rs:236:5
stack backtrace:
   0: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
   1: core::fmt::write
   2: std::io::Write::write_fmt
   3: std::panicking::default_hook::{{closure}}
   4: std::panicking::default_hook
   5: std::panicking::rust_panic_with_hook
   6: rust_begin_unwind
   7: core::panicking::panic_fmt
   8: core::option::expect_none_failed
   9: flamegraph::generate_flamegraph_for_workload
  10: cargo_flamegraph::main
  11: std::rt::lang_start::{{closure}}
  12: std::rt::lang_start_internal
  13: main

ghost avatar Jun 20 '20 00:06 ghost

I hit this too. I'm not sure yet what's causing it, but here are some notes:

First, I commented out the line that deletes the raw dtrace dump:

@@ -163,11 +166,11 @@ mod arch {
              output file cargo-flamegraph.stacks",
         );
 
-        std::fs::remove_file("cargo-flamegraph.stacks")
-            .expect(
-                "unable to remove cargo-flamegraph.stacks \
-                 temporary file",
-            );
+        //std::fs::remove_file("cargo-flamegraph.stacks")
+        //    .expect(
+        //        "unable to remove cargo-flamegraph.stacks \
+        //         temporary file",
+        //    );
 
         buf
     }

I tried specifying the encoding to be utf-8 in case that wasn't the default, but still get the same error sometimes.

Looking at the file, it mostly looks ok. How to track down the offending non-utf8 chars? Maybe roundtrip through iconv?

$ iconv -f utf-8 -t utf-8 < cargo-flamegraph.stacks > cargo-flamegraph.stacks.reencoded
iconv: (stdin):154433:1054: cannot convert

Ok - so maybe something is wrong on line 154433, let's take a look:

$ head -n 154433 < cargo-flamegraph.stacks | tail -n 1 > cargo-flamegraph-offending-line.stacks

The offending line looks something like this (though obviously I can't paste invalid unicode into GH's text area; here it is zipped if you're feeling brave: [offending_line.stacks.zip](https://github.com/flamegraph-rs/flamegraph/files/5320626/offending_line.stacks.zip)):

libobjc.A.dylib`bool objc::DenseMapBase<objc::DenseMap<DisguisedPtr<objc_object>, objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> >, objc::DenseMapValueInfo<objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> > >, objc::DenseMapInfo<DisguisedPtr<objc_object> >, objc::detail::DenseMapPair<DisguisedPtr<objc_object>, objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> > > >, DisguisedPtr<objc_object>, objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> >, objc::D+0x64
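
For anyone who would rather not go through iconv plus head/tail, the same lookup can be sketched in Rust. This is only a sketch: `first_invalid_utf8` is a hypothetical helper (not something in flamegraph), and the sample bytes are a made-up stand-in for cargo-flamegraph.stacks. It relies on `std::str::from_utf8` and `Utf8Error::valid_up_to` to find the offset of the first bad byte, then turns that offset into a line and byte column like the ones iconv reports.

```rust
use std::str;

/// Hypothetical helper: returns the 1-based (line, byte column) of the first
/// byte that is not valid UTF-8, or None if the whole buffer decodes cleanly.
fn first_invalid_utf8(bytes: &[u8]) -> Option<(usize, usize)> {
    // from_utf8 fails at the first invalid sequence; valid_up_to is its offset.
    let err = str::from_utf8(bytes).err()?;
    let offset = err.valid_up_to();
    let prefix = &bytes[..offset];
    // Line number = newlines before the bad byte, plus one.
    let line = prefix.iter().filter(|&&b| b == b'\n').count() + 1;
    // Start of the current line = byte after the last newline, or 0.
    let line_start = prefix
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    Some((line, offset - line_start + 1))
}

fn main() {
    // Made-up sample, not real dtrace output: a stray 0xBB after "objc::D".
    let sample = b"ok line\nobjc::D\xBB+0x64\n";
    match first_invalid_utf8(sample) {
        Some((line, col)) => println!("invalid UTF-8 at line {line}, byte column {col}"),
        None => println!("valid UTF-8"),
    }
}
```

To run it against the real dump, the sample could be swapped for the result of `std::fs::read("cargo-flamegraph.stacks")`, which returns raw bytes without attempting to decode them.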

Hmm.. so maybe it's some exotic encoding - what does chardetect think it might be?

$ chardetect cargo-flamegraph-offending-line.stacks 
cargo-flamegraph-offending-line.stacks: ISO-8859-1 with confidence 0.73

Ok, maybe it's ISO-8859-1? Let's try to convert:

$ iconv -f ISO-8859-1 -t utf-8 < cargo-flamegraph-offending-line.stacks 
              libobjc.A.dylib`bool objc::DenseMapBase<objc::DenseMap<DisguisedPtr<objc_object>, objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> >, objc::DenseMapValueInfo<objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> > >, objc::DenseMapInfo<DisguisedPtr<objc_object> >, objc::detail::DenseMapPair<DisguisedPtr<objc_object>, objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> > > >, DisguisedPtr<objc_object>, objc::DenseMap<void const*, objc::ObjcAssociation, objc::DenseMapValueInfo<objc::ObjcAssociation>, objc::DenseMapInfo<void const*>, objc::detail::DenseMapPair<void const*, objc::ObjcAssociation> >, objc::D»

Note in particular, the last entry: "objc::D»"
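
One caveat about reading too much into that successful conversion: ISO-8859-1 maps every byte value to the code point with the same number, so converting from it essentially never fails, whatever the bytes are. A minimal sketch of that, where `latin1_to_string` is a hypothetical helper and 0xBB is my inference from the » in the output above rather than something read out of the zipped file:

```rust
/// Hypothetical helper: decodes ISO-8859-1 (Latin-1). Each byte maps to the
/// Unicode code point with the same value, so this conversion cannot fail.
fn latin1_to_string(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // 0xBB is an assumption inferred from the » shown by iconv.
    let tail = b"objc::D\xBB";
    println!("{}", latin1_to_string(tail));
    // A strict UTF-8 decoder rejects the very same bytes.
    assert!(std::str::from_utf8(tail).is_err());
}
```

So the conversion going through does not by itself tell us the data really is ISO-8859-1.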

I've gone through this exercise a few times, and do not always get the same guessed encoding, which makes me think this might be some kind of corruption rather than dtrace wittingly using an obscure encoding, but who knows 🤷

michaelkirk avatar Oct 02 '20 22:10 michaelkirk

I'm also on a mac btw (10.15)

$ dtrace -V
dtrace: Sun D 1.15

Is anyone hitting this not on a mac?

michaelkirk avatar Oct 02 '20 22:10 michaelkirk

I have a workaround at https://github.com/flamegraph-rs/flamegraph/pull/101. It would be interesting if anyone who frequently hits this error could give it a whirl.

michaelkirk avatar Oct 02 '20 23:10 michaelkirk

Assuming it's a bug that dtrace ever outputs invalid UTF-8, I filed a radar (rdar://8800290) and mirrored it on Open Radar here: https://openradar.appspot.com/radar?id=5013532726788096

michaelkirk avatar Oct 14 '20 16:10 michaelkirk

Btw, I'm only seeing this now, but I added support for non-UTF-8 input in inferno here: https://github.com/jonhoo/inferno/pull/196

It hasn't been released, and the version hasn't been bumped in this repo, but if you need something that works without forking it yourself, you can use this for the time being:

cargo install --git "https://github.com/austinabell/flamegraph"

I made the change in inferno so that it can be used outside this repo as well.
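
For readers wondering what tolerating non-UTF-8 input can look like in general, here is a rough sketch. It is not the code from the inferno PR, and `read_lines_lossy` is a hypothetical name. The idea is to read each line as raw bytes with `BufRead::read_until` and decode it with `String::from_utf8_lossy`, which substitutes U+FFFD for invalid sequences. The std string-reading methods (`read_line`, `lines`, `read_to_string`) instead return the `InvalidData` "stream did not contain valid UTF-8" error seen in this issue.

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical helper: reads newline-delimited records as raw bytes and
/// decodes each one lossily, replacing invalid sequences with U+FFFD
/// instead of returning an error.
fn read_lines_lossy<R: BufRead>(mut reader: R) -> io::Result<Vec<String>> {
    let mut lines = Vec::new();
    let mut buf = Vec::new();
    loop {
        buf.clear();
        // read_until works on bytes, so it never checks for valid UTF-8.
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break;
        }
        // Drop the trailing newline, if any.
        if buf.last() == Some(&b'\n') {
            buf.pop();
        }
        lines.push(String::from_utf8_lossy(&buf).into_owned());
    }
    Ok(lines)
}

fn main() -> io::Result<()> {
    // Made-up input standing in for a dtrace stack dump with one stray byte.
    let input = Cursor::new(b"main+0x10\nobjc::D\xBB+0x64\n".to_vec());
    for line in read_lines_lossy(input)? {
        println!("{line}");
    }
    Ok(())
}
```

The trade-off is that the replaced bytes are gone for good, which seems acceptable for symbol names in a flamegraph but would not be for data that has to round-trip.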

austinabell avatar Nov 30 '20 14:11 austinabell