libtock-rs icon indicating copy to clipboard operation
libtock-rs copied to clipboard

#[derive(Debug)] seems expensive

Open jrvanwhy opened this issue 5 years ago • 8 comments

The code generated by #[derive(Debug)] uses core::fmt::DebugStruct and its siblings: DebugList, DebugMap, DebugSet, and DebugTuple. Internally, these types use &dyn fmt::Debug, and it appears that in Rust userspace apps this defeats LLVM's devirtualization. This causes most or all of the Debug implementations (including Debug implementations for types from libcore) to be compiled in.

I ran some tests on a stripped-down libtock-rs that removes all of its #[derive(Debug)] instances. In these tests, I manually derived Debug using the core::fmt::Debug* types as well as implemented it entirely manually. Here are the resulting sizes of .text (in bytes):

  1. DebugStruct: 15664
  2. DebugTuple: 15504
  3. DebugList: 15584
  4. DebugSet: 15584
  5. DebugMap: 16576
  6. Manual implementation (no Debug*): 504

Based on this test, I think it may be wise to avoid #[derive(Debug)] in libtock-rs and instead derive Debug manually. I count 12 uses of #[derive(Debug)] in libtock-rs so this seems reasonable.

Any comments before I make this change?

jrvanwhy avatar Apr 23 '19 18:04 jrvanwhy

This could also be gated by one (or several) feature(s). cfg_attr can be used to conditionally derive Debug:

#[cfg_attr(feature = "derive_debug", derive(Debug))]
struct Foo { ... }

And the release/debug mode (or another feature) could be checked to decide whether to implement Debug at all (whether hand-rolled or derived).

What do you think?

gendx avatar Aug 26 '19 19:08 gendx

I am supportive of using Cargo features to control how much code libtock-rs includes to limit bloat. I'd rather not condition this on debug/release mode, as it is a breaking change to limit trait implementations, and because release/debug-mode binaries don't always make sense in embedded.

Solving this isn't a priority for me yet so my plan is to mitigate issues as they come up. Longer term, something like ufmt is probably a better solution, so that we avoid including libcore's Display/Debug implementations (which have the same issue). However, getting rid of them entirely requires building libcore with the panic_immediate_abort feature flag, which currently requires Xargo. Once Cargo gains the ability to build libcore with that flag (my understanding is this should eventually happen) then avoiding Display/Debug entirely should become very doable.

jrvanwhy avatar Aug 26 '19 20:08 jrvanwhy

I personally would be okay with just removing all of the debug implementation as I don't use them for debugging and we shouldn't use them in production code at all.

torfmaster avatar Nov 13 '19 21:11 torfmaster

I personally would be okay with just removing all of the debug implementation as I don't use them for debugging and we shouldn't use them in production code at all.

This has been affecting our test binaries significantly, where we need Debug or similar functionality. Note that this also applies to Display, although I'm not sure how necessary that is in production builds.

I've been working on code size tooling to help understand how we can mitigate this issue.

jrvanwhy avatar Nov 13 '19 21:11 jrvanwhy

I just found out that we are not the only ones wanting a more lightweight formatting in Rust: https://twitter.com/japaric_io/status/1196203591737991168

There is now the ufmt crate, streamlining a lightweight Debug-like trait: https://crates.io/crates/ufmt

gendx avatar Nov 18 '19 14:11 gendx

I just found out that we are not the only ones wanting a more lightweight formatting in Rust: https://twitter.com/japaric_io/status/1196203591737991168

There is now the ufmt crate, streamlining a lightweight Debug-like trait: https://crates.io/crates/ufmt

If ufmt is a decent fit for libtock-rs, then using ufmt would be ideal because it provides some compatibility with the broader embedded Rust ecosystem. Unfortunately, it has undefined behavior, and does not appear to be actively maintained.

We discussed using ufmt in libtock-rs in the core WG call last week. We decided to do the following:

  1. Vendor ufmt (make a copy in libtock-rs), possibly trimmed down a bit.
  2. Fix the bugs in our vendored version
  3. Use our vendored version in libtock-rs. See if we like it.
  4. If we like it, we can approach ufmt's current maintainer about moving it into a project (Tock or the Rust embedded WG) that has the capacity to maintain it. If we don't like it, we removed the vendored version and write our own alternative.

Note that I still expect to have #[derive(Debug)] on many types because having Debug is tremendously beneficial for unit tests. For example, if you want to use std::assert_eq! on libtock-rs types, then those types need to implement Debug. We'll just have 2 parallel debug print mechanisms: core::fmt::Debug for test code and ufmt (or similar) for Tock process binaries. #[derive(Debug)] does not have overhead if you never use Debug, so that won't cause problematic code bloat (otherwise we would have to be #![no_core]).

jrvanwhy avatar Mar 31 '21 01:03 jrvanwhy

After talking to some Rust people they pointed me to https://github.com/rust-lang/rust/pull/101568 which should help here.

It should allow more optimisations when using the core fmt

alistair23 avatar Sep 14 '22 10:09 alistair23

[..] this defeats LLVM's devirtualization.

https://github.com/rust-lang/rust/pull/101568

Working on it!

(No promises yet, but it looks like at least one of my experimental approaches doesn't break devirtualization.)

m-ou-se avatar Sep 14 '22 12:09 m-ou-se