serde Optimize internally tagged enums -- do not use internal buffer if tag is the first field

Optimize internally tagged enums -- do not use internal buffer if tag is the first field

Open Mingun opened this issue 3 years ago • 7 comments

Fixes #1495

Also that change has one positive side-effect: if tag is the first field, negative effects from #1183 is eliminated, because buffering is not used

@RReverser, feel free to try to run your benchmarks against this branch

Nov 03 '20 20:11 Mingun

I cannot agree with such a question. For me, the runtime performance is much more important thing that compile-time performance. Things are written not for the pleasure of developers, but for solving customer problems.

Would you be able to provide some measurements showing the impact of this change on the time to compile internally tagged enums?

I will try to study how to do performance measurements, but any guidance is welcome

Feb 28 '21 10:02 Mingun

Overall I would rather optimize for lowering their compile time, not on performance or features at the expense of compile time.

That sounds odd tbh. Compile times are affecting only developers, while runtime affects every user of the library / application, which is way more impactful. Why choose compile-time over runtime perf here when we don't do that at any other levels of development (e.g. opt-level = 0 vs opt-level = 2 etc.)?

Feb 28 '21 21:02 RReverser

I've made some research and there is the results. I've created a library project with 1000 types and I've measured compilation time.

I've noticed small increasing of the compilation time, about 0.01 sec per type (or 7-30%). I think it is acceptable worth for the bigger runtime performance.

Test code and raw data

serde-perf.zip

Created library cargo project with following lib.rs content:

use serde::{Deserialize};
macro_rules! generate {
  ($(#[$counter:meta])*) => {
    $(
      const _: () = {
        #[$counter]
        #[derive(Deserialize)]
        #[serde(tag = "tag")]
        enum Node {
          Unit,
          Struct {
            name: String,
            list: Vec<Node>,
          },
          // Uncomment for "big enum" tests
          /*
          Newtype1(std::collections::HashMap<String, String>),
          Newtype2(String),
          Newtype3(u32),
          Newtype4(f32),
          Unit1,
          Unit2,
          Unit3,
          Unit4,
          Struct1 { f1: String, f2: u32, f3: bool, f4: f64 },
          Struct2 { f1: String, f2: u32, f3: bool, f4: f64 },
          Struct3 { f1: String, f2: u32, f3: bool, f4: f64 },
          Struct4 { f1: String, f2: u32, f3: bool, f4: f64 },// */
        }
      };
    )*
  };
}
// Expanded manually for "expand" tests
generate!(
  /// ...
  /// 1000 lines
  /// ...
);

Tests run with command

cargo +nightly build -Ztimings

Test PC

OS version: Windows_NT x64 10.0.18363 CPUs: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (8 x 1992)

Summary table

Small enum (2 variants), 1000 types

Derived both Serialize and Deserialize, types generated with generate! macro

Deserialize + Serialize	master (c26101532509477132721a8c56b7024891aaf1b4) (100%)	PR (a169beee0d84fa0cad022357c974d05c223e32a0)	Diff
all	47.37 = (46.96 + 47.10 + 48.06)/3	58.67 = (59.27 + 59.17 + 57.56)/3	+11.29 (+29%)
codegen	3.28 = (46.96-43.95 + 47.10-43.92 + 48.06-44.42)/3	4.81 = (59.27-53.93 + 59.17-53.85 + 57.56-53.80)/3	+1.53 (+10%)

Derived only Deserialize, types generated with generate! macro

Deserialize	master (c26101532509477132721a8c56b7024891aaf1b4) (100%)	PR (a169beee0d84fa0cad022357c974d05c223e32a0)	Diff
all	39.25 = (39.16 + 39.66 + 38.94)/3	50.08 = (49.70 + 51.62 + 48.93)/3	+10.83 (+28%)
codegen	3.62 = (39.16-35.31 + 39.66-35.28 + 38.94-36.31)/3	4.53 = (49.70-46.17 + 51.62-45.29 + 48.93-45.21)/3	+0.91 (+25%)

Derived only Deserialize, types written manually

Deserialize + expanded	master (c26101532509477132721a8c56b7024891aaf1b4) (100%)	PR (a169beee0d84fa0cad022357c974d05c223e32a0)	Diff
all	38.76 = (38.38 + 39.67 + 38.23)/3	50.36 = (50.34 + 50.59 + 50.14)/3	+11.60 (+30%)
codegen	3.24 = (38.38-35.41 + 39.67-35.65 + 38.23-35.50)/3	5.67 = (50.34-44.34 + 50.59-44.74 + 50.14-45.02)/3	+2.42 (+75%)

Big enum (14 variants), 1000 types

Derived only Deserialize, types generated with generate! macro

Deserialize	master (c26101532509477132721a8c56b7024891aaf1b4) (100%)	PR (a169beee0d84fa0cad022357c974d05c223e32a0)	Diff
all	241.04 = (236.94 + 239.35 + 246.84)/3	257.01 = (257.75 + 258.76 + 254.70)/3	+16.03 (+7%)
codegen	40.90 = (236.94-197.11 + 239.35-199.61 + 246.84-203.70)/3	46.63 = (257.75-211.25 + 258.76-210.17 + 254.70-209.91)/3	+5.72 (+14%)

Derived only Deserialize, types written manually

Deserialize + expanded	master (c26101532509477132721a8c56b7024891aaf1b4) (100%)	PR (a169beee0d84fa0cad022357c974d05c223e32a0)	Diff
all	238.41 = (238.49 + 236.58 + 240.16)/3	254.86 = (258.93 + 254.75 + 250.90)/3	+16.45 (+7%)
codegen	39.70 = (238.49-198.46 + 236.58-199.66 + 240.16-198.02)/3	45.05 = (258.93-209.91 + 254.75-211.19 + 250.90-208.34)/3	+5.35 (+13%)

Mar 06 '21 16:03 Mingun

@Mingun What about runtime performance improvements?

Mar 26 '21 17:03 pickfire

I didn't measure it, maybe I should

Mar 26 '21 18:03 Mingun

Would this also fix the incorrect error messages caused by the internal buffer? https://github.com/serde-rs/serde/issues/1621

Jul 26 '22 15:07 jarredholman

For the optimized case yes, it should

Jul 26 '22 16:07 Mingun

serde serde copied to clipboard

Optimize internally tagged enums -- do not use internal buffer if tag is the first field

Test PC

Small enum (2 variants), 1000 types

Big enum (14 variants), 1000 types

serde
serde copied to clipboard