Parser uses too much flash
Because of the way the parser is structured, the compiler can't optimize everything away, so some code appears multiple times in the binary.
Hi,
First of all, we should check how much space the crate is actually using. With cargo bloat:
cargo bloat --release --filter at_commands
File .text Size Crate Name
0.0% 1.4% 672B at_commands at_commands::formatter::write_int
0.0% 0.8% 382B at_commands at_commands::parser::CommandParser<D>::expect_raw_string
0.0% 0.8% 354B at_commands at_commands::parser::CommandParser<D>::expect_raw_string
0.0% 0.4% 184B at_commands at_commands::formatter::parse_int
0.0% 0.1% 46B at_commands at_commands::builder::CommandBuilder<at_commands::builder::Set>::with_empty_parameter
0.0% 0.1% 28B at_commands at_commands::builder::CommandBuilder<at_commands::builder::Uninitialized>::create_query
0.0% 3.6% 1.6KiB filtered data size, the file size is 3.9MiB
And now the total usage by crates:
cargo bloat --release --crates
File .text Size Crate
0.4% 33.0% 14.9KiB sim7020
0.2% 19.1% 8.7KiB test_sim
0.2% 18.4% 8.3KiB std
0.2% 14.3% 6.5KiB embassy_rp
0.0% 3.6% 1.6KiB at_commands
0.0% 2.6% 1.2KiB embassy_executor
0.0% 2.4% 1.1KiB defmt_rtt
0.0% 1.4% 672B [Unknown]
0.0% 1.4% 628B defmt
0.0% 0.9% 412B embassy_hal_internal
0.0% 0.9% 408B embedded_io_async
0.0% 0.6% 270B embassy_time_queue_utils
0.0% 0.4% 208B embassy_time
0.0% 0.4% 164B embassy_sync
0.0% 0.2% 100B panic_probe
0.0% 0.2% 82B log
0.0% 0.2% 74B cortex_m_rt
0.0% 0.1% 42B embedded_io
0.0% 0.0% 12B embassy_executor_timer_queue
1.1% 100.0% 45.3KiB .text section size, the file size is 3.9MiB
Note: This is a simple sample project of mine, so the results are probably not very significant, but they are enough for a quick look.
I've been thinking about this issue and I have a few hypotheses.
The first problem that comes to mind is Rust's monomorphization, which generates a separate copy of each generic function for every concrete type it is used with.
One possible solution is dynamic dispatch via the dyn keyword. In theory this could decrease the binary size in exchange for some performance cost from the indirect function calls.
Judging by the cargo bloat output, this should help remove one of the two expect_raw_string copies.
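To make the trade-off concrete, here is a minimal sketch of the same helper written once with static dispatch and once with dynamic dispatch. The `ByteSource` trait, the two implementing types, and both functions are hypothetical names made up for illustration; none of them are part of at_commands.

```rust
/// Hypothetical trait standing in for whatever bound a generic parameter has.
trait ByteSource {
    fn bytes(&self) -> &[u8];
}

/// Two hypothetical concrete types that implement the trait.
struct UartBuffer([u8; 4]);
struct Fixture(&'static [u8]);

impl ByteSource for UartBuffer {
    fn bytes(&self) -> &[u8] {
        &self.0
    }
}

impl ByteSource for Fixture {
    fn bytes(&self) -> &[u8] {
        self.0
    }
}

/// Static dispatch: the compiler may emit one copy of this function
/// for every concrete `S` it is called with (monomorphization).
fn count_commas_static<S: ByteSource>(src: &S) -> usize {
    src.bytes().iter().filter(|&&b| b == b',').count()
}

/// Dynamic dispatch: a single copy of this function exists, and the
/// call to `bytes()` goes through the trait object's vtable.
fn count_commas_dyn(src: &dyn ByteSource) -> usize {
    src.bytes().iter().filter(|&&b| b == b',').count()
}
```

Called with both `UartBuffer` and `Fixture`, the static version can show up twice in cargo bloat, while the `dyn` version shows up once at the price of an indirect call. Whether that is a net win depends on how much the optimizer inlines or merges, so it has to be measured.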
Another problem is the write_int function, which seems to use a lot of space.
Next, I will look at the write_int function to see if it can be improved in some way. It could also be a good idea to find a more realistic project that depends on the library and check its size with cargo bloat to gather more information.
As for dynamic dispatch, I will try to create a test version that replaces the static dispatch with dynamic dispatch and see how much it improves.
I have been checking some of the code and realized that expect_raw_string belongs to the parser, not the builder as I thought; my bad. I think that will make it really difficult to use dynamic dispatch.
I have also tried to improve write_int; here is my attempt: https://github.com/JJaviMS/at-commands/blob/5fb3b88596e61ad2a0c9bc00c4da79ad8c2bee2a/src/formatter.rs#L11. However, even after those changes the size of the function remains the same.
Another update on this:
I forgot to add some flags to the release build to minimize the output size. After adding:
[profile.release]
lto = true
opt-level = "z" # Optimize for size.
The binary size is the following:
0.2% 0.8% 208B at_commands at_commands::formatter::write_int
0.1% 0.3% 80B at_commands at_commands::builder::CommandBuilder<at_commands::builder::Set>::with_string_parameter
0.1% 0.3% 76B at_commands at_commands::formatter::parse_int
0.0% 0.2% 40B at_commands at_commands::builder::CommandBuilder<at_commands::builder::Set>::with_optional_string_parameter
0.0% 0.2% 40B at_commands at_commands::parser::CommandParser<D>::trim_space
0.0% 0.1% 34B at_commands at_commands::builder::CommandBuilder<ANY>::try_append_data
0.0% 0.1% 34B at_commands at_commands::builder::CommandBuilder<ANY>::try_append_data
0.0% 0.1% 34B at_commands at_commands::builder::CommandBuilder<ANY>::try_append_data
0.0% 0.1% 34B at_commands at_commands::builder::CommandBuilder<ANY>::try_append_data
0.0% 0.1% 34B at_commands at_commands::builder::CommandBuilder<ANY>::try_append_data
0.0% 0.1% 32B at_commands at_commands::builder::CommandBuilder<at_commands::builder::Uninitialized>::create_query
0.0% 0.1% 32B at_commands at_commands::builder::CommandBuilder<at_commands::builder::Uninitialized>::create_set
0.5% 2.6% 678B filtered data size, the file size is 124.3KiB
And by crate:
File .text Size Crate
9.3% 44.6% 11.6KiB test_sim
3.9% 18.5% 4.8KiB std
2.4% 11.3% 2.9KiB sim7020
1.4% 6.6% 1.7KiB [Unknown]
0.9% 4.4% 1.1KiB embassy_rp
0.6% 2.7% 714B at_commands
0.5% 2.3% 616B defmt_rtt
0.4% 1.9% 500B defmt
0.4% 1.9% 498B embassy_executor
0.3% 1.4% 384B embedded_io_async
0.3% 1.2% 324B embassy_sync
0.2% 1.1% 304B panic_probe
0.1% 0.6% 156B embassy_hal_internal
0.1% 0.5% 138B embassy_time
0.1% 0.4% 118B embassy_time_queue_utils
0.1% 0.3% 74B cortex_m_rt
20.9% 100.0% 25.9KiB .text section size, the file size is 124.3KiB
Hi, thanks for looking into it!
Yeah the main issue is the generics. To fix the binary size, the logic must be split from the generics. Maybe this is actually kinda easy. I wrote this issue in 2020 when I only had about a year of experience with Rust :P
Also, I'm not using this crate a whole bunch, so this problem isn't relevant enough for me to tackle. But I'll give it a shot now I guess.
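One common way to split the logic from the generics is to move the body into a non-generic helper and leave only a thin generic wrapper. The sketch below illustrates that pattern under stated assumptions: `Parser`, its fields, `expect_prefix`, and `find_prefix_end` are hypothetical stand-ins, not the real `CommandParser<D>` API.

```rust
/// Non-generic core: compiled once, no matter how many `D` types exist.
/// Returns the index just past `prefix` if the buffer continues with it.
fn find_prefix_end(buffer: &[u8], index: usize, prefix: &[u8]) -> Option<usize> {
    let rest = buffer.get(index..)?;
    if rest.starts_with(prefix) {
        Some(index + prefix.len())
    } else {
        None
    }
}

/// Hypothetical parser that is generic over the data parsed so far.
struct Parser<'a, D> {
    buffer: &'a [u8],
    index: usize,
    data: D,
}

impl<'a, D> Parser<'a, D> {
    /// Thin generic wrapper: each instantiation only forwards to the
    /// shared helper and rebuilds the struct with the new index.
    fn expect_prefix(self, prefix: &[u8]) -> Option<Self> {
        let index = find_prefix_end(self.buffer, self.index, prefix)?;
        Some(Parser { index, ..self })
    }
}
```

The wrapper is still monomorphized per `D`, but it is only a few instructions long. Whether this actually shrinks the binary has to be checked with cargo bloat, because the optimizer may already inline or deduplicate the generic copies on its own.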
Hmmm, so first try actually makes it worse. Seems like the optimizer is able to deal with it quite well. Might not be as relevant anymore as it was in 2020