swift-argument-parser

Including package increases binary size by ~2.7MB

Open eaigner opened this issue 3 years ago • 10 comments

Just including the argument parser raises my CLI tool's binary size from 250KB to about 3MB.

This seems a bit excessive.

ArgumentParser version: main
Swift version: Apple Swift version 5.3.2 (swiftlang-1200.0.45 clang-1200.0.32.28)
Target: arm64-apple-darwin20.3.0

eaigner avatar Apr 03 '21 14:04 eaigner

A couple of questions and suggestions:

  • Did you build with -Osize?
  • Have you tried running strip <binary>? (Swift binaries contain a ton of symbol data.)
  • Is your binary "fat"? (You can see which architectures are included by running file <binary>, and strip out architectures using lipo.)
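
The three suggestions above could be tried from a terminal roughly as follows. This is only a sketch: the helper names and the example path .build/release/math are assumptions made for illustration, and lipo is a macOS tool.

```shell
# Sketch only: helper names and the example path are illustrative assumptions.

# Build in release mode and ask the Swift compiler to optimize for size.
build_for_size() {
    swift build -c release -Xswiftc -Osize
}

# Print a file's size in bytes (works on both macOS and Linux).
size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Show which architectures a binary contains.
show_archs() {
    file "$1"
}

# Write an arm64-only copy of a fat binary next to the original (macOS lipo).
thin_to_arm64() {
    lipo "$1" -thin arm64 -output "$1.arm64"
}

# Strip a copy of the binary and report the size before and after.
strip_report() {
    cp "$1" "$1.stripped" && strip "$1.stripped" || return 1
    echo "before: $(size_bytes "$1") bytes"
    echo "after:  $(size_bytes "$1.stripped") bytes"
}
```

For example, `build_for_size && strip_report .build/release/math` would print the two sizes side by side, leaving the original binary untouched.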

rauhul avatar Apr 04 '21 00:04 rauhul

I see a similar increase, from 1.9M to 4.5M. Both versions have similar argument handling.

When compiling for release, as I did, the defaults are for clang to use -Os (which is supposed to compile for speed and size) and for Swift to use -O, and the symbols are put into a dSYM file rather than embedded as they are in a debug build. Using -Osize for Swift made no significant difference in size. I compiled for "My Mac", which means arm64 only, as file confirms.

N.B. strip invalidates the signature so even if it is smaller you can't run it.

msalmonse avatar May 17 '21 15:05 msalmonse

> N.B. strip invalidates the signature so even if it is smaller you can't run it.

You can either re-sign the binary with codesign or, if you're building with an Xcode project, use the strip settings under the Build Settings tab.
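
As a rough sketch of the codesign route: the helper name below is hypothetical, and it assumes an ad-hoc signature (the "-" identity) is enough for running the tool locally. A binary you distribute would normally need a real signing identity instead.

```shell
# Sketch only: resign_adhoc is a hypothetical helper name.
# Re-sign an already stripped binary with an ad-hoc signature, then verify it.
resign_adhoc() {
    # --force replaces the signature that strip invalidated;
    # "-" as the identity requests ad-hoc signing.
    codesign --force --sign - "$1" || return 1
    codesign --verify --verbose "$1"
}
```

Usage would look like `strip math && resign_adhoc math`.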

rauhul avatar May 17 '21 16:05 rauhul

Let's not dance around the problem. The problem is that the binary size is this big, despite being stripped and optimized.

And if you have to include it in multiple CLI targets that are bundled with your app, which is the case in my project, the problem multiplies.

eaigner avatar May 17 '21 16:05 eaigner

I agree that swift-argument-parser does greatly inflate binary sizes and that this is a problem worth addressing. However, step one is understanding how much executable data the library actually adds.

For example, Swift binaries contain a lot of metadata and symbol names; there's not much this project can do to remedy that if symbol names are a large part of the issue. Alternatively, perhaps the bloat is coming from lots of string data, or maybe it's just code size.

If you could provide numbers breaking down where the binary size is coming from, it would be a helpful step toward improving it.

rauhul avatar May 17 '21 16:05 rauhul

Running bloaty on the example math command has some interesting results:

➜  bloaty math
    FILE SIZE        VM SIZE
 --------------  --------------
  41.8%   680Ki  40.9%   680Ki    String Table
  28.1%   456Ki  27.4%   456Ki    __TEXT,__text
  17.9%   291Ki  17.5%   291Ki    Symbol Table
   2.1%  33.4Ki   2.1%  34.3Ki    [30 Others]
   0.0%       0   1.6%  26.9Ki    __DATA,__bss
   1.2%  19.8Ki   1.2%  19.8Ki    __DATA_CONST,__const
   1.1%  18.5Ki   1.1%  18.5Ki    __TEXT,__const
   1.1%  17.5Ki   1.1%  17.5Ki    Binding Info
   0.8%  13.0Ki   0.8%  13.0Ki    Lazy Binding Info
   0.8%  12.8Ki   0.8%  12.8Ki    __TEXT,__eh_frame
   0.8%  12.7Ki   0.8%  12.7Ki    Code Signature
   0.8%  12.2Ki   0.7%  12.2Ki    Export Info
   0.4%  7.20Ki   0.7%  11.3Ki    [__DATA]
   0.7%  10.7Ki   0.6%  10.7Ki    [__DATA_CONST]
   0.6%  9.88Ki   0.6%  9.98Ki    [__TEXT]
   0.6%  9.13Ki   0.5%  9.13Ki    __TEXT,__unwind_info
   0.4%  6.72Ki   0.4%  6.72Ki    __TEXT,__cstring
   0.0%       8   0.3%  5.29Ki    [__LINKEDIT]
   0.3%  5.25Ki   0.3%  5.25Ki    __DATA,__data
   0.3%  4.95Ki   0.3%  4.95Ki    __TEXT,__swift5_fieldmd
   0.3%  4.58Ki   0.3%  4.48Ki    [Mach-O Headers]
 100.0%  1.59Mi 100.0%  1.62Mi    TOTAL

It also seems like the linker is not doing a good job of removing duplicated strings, which is quite strange:

➜ strings math | sort | wc -l
829
➜ strings math | sort | uniq | wc -l
658
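
With the counts above, 829 - 658 = 171 of the extracted strings are repeats. A small helper can report that difference directly. This is a sketch; dup_summary is a hypothetical name, and it simply counts the lines it is given on standard input.

```shell
# Sketch only: dup_summary is a hypothetical helper name.
# Reads one string per line on stdin and reports how many lines are repeats.
dup_summary() {
    awk '
        { if (!($0 in seen)) unique++; seen[$0] = 1; total++ }
        END { printf "total=%d unique=%d redundant=%d\n", total, unique, total - unique }
    '
}
```

It would be used as `strings math | dup_summary`.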

rauhul avatar May 17 '21 17:05 rauhul

One of the tricks you can do is:

strings ArgumentParser.o|sort|uniq -c|sort -n

The most common string is ArgumentParser.

msalmonse avatar May 18 '21 04:05 msalmonse

👋🏻 Thanks for opening this issue, @eaigner, and for the ensuing discussion! I've been looking into this a little — here are my notes:

  • The 2.7MB size increase is when an executable is built in debug mode
  • When built in release mode, the increase is roughly 1.6MB
  • As @rauhul noted, the vast majority of that size is the symbol table, the __text section (i.e. the executable code), and the string table
  • Using strip on the resulting binary removes nearly all of the symbol and string tables, cutting the size down by another 0.9MB:
    (main|✔) $ bloaty math_stripped -- math 
        FILE SIZE        VM SIZE    
     --------------  -------------- 
      [ = ]       0  [DEL] -15.7Ki    [__LINKEDIT]
     -96.1%  -275Ki -96.1%  -275Ki    Symbol Table
     -96.0%  -642Ki -96.0%  -642Ki    String Table
     -57.4%  -918Ki -56.7%  -934Ki    TOTAL
    
  • From what I can see, the removed bits there are a bunch of type metadata and the witnesses for unused conformances, like the internal ParsedArgument type's conformance to Equatable. I was concerned about whether stripping this metadata would cause problems due to ArgumentParser's heavy use of reflection, but it doesn't seem like the resulting binary is missing anything. The generated completion scripts (which pretty much exercise the entirety of the declared types) match between stripped and unstripped executables.
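
The comparison described in the last note could be scripted along these lines. This is a sketch rather than the exact check that was run: same_completions is a hypothetical helper, and it assumes both executables still accept ArgumentParser's --generate-completion-script option for bash, zsh, and fish.

```shell
# Sketch only: same_completions is a hypothetical helper name.
# Compare the completion scripts produced by two builds of the same tool.
same_completions() {
    for shell_name in bash zsh fish; do
        a=$("$1" --generate-completion-script "$shell_name") || return 1
        b=$("$2" --generate-completion-script "$shell_name") || return 1
        if [ "$a" != "$b" ]; then
            echo "completion scripts differ for $shell_name"
            return 1
        fi
    done
    echo "completion scripts match"
}
```

For example: `same_completions ./math ./math_stripped`.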

@jckarter had a forum post in November about some things the compiler could do to either strip more dead symbols or make more symbols strippable, so it may be that future Swift versions aren't so verbose.

The largest other piece is the code size, some of which is inherently hard to get rid of due to what ArgumentParser is doing under the hood to enable all the different ways of using the property wrappers. One idea I had was that the completion script generation machinery could be omitted in release versions. These scripts should be identical for all instances of an executable, so maybe it should be up to the author of the tool to generate and distribute them separately or as part of the installation process.
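
If completion scripts were generated ahead of time and shipped alongside the tool, the packaging step might look something like the sketch below. The helper name and the output file names are assumptions chosen for illustration, and it presumes a build of the tool that still includes the completion machinery and its --generate-completion-script option.

```shell
# Sketch only: write_completions and the output file names are assumptions.
# Generate completion scripts once, at packaging time, into a directory.
write_completions() {
    tool=$1
    out=$2
    name=$(basename "$tool")
    mkdir -p "$out" || return 1
    "$tool" --generate-completion-script bash > "$out/$name.bash" || return 1
    "$tool" --generate-completion-script zsh  > "$out/_$name"     || return 1
    "$tool" --generate-completion-script fish > "$out/$name.fish"
}
```

For example, `write_completions .build/debug/math dist/completions` would leave three script files in dist/completions for an installer to copy into place.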

natecook1000 avatar May 18 '21 23:05 natecook1000

I think that SAP's biggest problem is that it tries to be everything to everybody. In the end you get a product that makes the simple easy and the difficult impossible. I think that SAP goes a long way to making everything easy but once you reach the limit it's a total stop. Rather than satisfying your users you get a list of feature requests.

I gave up on SAP and rolled my own. The beta test object file weighed in at 190KB but it has become infected with feature bloat and after adding usage wrapping and JSON support it's up to a bit over 500KB. My goal was to do as little as possible but no less so I just return a list of strings that the app has to take care of. I found that that simplified things a lot as I now store options in a dict instead of a really big struct.

My way is right for me and probably doesn't suit many others but I'm really glad that I took the time to do it rather than relying on others to solve my problems for me.

msalmonse avatar May 29 '21 13:05 msalmonse

I have discovered a few things about package compilation with Xcode.

  1. Xcode ignores everything that you set when it compiles packages; about the only thing that gets through is -DDEBUG.
  2. When compiling for Release, Xcode makes a fat library regardless of your settings. This accounted for about 150KB of my bloat.
  3. You can add Swift settings in your Package.swift, but only defines and flags. Unfortunately, if you add flags, Xcode won't import the module. I really have trouble understanding what advantage there could be in making a package unusable.

All in all I'm not that impressed with SPM.

msalmonse avatar Jun 01 '21 16:06 msalmonse