purescript-native cpp binary size

I was surprised to find the optimized size of the binary that simply prints hello was 4.3M. A similar program in cpp is 12K, go is 1.2M. I inspected the output folder and the Makefile to see what was being built, and after wading through a sea of segfaults, I narrowed down the SRCS variable to just what was needed:

  SRCS := $(call rwildcard,$(CC_SRC)/Main/,*.cpp)
  SRCS += $(call rwildcard,$(CC_SRC)/Data_Show/,*.cpp)
  SRCS += $(call rwildcard,$(CC_SRC)/Data_Symbol/,*.cpp)
  SRCS += $(call rwildcard,$(CC_SRC)/Record_Unsafe/,*.cpp)
  SRCS += $(call rwildcard,$(CC_SRC)/Effect_Console/,*.cpp)
  SRCS += $(call rwildcard,$(CC_SRC)/Type_Data_RowList/,*.cpp)
  SRCS += $(call rwildcard,$(CC_SRC)/purescript.cpp)
  SRCS += $(call rwildcard,$(FFI_SRC)/console/,*.cpp)
  SRCS += $(call rwildcard,$(FFI_SRC)/unsafe_coerce/,*.cpp)
  SRCS += $(call rwildcard,$(FFI_SRC)/prelude/,*.cpp)

With this change, the resulting binary is 368k. This is a much more attractive base size, especially for small utilities like Kubernetes services, but also for other applications like an iOS game targeting objc++.

The Makefile currently uses spago sources to figure out which dependencies to build corefn for. Would it be possible to do something similar to narrow down the SRCS as well?

Dec 30 '20 14:12 joprice

Yeah, only building the relevant FFI files has been on the todo list (even if implicitly). It would also reduce build times.

Dec 30 '20 20:12 andyarvanitis

@joprice do you have any thoughts as to what image size is ultimately achievable for a hello-world application? Can it reach C++ image size?

Image size that small seems to be a non-goal of pretty much every project I could find: Wasm, Nim, Go, Rust, Kotlin, Haskell, etc. Many languages target C++ for performance, size, and the ability to run on hardware where that's the only real option. But there is a very small niche where image size matters.

What I don't typically see in these other projects is an analysis as to why the size is what it is. Once it's deemed acceptable, and small enough, no further analysis is done, it seems. What you did is explain at least partially why the smallest image is 4mb as opposed to 1.2M. I wish I saw that type of analysis more often.

What I want to know is how much of the 368k is still needed. Why is it there. Etc. But also whether there is a constraint that is not yet brought up. For instance, let's say there's some loader that has to take at least 50k, and there's no way around it. So it's theoretically not possible to make it any smaller than that and still call it PureScript. Is there such a limit?

Jun 10 '21 15:06 kasajian

The granularity is currently on the module level. If you use a single foreign function, then you need to add that module's generated code to the build. That module defines its foreign functions by populating a map at runtime, e.g. https://github.com/andyarvanitis/purescript-native-cpp-ffi/blob/master/functions/data-functions-uncurried.cpp#L8. There is nothing connecting the application's module usage, or its function usage, to foreign modules or functions, so unused functions in a module will still appear in the binary.
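To illustrate why runtime registration keeps unused functions around, here is a minimal sketch. The types and names (`ForeignTable`, `makeExampleModule`, `callForeign`) are made up for illustration and are not the actual purescript-native FFI; the point is only that storing a function into a map counts as a reference to it.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for a module's foreign-function table;
// the real purescript-native types and names differ.
using ForeignFn = std::function<int(int)>;
using ForeignTable = std::map<std::string, ForeignFn>;

// Every entry is stored into the table when the module is set up, so the
// toolchain generally treats each lambda as referenced, even if no
// PureScript code ever looks it up by name.
inline ForeignTable makeExampleModule() {
    ForeignTable exports;
    exports["usedFn"] = [](int x) { return x + 1; };
    exports["unusedFn"] = [](int x) { return x * 2; };  // never called, still registered
    return exports;
}

// Lookup happens by string key at runtime, so which entries are actually
// needed is not visible at link time.
inline int callForeign(const ForeignTable& table, const std::string& name, int arg) {
    return table.at(name)(arg);
}
```

Because `unusedFn` is reachable through the table, ordinary dead-code elimination has no way to know it is never requested.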

The fix for the module issue that I showed above could be part of some build process that parses your code and lists all imported modules. Maybe some existing compiler tooling can assist there.
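A rough sketch of the "list all imported modules" idea, assuming the output directories are named by replacing `.` with `_` in the module name (as in `Data_Show` and `Effect_Console` above). The function `importedModuleDirs` is hypothetical, it only reads one source text, and it does not follow transitive imports, which a real build step would need to do.

```cpp
#include <algorithm>
#include <set>
#include <sstream>
#include <string>

// Collects module names from lines whose first word is `import` in one
// PureScript source text and converts them to directory-style names
// (Data.Show -> Data_Show). Transitive imports are not followed.
inline std::set<std::string> importedModuleDirs(const std::string& source) {
    std::set<std::string> dirs;
    std::istringstream lines(source);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        std::string keyword, module;
        if (!(words >> keyword >> module) || keyword != "import") continue;
        // Drop an import list attached without a space, e.g. Data.Show(show).
        module = module.substr(0, module.find('('));
        std::replace(module.begin(), module.end(), '.', '_');
        dirs.insert(module);
    }
    return dirs;
}
```

The resulting set could then be turned into one `SRCS += ...` line per directory instead of listing them by hand.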

To prune unused functions, the foreign interface might have to be modified to allow pruning the unused functions from the go/cpp code. Maybe it could be a preprocessing step making use of comments or c pre-processor style defines.
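As a sketch of the pre-processor variant, each registration could be wrapped in a guard that a build step defines only for functions the program references. The macro names (`USE_SHOW_INT_IMPL`, `USE_SHOW_NUMBER_IMPL`) and the table type are assumptions for illustration, not something the current FFI supports.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical macro names; a build step would pass -DUSE_<name> flags
// only for the foreign functions the program actually references.
#define USE_SHOW_INT_IMPL 1
// USE_SHOW_NUMBER_IMPL is deliberately left undefined here.

using PrunedTable = std::map<std::string, std::function<std::string(int)>>;

inline PrunedTable makePrunedModule() {
    PrunedTable exports;
#ifdef USE_SHOW_INT_IMPL
    exports["showIntImpl"] = [](int n) { return std::to_string(n); };
#endif
#ifdef USE_SHOW_NUMBER_IMPL
    // Compiled out entirely unless the flag is defined.
    exports["showNumberImpl"] = [](int n) { return std::to_string(n) + ".0"; };
#endif
    return exports;
}
```

With only the first flag defined, the second lambda never reaches the compiler, so it cannot end up in the binary.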

Jun 10 '21 16:06 joprice

Thanks. I ask because I want to know if the potential is there to make it as small as possible and, when we hit that limit, to understand why it can't go any further. Thanks.

Jun 10 '21 17:06 kasajian
