quick-lint-js
Reduce libstdc++/libc++ bloat in statically-linked executables
I think the biggest binary size win would be avoiding statically linking with the C++ runtimes.
On Linux, for example, I noticed that statically linking the C++ runtime is a problem because exception support gets pulled in, even if we don't use exceptions. Exception support in libstdc++ includes a C++ symbol demangler which is bloated.
We've gotten good binary size reductions in the past by avoiding some parts of the C++ standard libraries. See commits 909fae27932ebcffa2d6fa2cba688f8190700e30 and dfbb182065b6325aa1c258b645ab202e1e9af4ff for example.
Audit of a Linux build (libstdc++):
- `std::__throw_length_error` is called by `std::vector`.
- `std::__throw_out_of_range_fmt` is called by `std::string::substr`.
- `std::__throw_logic_error` is called by `std::string::string`.
- `std::__throw_bad_alloc` is called by various things, including `std::vector` and `std::unordered_map`.
- `operator new` and `operator delete` are called by plenty of things.
- `std::__detail::_Prime_rehash_policy::_M_next_bkt`, `std::__detail::_Prime_rehash_policy::_M_need_rehash`, and `std::_Hash_bytes` are called by `std::unordered_map`.
- `__cxa_atexit`, `__cxa_guard_acquire`, and `__cxa_guard_release` are called by `file_output_stream::get_stderr`, `file_output_stream::get_stdout`, `basic_configuration_filesystem::canonicalize_path`, `parser::check_jsx_attribute`, and `boost::container::dtl::singleton_default<boost::container::pmr::new_delete_resource_imp>::instance`.
I used this script to correlate imported (undefined) symbols with their callers:

    # Usage: objdump -xd build-size/quick-lint-js | python slurp.py | c++filt
    import collections
    import re
    import sys

    # Maps each imported (PLT) symbol to the set of functions that call it.
    callers = collections.defaultdict(set)
    current_symbol = "???"
    for line in sys.stdin:
        line = line.rstrip("\n")
        # A function header in the disassembly: 16 hex digits, then "<symbol>:".
        match = re.match(r"^[0-9a-f]{16} <(?P<symbol>.*)>:$", line)
        if match is not None:
            current_symbol = match.group("symbol")
        # A call or jump through the PLT to a mangled C++ (_Z...) or __cxa... symbol.
        match = re.match(r".*(call|jmp).*<(?P<symbol>(_Z|__cxa).*@plt)>$", line)
        if match is not None:
            called_symbol = match.group("symbol")
            callers[called_symbol].add(current_symbol)

    # Print each imported symbol followed by its callers, one group per symbol.
    for callee in sorted(callers.keys()):
        print(callee)
        for caller in sorted(callers[callee]):
            print(f"\t{caller}")
        print()

A plan:
- Replace std::string file paths with a custom class.
- Replace other uses of std::string with arrays or vector or something.
- Replace std::vector with boost or another implementation.
- Implement our own lazy-loading singleton.
- If necessary: replace std::unordered_map with boost or another implementation.
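
As a rough illustration of the "lazy-loading singleton" item, here is a minimal sketch, assuming C++17 and a `T` whose default constructor does not throw. The names `lazy_singleton`, `stream_example`, `stderr_storage`, and `get_stderr_example` are hypothetical and are not taken from quick-lint-js. The idea is to keep the storage in a constant-initialized, trivially destructible global, so the compiler should need neither `__cxa_guard_acquire`/`__cxa_guard_release` (there is no function-local static) nor `__cxa_atexit` (no destructor is registered).

```cpp
#include <atomic>
#include <cassert>
#include <new>

// Hypothetical helper, not part of quick-lint-js. Requires C++17.
// Constructs T on first use; T's destructor is intentionally never run.
template <class T>
class lazy_singleton {
 public:
  constexpr lazy_singleton() noexcept = default;

  // Assumes T's default constructor does not throw.
  T& get() noexcept {
    if (state_.load(std::memory_order_acquire) != ready) {
      int expected = uninitialized;
      if (state_.compare_exchange_strong(expected, constructing)) {
        // This thread won the race: construct T in place.
        ::new (static_cast<void*>(storage_)) T();
        state_.store(ready, std::memory_order_release);
      } else {
        // Another thread is constructing T (or just finished): busy-wait.
        while (state_.load(std::memory_order_acquire) != ready) {
        }
      }
    }
    return *std::launder(reinterpret_cast<T*>(storage_));
  }

 private:
  static constexpr int uninitialized = 0;
  static constexpr int constructing = 1;
  static constexpr int ready = 2;

  std::atomic<int> state_{uninitialized};
  alignas(T) unsigned char storage_[sizeof(T)] = {};
};

// Hypothetical stand-in for something like a stderr output stream.
struct stream_example {
  int fd = 2;
};

// Namespace-scope storage with a constexpr constructor and a trivial
// destructor, so no dynamic initializer or exit handler should be needed.
lazy_singleton<stream_example> stderr_storage;

stream_example& get_stderr_example() noexcept { return stderr_storage.get(); }

// Usage sketch: repeated calls return the same object.
void lazy_singleton_demo() {
  assert(&get_stderr_example() == &get_stderr_example());
  assert(get_stderr_example().fd == 2);
}
```

The busy-wait keeps the sketch short; whether spinning is acceptable, and whether skipping the destructor is acceptable for a given type, would need to be checked case by case.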
For `std::vector`, it may be sufficient just to override the default allocator, since `Allocator::allocate()` is one of the main things that can throw exceptions. That way, you don't have to rip `std::vector` out completely. The same could be helpful for `std::string` and `std::unordered_map` as well.
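
A minimal sketch of that allocator idea, assuming C++17. The names `aborting_allocator` and `no_throw_vector` are hypothetical and do not come from quick-lint-js. The allocator reports failure by calling `std::abort()` instead of throwing `std::bad_alloc`, and uses `std::malloc`/`std::free` rather than `operator new`/`operator delete`. It only addresses the allocation path: per the audit above, libstdc++'s `std::vector` also calls `std::__throw_length_error`, which a custom allocator does not remove.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical allocator, not part of quick-lint-js.
// Aborts on allocation failure instead of throwing std::bad_alloc.
template <class T>
struct aborting_allocator {
  using value_type = T;

  // std::malloc only guarantees alignment up to max_align_t.
  static_assert(alignof(T) <= alignof(std::max_align_t),
                "over-aligned types are not supported by this sketch");

  aborting_allocator() noexcept = default;
  template <class U>
  aborting_allocator(const aborting_allocator<U>&) noexcept {}

  T* allocate(std::size_t n) noexcept {
    // Guard against overflow in n * sizeof(T).
    if (n > static_cast<std::size_t>(-1) / sizeof(T)) std::abort();
    // Request at least one byte so a null result always means failure.
    std::size_t bytes = n == 0 ? 1 : n * sizeof(T);
    void* p = std::malloc(bytes);
    if (p == nullptr) std::abort();
    return static_cast<T*>(p);
  }

  void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// All instances are interchangeable because the allocator is stateless.
template <class T, class U>
bool operator==(const aborting_allocator<T>&,
                const aborting_allocator<U>&) noexcept {
  return true;
}
template <class T, class U>
bool operator!=(const aborting_allocator<T>&,
                const aborting_allocator<U>&) noexcept {
  return false;
}

// std::vector kept as-is, with only the allocator swapped.
template <class T>
using no_throw_vector = std::vector<T, aborting_allocator<T>>;

// Usage sketch.
void aborting_allocator_demo() {
  no_throw_vector<int> v;
  for (int i = 0; i < 1000; ++i) v.push_back(i);
  assert(v.size() == 1000);
  assert(v[999] == 999);
}
```

Whether this actually shrinks the binary would have to be confirmed by rerunning the audit on a build that uses it.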
@AbleBacon I recall some range checks too (e.g. for `std::vector<>::at`), not just `std::bad_alloc`.
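
For the range-check case, one possible approach is to avoid `at()` at call sites and use a helper that aborts instead of throwing. This is only a sketch, assuming C++14 or later; `checked_at` is a hypothetical name, not an existing quick-lint-js or standard library function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper, not part of quick-lint-js.
// Bounds-checked element access that aborts instead of throwing
// std::out_of_range, so the call site does not use at().
template <class Container>
decltype(auto) checked_at(Container& c, std::size_t index) noexcept {
  if (index >= c.size()) std::abort();
  return c[index];
}

// Usage sketch.
void checked_at_demo() {
  std::vector<int> v = {1, 2, 3};
  assert(checked_at(v, 1) == 2);
  checked_at(v, 0) = 10;  // Returns a reference, so assignment works.
  assert(v[0] == 10);
}
```

This only helps for checks made explicitly by our own code; checks inside the standard library's own member functions are not affected.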