DLKit_appImagesContain is slow
Hi @johnno1962, one of our devs complained that loading the injection bundle on startup freezes the main thread for around 1 second.
I ran this under a profiler, and the issue seems to stem from the call to DLKit_appImagesContain, which spends quite a long time in ImageSymbols::trie_populate().
Most of that time seems to be spent in
std::__1::map<void const*, char const*, std::__1::less<void const*>, std::__1::allocator<std::__1::pair<void const* const, char const*>>>::operator[](void const* const&)
maybe this can be solved by replacing the map with std::unordered_map?
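To make the suggestion concrete, here is a minimal sketch of an address-to-name index built on std::unordered_map. It is not DLKit's actual code: SymbolEntry, buildAddressIndex and nameForAddress are hypothetical names used only for illustration, and the sketch assumes the map is only used for exact-address lookups.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a symbol record; not DLKit's real type.
struct SymbolEntry {
    const void *address;
    const char *name;
};

using AddressIndex = std::unordered_map<const void *, const char *>;

// Builds an address -> name index with a hash table. Each insertion is
// O(1) on average, versus O(log n) for the tree-based std::map.
AddressIndex buildAddressIndex(const std::vector<SymbolEntry> &symbols) {
    AddressIndex index;
    index.reserve(symbols.size());  // size the table once to avoid rehashing
    for (const SymbolEntry &symbol : symbols)
        index[symbol.address] = symbol.name;
    return index;
}

// Returns the name recorded for exactly this address, or nullptr if absent.
const char *nameForAddress(const AddressIndex &index, const void *address) {
    auto it = index.find(address);
    return it == index.end() ? nullptr : it->second;
}
```

One trade-off to keep in mind: std::unordered_map has no ordering, so anything that depends on sorted iteration or lower_bound-style "nearest symbol" queries would still need std::map or a sorted vector.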
Other things that take less time (and might be harder to optimize):
170.00 ms 10.6% exportsTrieTraverse
112.00 ms 7.0% void std::__1::sort[abi:de180100]<std::__1::__wrap_iter<TrieSymbol*>>(std::__1::__wrap_iter<TrieSymbol*>, std::__1::__wrap_iter<TrieSymbol*>)
37.00 ms 2.3% std::__1::map<void const*, char const*, std::__1::less<void const*>, std::__1::allocator<std::__1::pair<void const* const, char const*>>>::~map[abi:de180100]()
see attached screenshot from the profiler
cc @NirAmzaleg
Hi, have you tried an unordered_map and profiled it? You can edit DLKit directly in the repo as it is a submodule.
P.S. You should be able to set the environment variable INJECTION_NOKEYPATHS in your scheme to avoid this delay.
I've created a branch DLKit_appImagesContain on InjectionNext if you want to try it out. It contains both your fix and a fix where you don't need to populate all symbols in the "trie" to check for the presence of a symbol. https://github.com/johnno1962/DLKit/commit/e9487abc27a1ce57110fd1c5723c59c64ee9e68f
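As a rough illustration of why a presence check does not need every symbol populated first, here is a minimal sketch of a direct lookup in a prefix tree. It is not what the linked commit does: the real Mach-O export trie is a byte-encoded structure, and Trie, TrieNode, trieInsert and trieContains are hypothetical names for a simplified in-memory version.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified trie: one character per edge, nodes stored by
// index in a vector. Node 0 is the root.
struct TrieNode {
    bool terminal = false;                             // a symbol ends here
    std::vector<std::pair<char, std::size_t>> edges;   // label -> child index
};
using Trie = std::vector<TrieNode>;

// Returns the child reached from `node` via `c`, or 0 if there is none
// (the root is never a child, so 0 is free to mean "not found").
static std::size_t findChild(const Trie &trie, std::size_t node, char c) {
    for (const auto &edge : trie[node].edges)
        if (edge.first == c) return edge.second;
    return 0;
}

void trieInsert(Trie &trie, const std::string &symbol) {
    if (trie.empty()) trie.emplace_back();             // create the root
    std::size_t node = 0;
    for (char c : symbol) {
        std::size_t next = findChild(trie, node, c);
        if (next == 0) {
            next = trie.size();
            trie.emplace_back();                       // nodes addressed by index only
            trie[node].edges.emplace_back(c, next);
        }
        node = next;
    }
    trie[node].terminal = true;
}

// Walks only the path spelled by `symbol` and stops at the first missing
// edge, so the cost depends on the symbol length, not the symbol count.
bool trieContains(const Trie &trie, const std::string &symbol) {
    if (trie.empty()) return false;
    std::size_t node = 0;
    for (char c : symbol) {
        node = findChild(trie, node, c);
        if (node == 0) return false;
    }
    return trie[node].terminal;
}
```

With "_main" and "_malloc" inserted, trieContains returns true for both and false for "_ma" (a prefix but not a symbol) and "_free", without enumerating or sorting the full symbol list.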
thanks @johnno1962 ! we will check this out
I've prepared a new beta. This was a small change to some code that couldn't handle your scale. You might also be interested in https://github.com/llvm/llvm-project/pull/147134, which likely makes LLVM's "lld" linker faster than the Apple one for large projects. It could do with some testing. You swap the linker in by adding -fuse-ld=/path/to/linker to "Other Linker Flags". You build the new linker with cmake, then run ninja in the build directory:
cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS="clang;lld"
Hey! I've rebuilt InjectionNext using https://github.com/johnno1962/DLKit/commit/e9487abc27a1ce57110fd1c5723c59c64ee9e68f and it seems like the issue is now fixed 🎉
Regarding the lld linker: do you suggest rebuilding InjectionNext with it, or our large app? I mean, do you think it will make linking our app faster?
It may be worth trying it out when linking your large app, as it may improve iteration times. I can provide a binary to get you started if you can work out how to re-codesign it. You use the ld64.lld link, which should be compatible with Apple's linker.
https://johnholdsworth.com/lld.tgz
I've updated the tar file to include a codesigned binary and a homebrew library you may need. I'm particularly interested in whether a) the new linker works with your monster project and b) it is faster than the default linker, which would be rad.
P.S. To get the best out of the new linker, you need to add -fuse-ld=/path/to/ld64.lld -Xlinker --read-threads=20 to "Other Linker Flags" in the Debug config.