DLKit_appImagesContain is slow

Open oryonatan opened this issue 6 months ago • 10 comments

Hi @johnno1962, one of our devs complained that loading the injection bundle on startup freezes the main thread for around 1 second.

I ran this under a profiler, and the issue seems to stem from the call to DLKit_appImagesContain, which spends quite a long time in ImageSymbols::trie_populate()

Most of the time seems to be spent in std::__1::map<void const*, char const*, std::__1::less<void const*>, std::__1::allocator<std::__1::pair<void const* const, char const*>>>::operator[](void const* const&). Maybe this can be solved by replacing the map with std::unordered_map?
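As a rough illustration of the suggested swap, here is a minimal sketch of an address-to-name table built on std::unordered_map. The names AddressToName, buildTable and nameFor are hypothetical and are not DLKit's actual types or functions; the sketch only shows the container change and an up-front reserve.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an address -> symbol-name table.
// These names are illustrative only and do not come from DLKit.
using AddressToName = std::unordered_map<const void *, const char *>;

// Build the table in one pass. Reserving buckets up front avoids repeated
// rehashing while a large number of symbols is inserted.
inline AddressToName buildTable(
    const std::vector<std::pair<const void *, const char *>> &symbols) {
    AddressToName table;
    table.reserve(symbols.size());
    for (const auto &entry : symbols) {
        assert(entry.second != nullptr);  // every symbol must have a name
        // emplace keeps the first name seen for a given address,
        // whereas map::operator[] assignment would keep the last one.
        table.emplace(entry.first, entry.second);
    }
    return table;
}

// Average O(1) lookup; returns nullptr when the address is unknown.
inline const char *nameFor(const AddressToName &table, const void *address) {
    auto it = table.find(address);
    return it == table.end() ? nullptr : it->second;
}
```

One trade-off to keep in mind: an unordered_map has no ordering, so it cannot answer "nearest symbol at or below this address" queries the way std::map::lower_bound or upper_bound can. Whether that matters here depends on how the table is used, so it is worth profiling both variants.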

Other things that take less time (and might be harder to optimize):

170.00 ms  10.6%  exportsTrieTraverse
112.00 ms   7.0%  void std::__1::sort[abi:de180100]<std::__1::__wrap_iter<TrieSymbol*>>(std::__1::__wrap_iter<TrieSymbol*>, std::__1::__wrap_iter<TrieSymbol*>)
 37.00 ms   2.3%  std::__1::map<void const*, char const*, std::__1::less<void const*>, std::__1::allocator<std::__1::pair<void const* const, char const*>>>::~map[abi:de180100]()

see attached screenshot from the profiler

Image

oryonatan avatar Jul 06 '25 12:07 oryonatan

cc @NirAmzaleg

oryonatan avatar Jul 06 '25 12:07 oryonatan

Hi, have you tried an unordered_map and profiled it? You can edit DLKit directly in the repo as it is a submodule.

johnno1962 avatar Jul 06 '25 12:07 johnno1962

P.S. You should be able to set an environment variable in your scheme INJECTION_NOKEYPATHS to avoid this delay.

johnno1962 avatar Jul 06 '25 13:07 johnno1962

I've created a branch DLKit_appImagesContain on InjectionNext if you want to try it out. It contains both your fix and a fix where you don't need to populate all symbols in the "trie" to check for the presence of a symbol. https://github.com/johnno1962/DLKit/commit/e9487abc27a1ce57110fd1c5723c59c64ee9e68f
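To illustrate the general idea of checking for one symbol without populating every symbol first, here is a simplified sketch. It is not the code from the linked commit: TrieNode, insertSymbol and containsSymbol are hypothetical names, and a real Mach-O exports trie is a compact byte encoding rather than this in-memory structure. The point is only that a lookup can follow the single path spelled by the name and stop at the first mismatch.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Simplified in-memory prefix trie (hypothetical, for illustration only).
struct TrieNode {
    bool terminal = false;  // true when a complete symbol ends at this node
    std::map<char, std::unique_ptr<TrieNode>> children;
};

// Insert a symbol name, creating nodes along its path as needed.
inline void insertSymbol(TrieNode &root, const std::string &name) {
    assert(!name.empty());
    TrieNode *node = &root;
    for (char c : name) {
        auto &child = node->children[c];
        if (!child) child = std::make_unique<TrieNode>();
        node = child.get();
    }
    node->terminal = true;
}

// Presence check that visits only the nodes along `name`. The cost is
// proportional to the name's length, not to the total number of symbols,
// and it returns as soon as a character has no matching edge.
inline bool containsSymbol(const TrieNode &root, const std::string &name) {
    const TrieNode *node = &root;
    for (char c : name) {
        auto it = node->children.find(c);
        if (it == node->children.end()) return false;  // early exit
        node = it->second.get();
    }
    return node->terminal;
}
```

Compared with enumerating the whole trie into a map and then searching that map, this avoids both the traversal of unrelated branches and the cost of building the intermediate container.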

johnno1962 avatar Jul 06 '25 14:07 johnno1962

thanks @johnno1962 ! we will check this out

oryonatan avatar Jul 07 '25 07:07 oryonatan

I've prepared a new beta. This was a small change to some code that couldn't handle your scale. You might also be interested in https://github.com/llvm/llvm-project/pull/147134, which likely makes LLVM's "lld" linker faster than the Apple one for large projects. It could do with some testing. You swap the linker in by adding -fuse-ld=/path/to/linker to "Other Linker Flags". You build the new linker by running cmake and then ninja in the build directory:

cmake -S llvm -B build -G Ninja  -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS="clang;lld"

johnno1962 avatar Jul 07 '25 07:07 johnno1962

Hey! I've rebuilt InjectionNext using https://github.com/johnno1962/DLKit/commit/e9487abc27a1ce57110fd1c5723c59c64ee9e68f and it seems like the issue is now fixed 🎉

oryonatan avatar Jul 07 '25 09:07 oryonatan

> I've prepared a new beta. This was a small change to some code that couldn't handle your scale. You might also be interested in llvm/llvm-project#147134, which likely makes LLVM's "lld" linker faster than the Apple one for large projects. It could do with some testing. You swap the linker in by adding -fuse-ld=/path/to/linker to "Other Linker Flags". You build the new linker by running cmake and then ninja in the build directory:
>
> cmake -S llvm -B build -G Ninja  -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS="clang;lld"

Regarding this: do you suggest rebuilding InjectionNext with it, or our large app? I mean, do you think it will make linking our app faster?

oryonatan avatar Jul 07 '25 09:07 oryonatan

It may be worth trying out with the link of your large app as it may improve iteration times. I can provide a binary to get you started if you can work out how to re-codesign it. You use the ld64.lld link which should be compatible with Apple's linker.

https://johnholdsworth.com/lld.tgz

johnno1962 avatar Jul 07 '25 09:07 johnno1962

I've updated the tar file to include a codesigned binary and a homebrew library you may need. I'm particularly interested in whether a) the new linker works with your monster project and b) it is faster than the default linker, which would be rad. P.S. To get the best out of the new linker you need to add -fuse-ld=/path/to/ld64.lld -Xlinker --read-threads=20 to "Other Linker Flags" in the Debug config.

johnno1962 avatar Jul 07 '25 15:07 johnno1962