ImHex
[Hub] Artifact and cache size
This issue is here to gather all information about reducing the size of caches and artifacts.
LTO seems to influence the ccache cache size a lot: the ArchLinux ccache cache is 150MB with LTO, and 24MB without. See https://gist.github.com/iTrooz/740f00f0935e365534f5a76dab0e7738 to measure section sizes for ELF files.
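For a quick look at which sections take up the space, here is a minimal sketch of reading section sizes straight from an ELF file. It is not the script from the gist above; the function name `elf64_section_sizes` is made up for illustration, and it assumes a little-endian ELF64 file with fewer than 65280 sections.

```python
import struct

ELF64_HEADER = "<16sHHIQQQIHHHHHH"   # 64-byte ELF64 file header
ELF64_SECTION = "<IIQQQQIIQQ"        # 64-byte ELF64 section header
SHT_NOBITS = 8                       # e.g. .bss: has a size but occupies no file space


def elf64_section_sizes(data: bytes) -> dict[str, int]:
    """Return {section name: size in the file} for a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")

    # Only the section header table location and the name table index are needed.
    (_, _, _, _, _, _, shoff, _, _, _, _,
     shentsize, shnum, shstrndx) = struct.unpack_from(ELF64_HEADER, data, 0)

    headers = [
        struct.unpack_from(ELF64_SECTION, data, shoff + i * shentsize)
        for i in range(shnum)
    ]

    # The section at index shstrndx holds the NUL-terminated section names.
    _, _, _, _, str_off, str_size, *_ = headers[shstrndx]
    strtab = data[str_off:str_off + str_size]

    sizes = {}
    for name_off, sh_type, _, _, _, size, *_ in headers:
        end = strtab.index(b"\0", name_off)
        name = strtab[name_off:end].decode()
        sizes[name] = 0 if sh_type == SHT_NOBITS else size
    return sizes
```

Sorting the returned dictionary by value shows whether `.debug_*` sections dominate a given binary.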
Building in Release mode instead of RelWithDebInfo helps a lot with artifact size. For example, the Ubuntu 22.04 DEB went from 132MB to 16.2MB, and the Windows installer went from 217MB to 24.2MB. More information: https://github.com/iTrooz/ImHex/actions/runs/9231528139 and https://github.com/iTrooz/ImHex/actions/runs/9231536431
Using -gz=zlib (or falling back on -gz) doesn't seem to improve cache sizes (checked on Ubuntu 22.04 and ArchLinux builds).
Artifact sizes do not improve either. In fact, the AppImage seems to have gone from 141MB to 162MB. Windows and macOS do not support this option.
Note that the ELF files actually produced are drastically smaller (e.g. 140.6MiB to 56.4MiB for libimhex on Ubuntu 22.04). The reason we do not observe changes in artifacts is that package formats (e.g. .deb, .rpm, .tar.zst) are already compressed.
NOTE: This means that this optimisation would still be useful once the package is installed.
More information: https://github.com/iTrooz/ImHex/actions/runs/9231528139 https://github.com/iTrooz/ImHex/actions/runs/9235290159
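The effect described above can be reproduced on synthetic data. The sketch below is only an illustration: the data is made-up text that vaguely resembles debug info, `zlib` stands in for `-gz=zlib`, and `lzma` stands in for whatever compressor a package format applies. Real packages and real debug sections will give different numbers.

```python
import lzma
import random
import zlib


def fake_debug_info(lines: int = 20000, seed: int = 0) -> bytes:
    """Generate repetitive, debug-info-like text (purely synthetic)."""
    rng = random.Random(seed)
    rows = [
        f"src/module_{rng.randrange(50)}.cpp:{rng.randrange(5000)}: "
        f"function_{rng.randrange(300)}\n"
        for _ in range(lines)
    ]
    return "".join(rows).encode()


raw = fake_debug_info()

# Step 1: compress the "debug sections" on their own, as -gz=zlib would.
pre_compressed = zlib.compress(raw, 6)

# Step 2: the package compressor runs on whatever it is given.
package_from_raw = lzma.compress(raw)
package_from_pre = lzma.compress(pre_compressed)

print(f"raw sections:             {len(raw):>8} bytes")
print(f"zlib-compressed sections: {len(pre_compressed):>8} bytes")
print(f"package (raw input):      {len(package_from_raw):>8} bytes")
print(f"package (zlib input):     {len(package_from_pre):>8} bytes")
```

The pre-compressed input leaves the outer compressor almost nothing to work with, so the package does not shrink and can even end up larger than when it is given the raw data, while the file on disk after installation is still smaller.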
A complex but definitive solution to artifact size would be to store the debug info of release versions ourselves instead of bundling it in artifacts, and make ImHex upload stacktraces with code offsets to our server, where we could map them to files/lines again.
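To make the server-side half of that idea concrete, here is a minimal sketch of mapping uploaded (module, offset) pairs back to symbols. Everything in it is hypothetical: the `Symbol` and `SymbolTable` types, the trace format and the sample data are assumptions for illustration, not an existing ImHex API. It only resolves to function granularity; getting file/line pairs would need the real line tables from the debug info (for example via a tool such as `addr2line` or `llvm-symbolizer`).

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    start: int    # offset of the function's first byte within the module
    size: int     # function size in bytes
    name: str
    source: str   # source file the function comes from


class SymbolTable:
    """Per-module symbol list, kept sorted so lookups are a binary search."""

    def __init__(self, symbols):
        self._symbols = sorted(symbols, key=lambda s: s.start)
        self._starts = [s.start for s in self._symbols]

    def lookup(self, offset: int):
        # Find the last symbol that starts at or before the offset.
        i = bisect.bisect_right(self._starts, offset) - 1
        if i < 0:
            return None
        sym = self._symbols[i]
        # Reject offsets that fall in a gap after the end of that symbol.
        return sym if offset < sym.start + sym.size else None


def symbolize(trace, tables):
    """Turn [(module, offset), ...] into human-readable frames."""
    frames = []
    for module, offset in trace:
        table = tables.get(module)
        sym = table.lookup(offset) if table else None
        if sym is None:
            frames.append(f"{module}+0x{offset:x} (unresolved)")
        else:
            frames.append(
                f"{module}+0x{offset:x} {sym.name}+0x{offset - sym.start:x} ({sym.source})"
            )
    return frames
```

The client would only need to send module names (ideally with a build ID, so the server picks the matching debug info) and offsets relative to each module's load address, which keeps the shipped binaries free of debug sections.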
Some software provides separate pdb file downloads for debugging. Is this approach possible for ImHex?
Probably, but your approach is missing some details. Who would download and use these separate debugging files?
I offer an answer to this in my last comment
Who would download and use these separate debugging files?
AFAIK, WinDbg is the "who": it keeps downloading symbol files automatically until the disk is full.
If you have a source please share it, but I'm doubtful it would do that, because it's not its purpose. WinDbg is a debugger: why would it even be installed on a user machine, and why would it manage storage?
I think Crystal-Rain Slide means that debuggers can have symbol servers defined, and when you try to debug code they download pdbs for libraries and other things you may need. Those are Microsoft servers, but you can use any server, such as a folder or an HTTP address. I think the pdbs are needed by the stack tracer implementation used, so the debuggers' servers are not something that can be used here.
Ohh, I never knew symbol servers existed. That could be a way to solve the problem indeed. But I don't plan to do it right now. If someone wants to build a PoC, please do so. I'm imagining something like a function in ImHex that calls the symbol server when crashing, or an implementation in our API web server that runs when it receives raw "stacktraces" without symbols from ImHex instances.
Some links that seem useful: https://stackoverflow.com/a/35556262 https://docs.sentry.io/platforms/apple/data-management/debug-files/symbol-servers/ https://wiki.archlinux.org/title/Debuginfod (used by ArchLinux for downloading debug info for the libraries in pacman)
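As a rough idea of what "calling a symbol server" looks like with debuginfod, the sketch below builds the request URL for a given build ID. The `/buildid/<id>/debuginfo` and `/buildid/<id>/executable` paths follow the debuginfod HTTP API as I understand it; the helper name `debuginfod_url` is made up, and the default server is the one described on the Arch wiki page linked above, so treat both as assumptions to verify before relying on them.

```python
ARCH_DEBUGINFOD = "https://debuginfod.archlinux.org"  # assumed from the Arch wiki


def debuginfod_url(build_id: str, kind: str = "debuginfo",
                   server: str = ARCH_DEBUGINFOD) -> str:
    """Build the URL a debuginfod client would request for a build ID.

    build_id is the hex GNU build ID of the binary (as printed by
    `readelf -n`); kind is "debuginfo" or "executable".
    """
    build_id = build_id.lower()
    if not build_id or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError("build ID must be a non-empty hex string")
    if kind not in ("debuginfo", "executable"):
        raise ValueError("this sketch only handles debuginfo and executable")
    return f"{server.rstrip('/')}/buildid/{build_id}/{kind}"


# Hypothetical build ID, for illustration only.
print(debuginfod_url("ABCDEF0123"))
```

A self-hosted equivalent would only need to serve files under the same path layout, keyed by the build IDs of released ImHex binaries.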
I think the pdbs are needed by the stack tracer implementation used, so the debuggers' servers are not something that can be used here.
I'm sure this can be worked around
I don't know much about the process of creating useful stack traces, but if symbol servers can be used for them then I suppose it would be the natural choice. Symbol servers are not exclusive to Windows: gdb also supports them, and there may be Linux servers that can be used as well. I'm not 100% sure, but I think it is likely.