Improved debugging QOL
Is your feature request related to a problem? Please describe.
Debugging using std::cout and its friends works but is rather limited compared to using more standard debugging tools/features. In particular, I find myself missing two features when working on LDMX-sw (especially when something crashes!): a dedicated debugger and the various sanitizers (in particular address and undefined behavior, see https://github.com/google/sanitizers).
I also believe that supporting these things can lead to a much less unpleasant experience when things go wrong for people who aren't familiar with the framework. Rather than having to know how all the pieces fit together so you know where to place your std::cout statements, running with sanitizers enabled can often directly pinpoint where the problem is located. For example, I had a lovely time trying to figure out the cause of intermittent crashes caused by a hard-coded string literal in the prototype geometry code (I get what I deserve for writing that in the first place!). After working on it for ~2 hours, I went ahead and made a bare-metal setup with sanitizers enabled. UBSan caught the issue immediately and reported where in the code it occurred.
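To illustrate the kind of report meant here, below is a minimal sketch of a bug that UBSan flags. It is a made-up example, not the actual geometry code: the table kLayerZ and the function layerZ are hypothetical names. The build line in the comment uses the standard GCC/Clang -fsanitize flags.

```cpp
// Possible build line (GCC or Clang):
//   g++ -g -fsanitize=address,undefined -fno-omit-frame-pointer example.cpp
#include <cstddef>

// Hypothetical lookup table, NOT taken from ldmx-sw.
constexpr std::size_t kNumLayers = 4;
static const double kLayerZ[kNumLayers] = {0.0, 7.5, 15.0, 22.5};

// Returns the z position of a layer.
// There is no range check, so any layer >= kNumLayers reads past the end of
// the array, which is undefined behavior.
double layerZ(std::size_t layer) {
  return kLayerZ[layer];
}
```

Calling layerZ(1) is fine; calling layerZ(4) is undefined behavior. In a normal build that read may return garbage or crash only some of the time, while a build with -fsanitize=undefined should instead print a runtime error naming the source line and the out-of-range index.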
Describe the solution you'd like
I would like to see two updates to the container environment with corresponding support in the build setup (i.e., the CMake module). The two are technically independent, so if people are fine with GDB but don't want sanitizer support in the container, that is possible.
- Adding support for using a basic debugger. This would require
  - Adding GDB to the container
- Adding support for building LDMX-sw with sanitizers enabled. This would require
  - Adding the runtime libraries required by the sanitizers (note that the compile-time instrumentation features already come with the compiler)
  - Adding corresponding options to the CMake setup so people can use sanitizers without having to fiddle with compilation flags directly
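Because the instrumentation is switched on at compile time, a binary can report for itself whether it was built with AddressSanitizer, which is a cheap way to check that a build option actually took effect. A rough sketch, assuming GCC's __SANITIZE_ADDRESS__ macro or Clang's __has_feature(address_sanitizer); the names EXAMPLE_HAS_ASAN, kAsanEnabled and sanitizerSummary are made up for illustration:

```cpp
#include <string>

// GCC defines __SANITIZE_ADDRESS__ when compiling with -fsanitize=address.
// Clang exposes the same information through __has_feature.
#if defined(__SANITIZE_ADDRESS__)
#define EXAMPLE_HAS_ASAN 1
#elif defined(__has_feature)
#if __has_feature(address_sanitizer)
#define EXAMPLE_HAS_ASAN 1
#endif
#endif
#ifndef EXAMPLE_HAS_ASAN
#define EXAMPLE_HAS_ASAN 0
#endif

// True only if this translation unit was compiled with AddressSanitizer.
constexpr bool kAsanEnabled = (EXAMPLE_HAS_ASAN != 0);

// Hypothetical helper that could be printed once at startup.
std::string sanitizerSummary() {
  return kAsanEnabled ? "AddressSanitizer: on" : "AddressSanitizer: off";
}
```

This only covers ASan; as far as I know GCC has no equivalent predefined macro for UBSan, so the sketch does not try to detect it.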
Describe alternatives you've considered
Continuing to keep a bare-metal version of LDMX-sw lying around, which isn't ideal since
- The benefits of the container workflow are lost
- I need to maintain two separate setups on my machine
- I can't teach people how to do this easily but I can tell someone to enable some CMake flags
Additional context
I've already tried this out, and the required changes for both the container and the CMake modules are straightforward. If we want to do this, I can create issues in the corresponding repositories and describe the (brief) technical details there.
There was some brief discussion of this in the #software-dev channel on the LDMX Slack on December 15th.
God fortsättning as we say in Sweden :)
Realized I didn't really explain what the sanitizers are and the documentation isn't... great if you don't already know what they are doing.
This talk (https://youtu.be/uZI_Qla4pNA?t=2521, the link should bring you to 42:00) gives a demo of some of the tools. I should note that the talk is a couple of years old and, in particular, GCC comes with the same functionality these days.
I think this is a superb idea. I would love to modernize our tool set within the container and I really appreciate the thorough explanation of your use of debuggers/sanitizers in this issue and #1044 (since I have no experience with them).
In #1044 you mentioned that you had a local container with the extra tools you were using. Please upload your Dockerfile to a new issue in LDMX-Software/docker so I can get started integrating them. Or if you are interested, feel free to make your own branch there and try integrating them yourself (I'll still need to review it on a PR, but you should be able to modify branches and have the actions build+push a new image for you to the ldmx DockerHub.)
For the additional CMake parameters, is this something you are familiar with already? I am familiar with our current CMake setup, but I would need to do more googling to set custom compile flags or do other configuration.
I'd love to try creating a branch myself, just to get familiar with the workflow. Regarding the CMake support, I already have this ready as well, so I'll open a branch/issue for it :)
Hey @EinarElen, what's missing in this issue? I'm thinking maybe we could have a just command that does ASAN and UBSAN; I can take care of that. Should we do the same for GDB? For that, can you tell me how you run it? I think if we have those commands in just, this issue is done with all the things you guys did back in 2021, right?
The issue should definitely be done!
I think a just debug recipe would be nice. The most basic version would be to just run gdb fire args. A slightly more involved/sophisticated version would be to first build and install ldmx-sw in debug mode and then run it
@EinarElen I made PR1474. I'm happy with the ASAN/UBSAN part, but I'm unsure how to use the GDB part with fire. Do you have an example you used in the past (before just et al.)?
I would just run
ldmx gdb fire nameofmyconfigfile.py
Type run to start execution, add breakpoints etc
@EinarElen did this work for you in the past? I now see a message about "fire" not existing; gdb seems to want to run on fire alone, as if it assumes fire is an executable. But I also did the DEBUG build, so that should keep the executables around, right? How did you run it in the past?