duma
update/missing documentation: brief overview, features and comparison
current README.md does IMHO not give a good brief description of duma itself and what it's good for. something like http://duma.sourceforge.net/ is what i have in mind. i'd suggest inserting that page - especially the bullet points - at the beginning of README.md.
that page links to the pdf document http://duma.sourceforge.net/74_75_76_77_78_Debugging_DE.pdf it's in german .. but i think it's a good idea to add this document to this github repo somehow - probably not into the sources .. but unsure how/where. github also offers a sort of homepage - but i never tried this.
in addition, that more than outdated sourceforge page does not reference this github repo. i could check/try, if i can change this.
IMHO there's too much history - not just in the history section - in the current README.md. most of it could be moved into a separate document, e.g. history.md.
the detailed description of the many environment variables is in the README.txt, which was mostly identical to the man page. i think this could also be converted into a .md and referenced from the primary README.md
some linux distributions like debian/ubuntu (https://packages.ubuntu.com/search?suite=default&section=all&arch=any&lang=en&keywords=duma&searchon=names) come with their own packages of duma. might be worth mentioning - if not already. not sure what code base the maintainers use, but i know that they used to apply a bunch of patches!
this repo also comes with the subfolder https://github.com/johnsonjh/duma/tree/master/comparisons where https://github.com/johnsonjh/duma/blob/master/comparisons/README.txt refers to a dead page. the following link to the wayback machine could be used instead: https://web.archive.org/web/20090204104251/http://www.cs.utexas.edu/users/jpmartin/memCheckers.html
anyway, the comparison is more than outdated. the progress on linux and gcc in the last decade(s) IMHO makes it necessary to give a simple comparison of duma against at least
- gcc address sanitizer: see https://en.wikipedia.org/wiki/AddressSanitizer and https://github.com/google/sanitizers/wiki/AddressSanitizer
- valgrind with memcheck: see https://valgrind.org/docs/manual/mc-manual.html
- anything free and useful for Windows?
@hayguen
Thanks for the nudge.
Indeed - I agree fully. Also, this is partly exacerbated by the fact that I only rarely look at the files via the web or with a browser. It's also my first instinct to look at a text file in a text editor rather than Markdown (or HTML). The reality, however, is that the primary way the project would be found by most people searching for it is with a web browser, and the Markdown is the first thing someone would see.
If you didn't know what DUMA was to begin with, it is especially unhelpful in how little information it conveys.
I almost just replaced the README with the original text file, but, this needs a bit more consideration ... regardless, the current Markdown file does seem more like a blog entry than something that one should read before using.
I'd go as far as saying even as history it's not of any actual use to users.
It's important (in my opinion) to preserve the history and not just remove or modernize for the sake of removing or modernizing, but also we must not give the impression of a "historical" project to someone who lands here with little context.
I'll take a stab at a "fix" for this over this weekend.
(Further specifics to follow)
i managed to update the old sourceforge page http://duma.sourceforge.net/:
- mentioned this github repo https://github.com/johnsonjh/duma/ for updates
- now redirects to https://duma.sourceforge.io/
the project site https://sourceforge.net/projects/duma/ now also refers to this github repo
(overly verbose thinking outloud ...)
Releases / Packaging / Distros
- I think a priority should be a release procedure that is repeatable and verifiable, with some automated testing as part of the repo itself. I prefer GitLab CI/CD but GitHub now offers self-hosted runners as well. I PGP sign all commits already - I should sign releases rather than just allow GitHub or GitLab to create them based on tags.
- Allowing the various code hosting platforms to roll releases is convenient but problematic. It's not a deterministic process so hashes are going to be different depending on the host/forge. That's not friendly for package management (or users!) which (IMO, rightfully) expect a specific release to have a specific filename and hash regardless of where it came from. It also shouldn't be necessary to use Git to verify the commits.
- There have been issues with Sourceforge inserting unwanted software in releases in the past, so relying on the forge to package isn't a good practice in general.
- Once this is done, it's generally a simple matter to contact the package maintainers and notify them of the new upstream. This should be the ultimate goal for this bullet-point. If a packager needs to make local patches we should see why and try to elide the need for it here, if appropriate.
- I think it makes sense to include tested RPM specfiles and Debian control files in the repository. Probably Homebrew as well.
Windows
- To address Windows ... unfortunately, I have little to no experience with Windows at all. I do have a Windows virtual machine for development purposes and for running tests, but I've installed Vim and Git and Git-Bash and Mingw.
- I must give the disclaimer up front that I'm of that generation where my early school days were all Apple products, and later it was essentially 100% Sun in compsci and VMS everywhere else. My knowledge of Windows starts with knowing that it exists and ends with knowing how to start (and use) Microsoft Excel. I've probably used Windows outside of that limited context for less than 2 hours over the course of my lifetime. I'd need to do some research to comment on the Windows situation and I'd still be afraid I'd be woefully uninformed.
- With that said, I see things have changed that I can easily address. For example, Detours is now available under an MIT license, making the README in the detours directory factually incorrect. I'll start by updating the documentation to reflect reality, fix outdated links, etc. Doing serious testing and updating to the current Windows reality is probably beyond my abilities if Windows-specific knowledge is required.
Documentation (not just the README)
- Actually "build" documentation. It would be a big task, but building developer-focused documentation from the sources can be a goal and ensures documentation won't be neglected.
- Doxygen?
- Sphinx?
- Have a master document format from which other documentation is generated.
- Use Pandoc or similar to generate Markdown, man pages, HTML, etc.
- Build a website / landing page as part of this process. Leverage GitLab Pages, GitHub Page Hosting. I like and know Hugo.
- Simple one-pager based on template engine and then have @hayguen point the sourceforge.io page.
- Makes the project discoverable by "normal" people.
Comparison / Features table / overview
- Absolutely needed.
- A concern is that it can become quickly outdated. Valgrind and friends I use regularly but I'd need to see what the others offer these days and investigate some way to stay up to date.
- I want to avoid a comparison table looking like a competition. No check-the-box contests. I really don't want to start radically changing DUMA - I'm not against new features but I see it as just one tool of many in my toolbox, so the table should highlight what DUMA really does well.
here my 2 cents:
release/packaging/distros: unsure about signing, but including rpm spec files and debian control files sounds good.
automatic build/test, e.g. github actions, is surely a good idea.
windows: that was the initial reason why i needed to fork/port efence to windows. later on, several (environment) options and (meanwhile outdated?) c++ support were added. just FYI
comparison/overview: duma surely isn't the one and only best tool for memory debugging. there are very few scenarios (aspects) where duma is superior to valgrind (speed) and probably also superior to address sanitizer (also faster - without any checks at runtime; but slower when it comes to allocation). i think it's important to point this out .. and also give the hint to use other tools, or ideally refer to some external comparisons.
by the way: the README.md refers to me as the "original author". i would suggest removing that. with "original author" .. i'm thinking of Bruce Perens .. however, simply refer to the persons (if necessary) without attributing them?
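The "no checks at runtime, but slower allocation" trade-off can be illustrated with a minimal guard-page sketch in C. This is an illustration of the general Electric Fence-style technique under stated assumptions (POSIX `mmap`/`mprotect`, no alignment handling, no `free`), not DUMA's actual implementation; `guarded_alloc`, `overrun_faults` and `guard_demo` are hypothetical names made up for this example.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not DUMA's API): returns `size` bytes whose last byte
 * sits directly in front of an inaccessible guard page.
 * Cost: one mmap + one mprotect and at least two pages per allocation. */
static void *guarded_alloc(size_t size)
{
    long page_l = sysconf(_SC_PAGESIZE);
    if (page_l <= 0)
        return NULL;
    size_t page = (size_t)page_l;

    /* Round the data area up to whole pages (at least one page). */
    size_t data_len = ((size + page - 1) / page) * page;
    if (data_len == 0)
        data_len = page;

    unsigned char *base = mmap(NULL, data_len + page, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make the page after the data area inaccessible. */
    if (mprotect(base + data_len, page, PROT_NONE) != 0) {
        munmap(base, data_len + page);
        return NULL;
    }
    /* Place the block so that its end touches the guard page. */
    return base + data_len - size;
}

/* Returns 1 if writing one byte past the block kills a child process with
 * SIGSEGV or SIGBUS, 0 if the write went unnoticed, -1 on setup errors. */
static int overrun_faults(size_t size)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile unsigned char *p = guarded_alloc(size);
        if (p == NULL)
            _exit(2);
        p[size] = 1; /* first byte of the guard page: the MMU traps this */
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) &&
           (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS);
}

/* Usage: in-bounds accesses run at full speed, the overrun faults. */
static void guard_demo(void)
{
    unsigned char *p = guarded_alloc(24);
    assert(p != NULL);
    p[23] = 7; /* last valid byte, no instrumentation involved */
    assert(overrun_faults(24) == 1);
}
```

In this sketch ordinary loads and stores are not instrumented at all, which is where the speed comes from; the price is paid per allocation in system calls and whole pages.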
On the last point, noted. I do think of you as the author of Duma and Bruce Perens as the author of efence, and of them being different things at this point :) that whole file is probably going to just be replaced by this weekend, but the "original" distinction isn't helpful.
I seem to remember first finding DUMA because of problems using efence at the time on either HP or Solaris... and being really impressed by the configurability.
I'll figure out how to do tests and make sure visual studio can compile without errors. I might need help if anything is Windows-specific that's needing to be updated, but I'm confident I'll be able to figure it out! (famous last words)
Also, there are some cases where valgrind simply isn't practical to use or doesn't make things easy - it can interfere with CPU detection and cause crashes with common CPU detection routines; "-sigill-diagnostics" and other options that exist now were not there until "recently". Not to mention being able to easily deploy a debug build with Duma statically linked ... and what I find to be an easier overall integration into existing build systems.
kduma is surely outdated and probably should be at least documented as to the current state of things.
I should also reorganize the source tree, similar to what you've done.
Just an update - I should be pushing out the first overhaul of new documentation built from the original README later tonight
FYI:
- i renamed and archived my previous migration from sourceforge to https://github.com/hayguen/duma_sf
- https://github.com/hayguen/duma is now just a fork of this repo
Great! I'm just getting in so should be getting to work shortly :)
is this work in progress? current README.md doesn't say a word about many things .. among them, what duma is good for .. or which document should be read.
by the way: why are there so many source repos?
Still a work in progress, but committed the first run of the mostly automated conversion - I'll try to get the comparison complete soon.
"Lost" a few days of productivity but not entirely wasted.. upgraded the main development system to Fedora 34, migrating some CentOS Stream VMs to AlmaLinux, setting up a better Windows environment with MSVC and playing a bit with the current Windows Detours.
> by the way: why are there so many source repos?
automated mirroring process - I've made that clearer in the newly committed markdownified documentation.
Still getting acquainted with the latest pandoc - lots of promising changes there in the current version, addressing some issues I had before.
Apologies for the huge flurry of bot-battling and automation commits - everything is now merged to master, but with some slight verbiage changes for consistency
yes, really a huge amount of mails ;-) don't mind. delete them all :-)
i'd suggest a 'table of contents', linking to the main sections.
besides the comparison, i think a (sub)section like "errors - not catchable by duma" would be interesting:
- buffer overruns on stack
- read (heap) memory before write
- memory accesses that are off by so much that they hit some other valid memory region
- errors in subprocesses
- errors in any non C/C++ programs: programs (scripts) run inside an interpreter
- surely more!
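To make the "read before write" item concrete, here is a small C sketch. It is an illustration written for this list, not code from the repo, and `read_before_write_is_silent` and `silent_read_demo` are hypothetical names. The point it shows: the address being read is valid, so no page fault occurs and a tool that relies purely on page protection has nothing to trap. Stack overruns are similar, because the stack is not under the allocator's control at all.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a child process can read freshly allocated, never written
 * heap memory and still exit normally; -1 on setup errors. */
static int read_before_write_is_silent(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile unsigned char *p = malloc(64);
        if (p == NULL)
            _exit(2);
        unsigned char stale = p[0]; /* logic error: value was never written */
        (void)stale;
        free((void *)p);
        _exit(0); /* reached: the read touched a valid, mapped address */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/* Usage: the bug above goes unnoticed by the hardware. */
static void silent_read_demo(void)
{
    assert(read_before_write_is_silent() == 1);
}
```

Finding this class of error needs a tool that tracks whether memory has been initialized, which is one reason to point readers at the other tools in the comparison.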
> i'd suggest a 'table of contents', linking to the main sections.
I'll generate this.
> besides the comparison, i think a (sub)section like "errors - not catchable by duma" would be interesting
could be a longer list than catchable! :)
It is a good idea, to avoid preventable confusion or misconceptions
> yes, really a huge amount of mails ;-) don't mind. delete them all :-)
It's a bit ridiculous but it keeps me consistent, at least as far as styling and updating documentation goes.
current README definitely is on a VERY GOOD way!
in section "Post-mortem (core analysis)" you write "Compile your program (with debugging information), but without DUMA.": i think "but without DUMA" looks wrong and superfluous
Thanks - I got distracted with the Indy 500 yesterday and forgot about the holiday today so got less done than I wanted yesterday and today will be a bit busy as well but I still hope to finish the comparison section tonight (or tomorrow.)
> in section "Post-mortem (core analysis)" you write "Compile your program (with debugging information), but without DUMA.": i think "but without DUMA" looks wrong and superfluous
noted - I'll fix that section up with the next round of commits.
Looks like I should be able to finish this up tomorrow, finally, after 2 weeks.
I have some construction/renovations going on here which is making things inconvenient, unfortunately.
Everything looks decent except merging comparison, then will close