gnucobol icon indicating copy to clipboard operation
gnucobol copied to clipboard

Add a feature to build a binary debug database at compile time

Open ddeclerck opened this issue 1 year ago • 8 comments

Disclaimer: this is still a work in progress

This PR adds the ability to collect and dump various information useful for debugging, in particular:

  • a mapping from COBOL locations to C locations
  • information about the different storage of each embedded program (name of COBOL field and attributes, corresponding C variable and offset, etc...)

This information can then used either by a tool that drives gdb or by a gdb extension to improve the debugging experience. We plan to use it to improve debugging in our VSCode extension.

ddeclerck avatar Jan 10 '24 17:01 ddeclerck

Codecov Report

Attention: 174 lines in your changes are missing coverage. Please review.

Comparison is base (c0d64ad) 65.74% compared to head (b7ff8a3) 65.55%.

Files Patch % Lines
cobc/debuggen.c 0.00% 163 Missing :warning:
cobc/codegen.c 0.00% 4 Missing and 4 partials :warning:
cobc/cobc.c 0.00% 2 Missing and 1 partial :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@                  Coverage Diff                  @@
##           gcos4gnucobol-3.x     #130      +/-   ##
=====================================================
- Coverage              65.74%   65.55%   -0.20%     
=====================================================
  Files                     32       33       +1     
  Lines                  59092    59267     +175     
  Branches               15575    15599      +24     
=====================================================
+ Hits                   38849    38851       +2     
- Misses                 14262    14430     +168     
- Partials                5981     5986       +5     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Jan 10 '24 17:01 codecov-commenter

FAT WARNING: we do have COBC_GEN_DUMP_COMMENTS which write that data in GC 3.1+ (used by cobcd) and a binary debug tree that can be generated with GC4 (attached to the module in question). There has to be a very compelling reason to add something else, instead I'd highly suggest to inspect both other approaches.

GitMensch avatar Jan 10 '24 18:01 GitMensch

a binary debug tree that can be generated with GC4

Are you referring to the cb_tree_print function in cobc/debug.c ?

ddeclerck avatar Jan 11 '24 08:01 ddeclerck

I'd have to check what this is about (and most important where it comes from), I'm referencing the generated cob_symbol structure https://github.com/OCamlPro/gnucobol/blob/dc0a51e58eaabb72e3d6f3f7931329c9e1469c98/libcob/common.h#L1204-L1230 which also could be extended as needed.

The idea of that was that you don't need any magic C translation at all, because each cob_module with COBOL debug symbols has an entry for each variable and you can directly use those instead. This also works fine in a core-dump which preserves the data (and if the data is not preserved, then you don't need variable symbols either).

And additional to a possible debugger this is also be used for "runtime based" dumping in case of errors, which removes a huge portion of generated code from the C modules. The only "con" is that you cannot have the debug symbols in an external file so only have it loaded when needed (if one separates this then a magic number needs to be inserted into the module, allowing a debugger to check if the symbol file matches the compile and at least output a warning / do additional checks before accessing the data).

GitMensch avatar Jan 11 '24 08:01 GitMensch

That could fit our needs.

While I prefer having the debug info in a separate file (similarly to MSVC's .pdb files), I can accomodate to embedded debug info (after all, that's how gcc behaves by default).

However (correct me if I'm wrong), the current GC4 debug info feature does not produce a map of C/COBOL locations, right ? That could be a valuable addition : I find #line directives slightly unpractical (although it is their purpose...), plus, you don't always have the generated C file at hand.

Also, is there any plan to backport this feature in the 3.x branch ?

ddeclerck avatar Jan 11 '24 09:01 ddeclerck

the current GC4 debug info feature does not produce a map of C/COBOL locations, right

it doesn't need, because the COBOL line numbers are visible to GCC already (currently in GC4 when compiling with -G (uppercase) you likely have COBOL only in - but this may get removed). because of the #line directives you don't need the C sources.

GitMensch avatar Jan 11 '24 10:01 GitMensch

it doesn't need, because the COBOL line numbers are visible to GCC already (currently in GC4 when compiling with -G (uppercase) you likely have COBOL only in - but this may get removed). because of the #line directives you don't need the C sources.

Now that's awkward. Last time I checked (quite a long time ago, on my previous computer, with an outdated Linux distro), I could't get gdb to place a breakpoint in a COBOL file (eg b file.cob:42) despite the #line directives, and so I assumed that was the normal behavior (not like it's very documented). But I tried again right now and it "just works". Good.

Now, we'd just need that on the 3.x branch.

ddeclerck avatar Jan 11 '24 10:01 ddeclerck

If you really want that badly, then I'm ok with a merge request.

To do so you'd first check which commits adjusted the code related to symbols (via git blame if the import is good, otherwise via svn blame), then check those revisions for unrelated changes, if there aren't any you can locally test that with an svn merge, this may result inn 98% working code. Otherwise you'd need to note down unrelated changes, then do an svn diff and revert those unrelated parts, still we'd like to know the svn revision numbers.

Important: the removal of COB_GEN_DUMP_COMMENTS code is not possible to not break cobcd; only the "real dump" code would not be used (= the old code only generated as a comment, the new code additional to generate the new structure); the old dump code would have to stay in libcob (for modules generated with GC 3+) and then new code for dumping be included additional.

Depending on how familiar you are with the process I'd guess this to be a work of 1-4 working days; but if done that way, it would be fine to be integrated into GC 3.x.

GitMensch avatar Jan 11 '24 11:01 GitMensch