volatility3

pdbconv improvements

Open paulkermann opened this issue 3 years ago • 5 comments

I only saw this after I finished this PR, but either way, I added support for parsing some C++ types and fixed a small bug that could cause fields in a struct to be skipped.

I am attaching the source PDB test and the resulting ISF so you can try it yourself: input.zip

Looking forward to hearing your feedback.

Feb 16 '22 11:02 paulkermann

Does any more work need to be done?

Feb 17 '22 09:02 paulkermann

@ikelos any progress on this?

Mar 24 '22 07:03 paulkermann

From what I understood, the main conflict in this PR is the handling of C++ symbols, i.e. what should we do when there are overloaded functions (global variables can't really be overloaded without a different namespace?). If we want to do 'real' demangling, then this code is a good reference: https://github.com/moyix/pdbparse/blob/b5b61793e4a457c43a5aef7a0499b959826b2c04/src/undname.c#L1241

It looks like the WinDbg u command only accepts demangled names. Also, if a symbol (after demangling) appears several times, then the first one is used (the first one found, I think). I think that doing whatever WinDbg does (after first defining what that is) is the right course of action. With the APIs described below we remain versatile with those changes:

  1. get_symbol would by default return the first match it finds (the main use case, and what we have right now).
  2. get_exact_symbol would return a symbol based on the mangled name.

On duplicated symbols we keep the first one found as the "main" one, which would be returned from get_symbol. To get a specific symbol one would need to use get_exact_symbol. It is the developer's responsibility to check whether the symbol they access has several overloads and which symbol getter they need to use.

In this link you can see that the function RtlStringCbCopyUnicodeString is present twice (one C++ definition and one C definition). I suggest that the C++ one (0x1403057B0) be returned from get_symbol because it has the lower address. To get the other one you would need to use get_exact_symbol.
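To make the proposal concrete, here is a minimal Python sketch of the two getters. It is not existing Volatility code: the Symbol and SymbolTable classes, the overloads helper, the example function and its addresses are all hypothetical and only illustrate the intended lookup behaviour.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Symbol:
    """Hypothetical record for one symbol read from a PDB."""
    mangled: str    # exact (decorated) name, unique per symbol
    demangled: str  # undecorated name, shared by all overloads
    address: int


class SymbolTable:
    """Hypothetical table illustrating get_symbol / get_exact_symbol."""

    def __init__(self) -> None:
        self._by_mangled: Dict[str, Symbol] = {}
        self._by_demangled: Dict[str, List[Symbol]] = {}

    def add(self, symbol: Symbol) -> None:
        # Every symbol stays reachable by its exact mangled name.
        self._by_mangled[symbol.mangled] = symbol
        # Overloads are appended in the order they are found.
        self._by_demangled.setdefault(symbol.demangled, []).append(symbol)

    def get_symbol(self, name: str) -> Optional[Symbol]:
        """Return the "main" symbol: the first one found for this name."""
        candidates = self._by_demangled.get(name)
        return candidates[0] if candidates else None

    def get_exact_symbol(self, mangled: str) -> Optional[Symbol]:
        """Return the symbol with exactly this mangled name, if any."""
        return self._by_mangled.get(mangled)

    def overloads(self, name: str) -> List[Symbol]:
        """Let callers check whether a name is ambiguous."""
        return list(self._by_demangled.get(name, []))


# Made-up example: two overloads of example(), int and double versions.
table = SymbolTable()
table.add(Symbol("?example@@YAHH@Z", "example", 0x1000))
table.add(Symbol("?example@@YAHN@Z", "example", 0x2000))

main = table.get_symbol("example")                 # first overload found
exact = table.get_exact_symbol("?example@@YAHN@Z")  # the double overload
```

If the lowest address should win instead of the first one found, get_symbol could return min(candidates, key=lambda s: s.address) without changing the rest of the sketch.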

May 24 '22 10:05 paulkermann

Hey, any update on this? @ikelos

May 18 '23 08:05 paulkermann

I'm really sorry, I'm still mulling over how best to handle referencing and interacting with C++ types, and I haven't been able to spend time just sitting and thinking through the implications of this. Unfortunately this isn't a high priority, so the scarce time I spend on volatility is directed to other issues. Do please leave the issue open, but bear with us as we want to make sure we completely consider the implications and the most maintainable means of supporting new features like this down the road...

May 21 '23 19:05 ikelos