PE: Add GUID property in lief.PE.CodeView

Open Wenzel opened this issue 5 years ago • 5 comments

Is your feature request related to a problem? Please describe.
The GUID is a combination of several parts of the CodeView signature, plus the age. LIEF could provide a function to compute it.

I would like LIEF to offer a property or method to access the GUID value directly from the CodeView object.

Describe the solution you'd like
A new property or method on the CodeView object that computes and exposes the GUID.

Describe alternatives you've considered
None.

Additional context
You can find an example of printing the GUID here: https://github.com/libvmi/libvmi/blob/master/examples/win-guid.c#L282

Thanks !

Wenzel avatar Oct 29 '20 01:10 Wenzel

Do you have an example of a PE file and the expected output?

romainthomas avatar Nov 04 '20 09:11 romainthomas

Hi @romainthomas ,

you can use any PE that has a code view debug directory: ntoskrnl.exe: https://www.dropbox.com/s/7wvx6vaqqefl2f1/ntoskrnl.exe?dl=0

A dumb script I made with LIEF to print the GUID:

#!/usr/bin/env python3

"""
Usage: print_pe_info.py [options] <pe_path>

Options:
    -h --help                       Display this message
    -d --debug                      Enable debug output
"""


import lief
from docopt import docopt


def main(args):
    pe_path = args['<pe_path>']
    pe = lief.parse(pe_path)
    for debug_dir in pe.debug:
        if debug_dir.has_code_view:
            code_view = debug_dir.code_view
            print(f'Age: {code_view.age}')
            print(f'Signature: {code_view.signature}')

            # The signature holds the 16 GUID bytes. The first three fields
            # (4, 2 and 2 bytes) are byte-reversed before printing; the last
            # 8 bytes are used as-is.
            sig = bytes(code_view.signature)
            part1 = sig[:4][::-1]
            part2 = sig[4:6][::-1]
            part3 = sig[6:8][::-1]
            part4 = sig[8:]

            # Append the low 4 bits of the age to the hex-encoded parts.
            guid = f'{part1.hex()}{part2.hex()}{part3.hex()}{part4.hex()}{code_view.age & 0xf}'
            print(f'GUID: {guid}')
            print(f'Filename: {code_view.filename}')


if __name__ == '__main__':
    main(docopt(__doc__))

This is the result:

Age: 2
Signature: [142, 241, 172, 86, 102, 12, 64, 78, 159, 167, 81, 113, 196, 233, 139, 219]
GUID: 56acf18e0c664e409fa75171c4e98bdb2
Filename: ntkrnlmp.pdb

You can also check volatility3's code: https://github.com/volatilityfoundation/volatility3/blob/master/volatility/framework/symbols/windows/pdb.py#L102

It would be very convenient to have an API exposing the GUID directly
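As a cross-check on the script above, the same byte reordering can be sketched with Python's standard `uuid` module. This assumes the 16 signature bytes follow the usual mixed-endian GUID layout (first three fields little-endian), which matches the output shown above. The helper name `signature_to_guid` is hypothetical and not part of LIEF.

```python
import uuid


def signature_to_guid(signature):
    """Return the 32-hex-digit GUID for a 16-byte CodeView signature.

    Hypothetical helper for illustration; not a LIEF API.
    """
    raw = bytes(signature)
    if len(raw) != 16:
        raise ValueError(f"expected 16 signature bytes, got {len(raw)}")
    # bytes_le interprets the first three fields (4, 2 and 2 bytes) as
    # little-endian and keeps the remaining 8 bytes in order.
    return uuid.UUID(bytes_le=raw).hex


# Signature printed by the script for ntoskrnl.exe
sig = [142, 241, 172, 86, 102, 12, 64, 78, 159, 167, 81, 113, 196, 233, 139, 219]
print(signature_to_guid(sig))  # 56acf18e0c664e409fa75171c4e98bdb
```

The result equals the script's GUID line without the trailing age digit.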

Wenzel avatar Nov 04 '20 21:11 Wenzel

@Wenzel here is the working branch: enhancement/pe-guid-480 and the current implementation:

https://github.com/lief-project/LIEF/blob/ac283c2efa7d838e7f2b62136f54898c72758823/src/PE/CodeViewPDB.cpp#L92-L99

I have some doubts:

  • Is the age part of the GUID? (Volatility does not mention it)
  • Are the chunks zero-padded?

romainthomas avatar Nov 14 '20 08:11 romainthomas

Hey @romainthomas ,

thanks for implementing this.

Is the age part of the GUID? (Volatility does not mention it)

Good question. I now believe the age is separate from the GUID. The "final" GUID string includes the age only because it is used as part of the URL to retrieve the PDB file. So I wouldn't include the age in the GUID field.
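To illustrate why the age ends up appended to the GUID, here is a minimal sketch of building a PDB download URL. The server address and the `<pdb name>/<GUID><age>/<pdb name>` path layout are assumptions based on the public Microsoft symbol server convention, not something stated in this thread, and `pdb_url` is a hypothetical helper name.

```python
# Assumed public symbol server; not taken from this thread.
SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def pdb_url(filename, guid, age, server=SYMBOL_SERVER):
    """Build a download URL from the PDB file name, GUID and age.

    Hypothetical helper: the key is the GUID hex digits followed by the
    age in hexadecimal, without padding.
    """
    key = f"{guid.upper()}{age:X}"
    return f"{server}/{filename}/{key}/{filename}"


# Values from the ntoskrnl.exe output earlier in the thread
print(pdb_url("ntkrnlmp.pdb", "56acf18e0c664e409fa75171c4e98bdb", 2))
```

Keeping `guid` and `age` as separate inputs mirrors the suggestion of exposing them as separate fields and only joining them when a download key is needed.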

Are the chunks zero-padded?

I believe the GUID always has the same size, so you can pad with zeroes where digits are missing.
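A small sketch of what zero-padding means here, assuming the signature is unpacked as a 4-byte, a 2-byte and another 2-byte little-endian integer followed by 8 raw bytes. `format_guid` is a hypothetical name; the signature is the ntoskrnl.exe one posted earlier, whose second field happens to start with a zero digit.

```python
import struct


def format_guid(signature):
    """Format a 16-byte CodeView signature as 32 fixed-width hex digits."""
    # "<IHH8s": little-endian uint32, uint16, uint16, then 8 raw bytes
    data1, data2, data3, data4 = struct.unpack("<IHH8s", bytes(signature))
    # Fixed widths (8, 4, 4) keep leading zeroes, so the length is constant
    return f"{data1:08x}{data2:04x}{data3:04x}{data4.hex()}"


sig = [142, 241, 172, 86, 102, 12, 64, 78, 159, 167, 81, 113, 196, 233, 139, 219]
print(format_guid(sig))  # 56acf18e0c664e409fa75171c4e98bdb

# Without a width, the second field (0x0c66) would lose its leading zero
data2 = struct.unpack("<IHH8s", bytes(sig))[1]
print(f"{data2:x}")  # c66
```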

Regarding your implementation, I get a different GUID from the one my Python script produces:

In [9]: pe = lief.parse('/home/wenzel/local/test_checksec/pe/ntkrnlpa.exe')

In [10]: pe.debug[0].code_view
Out[10]: <lief.PE.CodeViewPDB at 0x7f85be314870>

In [11]: pe.debug[0].code_view.filename
Out[11]: 'ntkrpamp.pdb'

In [12]: pe.debug[0].code_view.guid
Out[12]: '820f24e0a1e97b8f95f975708aa44f41'

In [13]: pe.debug[0].code_view.age
Out[13]: 1

My script output:

Age: 1
Signature: [149, 249, 117, 112, 138, 164, 79, 65, 143, 123, 233, 161, 224, 36, 15, 130]
GUID: 7075f995a48a414f8f7be9a1e0240f821
Filename: ntkrpamp.pdb

With the age removed from my script's output, the two values still differ: 820f24e0a1e97b8f95f975708aa44f41 (LIEF) vs 7075f995a48a414f8f7be9a1e0240f82 (my script).

A good way to test whether the GUID is valid is to use volatility3's pdbconv.py to download the PDB:

$ python3 volatility/framework/symbols/windows/pdbconv.py -o winxp.json -p ntkrpamp.pdb -g 7075f995a48a414f8f7be9a1e0240f821
$ du winxp.json
2,0M	winxp.json

Thanks !

Wenzel avatar Nov 14 '20 12:11 Wenzel

Comparing with the results on VT, it looks like the age is not included as part of the GUID.

ee6ed35568c43fbb5fd510bc863742216bba54146c6ab5f17d9bfd6eacd0f796

Screenshot from 2021-10-17 19-10-02

xorhex avatar Oct 17 '21 17:10 xorhex