Export format suitable for direct SIEM imports
hi there,
i was quite happy when i stumbled across this project and attempted to use the csv file provided via the gh pages api directly in a siem query (splunk/sentinel) to build a dll (search order) hijacking detection. unfortunately i found that the format as-is isn't really usable: full paths would be required to facilitate matching against log events. this can of course be done in the siem queries, but would ideally already be provided via the api, so that a simple lookup is enough. i must admit that i tried and failed at enhancing api/hijacklibs.csv (so no pr, sorry) and, due to missing experience in that field, can't even tell whether this is easily possible in the first place. i resorted to creating a new repo and using a gh workflow to generate and provision csv files in a suitable format on a schedule (github.com/hRun/HijackLibsExport). it's not perfect and probably never will be, as some variables like %VERSION% can of course not be replaced statically, but it is enough to make implementing the use case in a well-functioning way possible (beating microsoft defender's built-in capabilities :P). i'd be happy if you'd have a shot at checking whether the same functionality/format could be implemented in the gh pages.
cheers, hRun
Hey @hRun, thanks for this feedback - glad to hear this project is of use to you!
The use of placeholders was a design choice, as it:
- avoids redundancy (e.g. `%PROGRAMFILES%` instead of `c:\program files` plus `c:\program files (x86)`);
- makes no assumptions about the installation drive (although very rare, your system32 folder may be located in `d:\windows\system32`; current placeholders would even allow `\\remotehost\c$\windows\system32`);
- makes it easier to exclude 'mirrored folders' (e.g. many files living in `c:\windows\system32` also appear in `c:\windows\winsxs`).
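To illustrate the second point, here is a minimal sketch of how one placeholder entry can be resolved for different Windows directories. The function name `resolve`, the sample entry, and the directory list are assumptions made for illustration only; they are not taken from the project.

```python
def resolve(template: str, windows_dir: str) -> str:
    """Substitute %SYSTEM32% for one concrete Windows directory (hypothetical helper)."""
    # Lowercase everything so the result compares cleanly against normalised log data.
    return template.lower().replace('%system32%', windows_dir.lower() + '\\system32')


entry = '%SYSTEM32%\\example.dll'  # hypothetical entry, not a real HijackLibs record

# The same entry covers a default install, a non-default drive and a UNC path.
for windows_dir in ('c:\\windows', 'd:\\windows', '\\\\remotehost\\c$\\windows'):
    print(resolve(entry, windows_dir))
```

A CSV with hard-coded absolute paths would need one row per variant, or would silently miss the less common ones.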
You can see the placeholder substitutions for Sigma (which requires full, absolute paths) here - you could consider doing a similar replace operation in your SIEM query (e.g. splunk's replace or KQL's replace_string).
Alternatively, you can use a preprocessing step - which is basically what you're doing now.
Here is a script I quickly wrote up; it does a similar thing to your Python script:
```python
import csv

import requests

# Placeholder name -> candidate absolute paths (lowercase); '*' acts as a wildcard.
MAPPING = {
    'VERSION': ['*'],
    'PROGRAMFILES': ['c:\\program files', 'c:\\program files (x86)'],
    'SYSTEM32': ['c:\\windows\\system32', 'c:\\windows\\winsxs', 'c:\\$windows.~bt', 'c:\\windows\\softwaredistribution'],
    'SYSWOW64': ['c:\\windows\\syswow64', 'c:\\windows\\winsxs', 'c:\\$windows.~bt', 'c:\\windows\\softwaredistribution'],
    'WINDIR': ['c:\\windows'],
    'PROGRAMDATA': ['c:\\programdata'],
    'APPDATA': ['c:\\users\\*\\appdata\\roaming'],
    'LOCALAPPDATA': ['c:\\users\\*\\appdata\\local'],
}

def replace_placeholders(input_str: str) -> str:
    # Expand every %PLACEHOLDER% in a comma-separated list of paths.
    result_list = []
    for item in input_str.split(','):
        new_items = [item]
        for name, replacements in MAPPING.items():
            placeholder = f'%{name}%'
            if placeholder in item:
                # One output path per (path so far, replacement) combination.
                new_items = [new_item.replace(placeholder, replacement) for new_item in new_items for replacement in replacements]
        result_list.extend(new_items)
    return ','.join(result_list)

response = requests.get('https://hijacklibs.net/api/hijacklibs.csv')
response.raise_for_status()
rows = list(csv.DictReader(response.text.splitlines()))
with open('output.csv', 'w', newline='') as f:
    w = csv.DictWriter(f, rows[0].keys(), delimiter=',', lineterminator='\n')
    w.writeheader()  # write the column names as the first row
    for entry in rows:
        entry['ExpectedLocations'] = replace_placeholders(entry['ExpectedLocations'])
        entry['VulnerableExecutablePath'] = replace_placeholders(entry['VulnerableExecutablePath'])
        w.writerow(entry)
```
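Once the paths are resolved, checking a logged DLL load against them comes down to a wildcard comparison. Below is a minimal sketch of that check, assuming the resolved `ExpectedLocations` value is a comma-separated string of folder patterns in which `*` is a wildcard. The function name `is_expected_location` and the sample paths are hypothetical and only serve as an illustration.

```python
from fnmatch import fnmatchcase


def is_expected_location(observed_path: str, expected_locations: str) -> bool:
    """Return True if the folder of observed_path matches one of the expected folder patterns."""
    # Compare lowercase strings, since Windows paths are case-insensitive.
    folder = observed_path.lower().rsplit('\\', 1)[0]
    return any(
        fnmatchcase(folder, pattern.lower())
        for pattern in expected_locations.split(',')
    )


# Hypothetical resolved value, as it might look after placeholder expansion.
expected = 'c:\\program files\\vendor\\*,c:\\program files (x86)\\vendor\\*'

print(is_expected_location('C:\\Program Files\\Vendor\\1.2.3\\example.dll', expected))  # True
print(is_expected_location('C:\\Users\\bob\\Downloads\\example.dll', expected))  # False
```

A DLL whose name is in the list but whose folder is not an expected location is the candidate for a hijacking alert. Note that `*` in `fnmatchcase` also matches backslashes, so a wildcard can span several folder levels.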
For now, I'll keep the CSV as it is; however, if there is appetite for a different version of the CSV with absolute/resolved paths, I will consider adding it to the website.
the appetite is certainly there on my/our side. not sure if there is anybody else 🧐 and i can fully understand any decision you make regarding this request.
i certainly wouldn't remove/replace the current csv from the api, as the placeholders are indeed a very good design choice (and probably the only viable one considering user names, drive letters, etc.). if you consider adding a version with resolved paths in the future, i'd only add it as a second option alongside the current one. (thanks for the reference to the sigma substitutions btw, i'll work that into my workflow)
the reason for wanting an already resolved version on github or the api is that it would remove the need for bloated siem queries, macros or local preprocessing. a simple and elegant "| lookup