Speed up EVTX Parsing
Move over to https://github.com/omerbenamram/pyevtx-rs
@yampelo let me know if you need a hand with this :)
@omerbenamram It's mainly a question of whether I change the output of your tool to match what I was working from before, or change all the functions to match the output of your tool. For example:
proc = SysMonProc(
    host=event["Computer"],
    user=event["EventData_User"],
    process_guid=event["EventData_ProcessGuid"],
    process_id=int(event["EventData_ProcessId"]),
    process_image=process_image,
    process_image_path=process_path,
)
proc_file = proc.get_file_node()
proc_file.file_of[proc]
dest_addr = IPAddress(ip_address=event["EventData_DestinationIp"])
proc.connected_to[dest_addr].append(
    timestamp=event["EventData_UtcTime"],
    port=event["EventData_DestinationPort"],
    protocol=event["EventData_Protocol"],
)
if event.get("EventData_DestinationHostname"):
    hostname = Domain(event["EventData_DestinationHostname"])
    hostname.resolves_to[dest_addr].append(timestamp=event["EventData_UtcTime"])
    return (proc, proc_file, dest_addr, hostname)
return (proc, proc_file, dest_addr)
Works off of this: https://github.com/yampelo/beagle/blob/master/beagle/datasources/win_evtx.py#L58
@yampelo The nice thing is that my package already produces valid JSON in Rust, so most of the code that is currently here https://github.com/yampelo/beagle/blob/master/beagle/datasources/win_evtx.py#L78 will go away (replaced with json.loads).
As for these snippets: to be compatible with my output, it's merely a matter of changing event["EventData_UtcTime"] to event["EventData"]["UtcTime"] (which is how the fields are actually represented in the event). You could also flatten the JSON output to match the current code. I think the former option is slightly nicer, but both should do the trick.
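As a rough sketch of the first option, assuming each record arrives as a JSON string shaped like the made-up sample below (the field values and exact nesting are hypothetical and only for illustration), the nested lookup could look something like this:

```python
import json

# Hypothetical sample record; the real output may carry more fields
# or nest them differently.
sample_record = """
{
  "Computer": "DESKTOP-01",
  "EventData": {
    "UtcTime": "2019-01-01 00:00:00.000",
    "DestinationIp": "10.0.0.5",
    "DestinationPort": "443"
  }
}
"""

# The hand-written XML handling is replaced by a single json.loads call.
event = json.loads(sample_record)

# Before: event["EventData_UtcTime"]
utc_time = event["EventData"]["UtcTime"]

# Optional fields can use .get() on the nested dict.
hostname = event["EventData"].get("DestinationHostname")
```

The only change on the consuming side is the extra level of indexing; the `.get()` check for optional fields such as DestinationHostname keeps working the same way.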
You could use a snippet that flattens the data (e.g. https://stackoverflow.com/questions/6027558/flatten-nested-dictionaries-compressing-keys) to make this basically a drop-in replacement.
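For the flattening route, a minimal sketch might look like the following. The `flatten` helper and its `sep` parameter are names made up here, not part of either project, and the sample event is hypothetical:

```python
from typing import Any, Dict


def flatten(nested: Dict[str, Any], parent_key: str = "", sep: str = "_") -> Dict[str, Any]:
    """Collapse nested dicts into one level, joining keys with `sep`."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the joined key as prefix.
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat


# Hypothetical nested event, for illustration only.
nested_event = {
    "Computer": "DESKTOP-01",
    "EventData": {"UtcTime": "2019-01-01 00:00:00.000", "ProcessId": "4"},
}
flat_event = flatten(nested_event)
```

With this, lookups such as flat_event["EventData_UtcTime"] keep the key names the existing functions expect, so those functions would not need to change.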
So it's really up to you :) If I can help in any way, I'd be glad to see this go through. You'd be very surprised by the performance difference if you haven't tried this already.