golang-etw
golang-etw copied to clipboard
Potential performance limitations
Many, many thanks for your work, it is very much appreciated.
I am consuming events from a rather high-throughput provider, Microsoft-Windows-DNSServer, and have noticed some performance issues with that provider and "realtime" mode trace sessions. Once I get above ~8000 ETW events/sec, I begin to see a very significant rise in the number of events lost (viewed through Performance Monitor).
This happens with both golang-etw
-created sessions and with manually created sessions, and both with / without a consumer application.
If I manually create a "Buffered" mode trace session, then I can happily max MS Windows DNS Server out at around 30k DNS requests/sec without losing any events.
Additionally, I have noticed that even with a very simple consumer application (taken from your example, and with the JSON marshalling removed), CPU consumption reaches very high levels (~60-70%) at loads of around 20k ETW events/sec. I'm running this on an AWS EC2 t3.large (2 cores, 8GB), for comparison.
Have you noticed any similar performance, and would you give any consideration to implementing a Buffering mode session trace consumer?
Yes, I did observed performance limitations, and to be honest I thought the limit was much lower :). I am not an ETW expert, but I guess this is due to the way it works. I assume that events get lost whenever the consumer processing is slower than the producer for producing events.
I am maybe wrong but the way I understand "Buffered" trace session from https://learn.microsoft.com/en-us/windows/win32/etw/logging-mode-constants, it does not prevent you losing events. Actually the way I understand it is, you might lose events but you don't know it because events get written in a circular buffer. If the flushing frequency is not high enough events get overwritten in the circular buffer, but I don't think Windows would let you know when that happens. You would have to try it out, like generating a known high numbers of queries (you know will saturate Real Time Session) and count at what you get in the logs at the end. I don't know, maybe that is an experiment you've already run.
So, I would say that the only way to be sure that you can process all events is that you have to make sure that the consumer consumes faster than the producer produces otherwise events gets queued/lost. This conclusion is applicable to any producer/consumer implementation. So I think the only way to fix your issue is to make sure the consumer consumes faster than the producer produces.
All that being said, I don't know what you aim at implementing but this library allows you to filter out events directly in the ETW callback, at different layers of ETW parsing (the earlier you stop parsing the faster you'll get). Another feature is that you can select only some attributes of the events to parse, without the burden of parsing the others. These features (https://pkg.go.dev/github.com/0xrawsec/golang-etw/etw?GOOS=windows#Consumer) allow you to gain a lot in performance. You could for instance implement some caching algorithm not to generate twice an ETW event in the same time window (1 second, 10 second ...). I am pretty sure that you'll be able to find a way to cut down on your logs without losing information. If you could tell me more about your use case I might be able to give you ideas on how to make it work. We can talk in private if you prefer not to post on GH.
Finally, if you don't want to cut down on your logs and you only have to handle the case of bursts of events. You can configure an Autologger (https://learn.microsoft.com/en-us/windows/win32/etw/configuring-and-starting-an-autologger-session), it will dump the logs in a file when the consumer is not fast enough. You can do that with https://pkg.go.dev/github.com/0xrawsec/golang-etw/etw?GOOS=windows#AutoLogger in the library. The only requirements for that is that, you have to set a maximum file size so that you don't saturate your disk if your consumer is always slower that the producer. You can compile https://github.com/0xrawsec/golang-etw/blob/master/tools/etwdump/etwdump.go and use the tool to easily create an autologger session.
Before implementing "Buffered Mode" I need to be sure that it brings additional features to the users.