[Winlogbeat] Microsoft-Windows-Windows Defender/Operational - The specified channel could not be found.

Open nicpenning opened this issue 2 years ago • 9 comments

From what I have observed, it seems that Winlogbeat is having intermittent issues trying to read the "Microsoft-Windows-Windows Defender/Operational" channel. I don't think this is a fault of Winlogbeat but a bug in Windows. However, there may be opportunity for Winlogbeat to gracefully recover and ingest events more robustly.

The first screenshot is when things are working normally and Winlogbeat can read events.

image

The second screenshot was taken after a new event was created: every event now shows the errors, and this is when Winlogbeat can't read them.

image

Without turning on debugging, the only log output I see from Winlogbeat is this:

2022-02-03T09:10:20.255-0600 WARN eventlog/wineventlog.go:316 WinEventLog[Microsoft-Windows-Windows Defender/Operational] EventHandles returned error The specified channel could not be found.
2022-02-03T09:10:20.259-0600 WARN [winlogbeat] beater/eventlogger.go:167 Read() error. {"id": "Microsoft-Windows-Windows Defender/Operational", "error": "The specified channel could not be found."}

What is strange is that when I close Event Viewer and reopen it, the error message goes away. However, Winlogbeat can't read from this channel again until I stop and restart the service. So whatever causes the channel to produce that error, it is as if Winlogbeat doesn't try to hook into reading the events again.

I am not sure whether this is something that can be resolved in Winlogbeat. Is it possible for Winlogbeat to have better error handling for event channels when they have these types of issues?

The errors shown in Event Viewer on all events are:

The description for Event ID 5007 from source Microsoft-Windows-Windows Defender cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer. The publisher has been disabled and its resource is not available. This usually occurs when the publisher is in the process of being uninstalled or upgraded

For confirmed bugs, please report:

  • Version: 7.16.2 - The entire stack, including Winlogbeat
  • Operating System: Windows 10 20H2 19042.1466
  • Discuss Forum URL: https://elasticstack.slack.com/archives/CNEDGGJQ3/p1643902942786959
  • Steps to Reproduce:
  1. Open event viewer and navigate to Microsoft-Windows-Windows Defender/Operational
  2. See that the logs are working just fine.
  3. Generate an event in Defender, such as enabling Periodic Scanning
  4. Give it 30 seconds or so and the errors will pop up on every log.
  5. Close event viewer and reopen
  6. Back to step 1, where the logs look normal and you don't have the publisher error.

The side effect here is that Winlogbeat will no longer read from that channel because it has hit this error; it doesn't recover and try to read from the channel again until we restart the Winlogbeat service or the computer. I believe something deeper in the Windows OS is causing this problem since we don't see this on other event channels.

Raw Windows Event Log Text:

The description for Event ID 5007 from source Microsoft-Windows-Windows Defender cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer. If the event originated on another computer, the display information had to be saved with the event. The following information was included with the event: 

Microsoft Defender Antivirus 
4.18.2201.8 
HKLM\SOFTWARE\Microsoft\Windows Defender\MpEngine\MpCampRing = 0x4 

The publisher has been disabled and its resource is not available. This usually occurs when the publisher is in the process of being uninstalled or upgraded

nicpenning avatar Feb 03 '22 23:02 nicpenning

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

elasticmachine avatar Feb 07 '22 12:02 elasticmachine

After doing some debugging and working with Microsoft, this was the response:

Based on the information you just provided, we are not seeing any issues in the defender event logs.  It sounds like this may be an issue with the 3rd party tool (Winlogbeat) you are using not recovering gracefully as it should.  I would suggest contacting their support for assistance.

nicpenning avatar Feb 08 '22 17:02 nicpenning

In this case, Winlogbeat has successfully opened the event channel. After a while, reading events from it fails with error code 15007 (ERROR_EVT_CHANNEL_NOT_FOUND): "The specified channel could not be found. Check channel configuration.".

This is the same error that we will get on opening a channel that does not exist (for example due to a typo in the channel name).

If we were to treat this error as a transient failure and try to re-subscribe to the channel, the subsequent open would also fail with the same error. Recovering gracefully from this error therefore means not terminating Winlogbeat when it is encountered on Open() and instead continuing to try to subscribe, which would also prevent Winlogbeat from terminating in the more common case of a channel not existing in the system (due to a typo in the name or other non-transient misconfiguration).

This is a significant refactor to how Winlogbeat works. Any opinions @elastic/security-external-integrations ?

adriansr avatar Mar 29 '22 14:03 adriansr

keep trying to subscribe, which will also prevent it from terminating in the more common case of a channel not existing in the system

IIRC Winlogbeat only terminates if none of the configured channels are valid. Put another way, if one channel is invalid it logs an error and continues reading the others that exist. This happens commonly with the Sysmon channel that is present in the default config, but is not installed by default in Windows.

In the Sysmon case, this new behavior could be advantageous in that when the user installs Sysmon, the already running Winlogbeat would start reading it.

In the use case of .evtx reading, I don't think the retry behavior would be desirable. You want that hard failure immediately.

Does the test config command give any feedback on a channel's existence? I can't remember how that's implemented.

What about some kind of middle ground in that on the first run of the reader it does not retry on ERROR_EVT_CHANNEL_NOT_FOUND errors? But if the channel was successfully being read and then we encounter ERROR_EVT_CHANNEL_NOT_FOUND it will retry. I want to be cognizant of the code complexity. So if this is going to make the code very complicated then it might not be a good solution.
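A rough sketch of that middle ground, to gauge how much state it needs. The `reader` type, its `openedOnce` field, and `shouldRetry` are hypothetical names for illustration and do not reflect how Winlogbeat is actually structured; errors other than 15007 are simply left out of scope here.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// Hypothetical local name for ERROR_EVT_CHANNEL_NOT_FOUND (15007).
const errChannelNotFound = syscall.Errno(15007)

// reader tracks whether its channel has ever been opened successfully.
type reader struct {
	openedOnce bool
}

// shouldRetry applies the proposed policy: a channel-not-found error is
// retried only if the channel was being read successfully before.
func (r *reader) shouldRetry(err error) bool {
	if !errors.Is(err, errChannelNotFound) {
		return false // other errors are out of scope for this sketch
	}
	return r.openedOnce
}

func main() {
	r := &reader{}

	// First run: the channel never opened, so fail hard (typo, .evtx, ...).
	fmt.Println(r.shouldRetry(errChannelNotFound)) // false

	// The channel opened at least once; a later 15007 is treated as transient.
	r.openedOnce = true
	fmt.Println(r.shouldRetry(errChannelNotFound)) // true
}
```

Under these assumptions the extra complexity is a single boolean per reader, set after the first successful open.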

andrewkroh avatar Mar 29 '22 14:03 andrewkroh

I've used the approach where you try to open X times with a Y delay between them in cases like this.

pseudo code:

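// Channel and Open are placeholders for the real event log types.
// Needs: import "time"
func myOpen(name string) (*Channel, error) {
	const retries = 3
	const delay = time.Second // time.Sleep takes a time.Duration
	var handle *Channel
	var err error

	for i := 0; i < retries; i++ {
		handle, err = Open(name)
		if err == nil {
			break
		}
		// Wait before the next attempt, but not after the last one.
		if i < retries-1 {
			time.Sleep(delay)
		}
	}

	// On failure, err holds the error from the last attempt.
	return handle, err
}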
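
To try the idea outside Winlogbeat, here is a self-contained variant that takes the open function as a parameter. `openWithRetry`, `errNotFound`, and the fake open function are all hypothetical and only illustrate the retry loop; nothing here calls the real event log API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound is a hypothetical stand-in for the channel-not-found error.
var errNotFound = errors.New("channel not found")

// openWithRetry calls open up to attempts times and sleeps for delay
// between failed attempts. It returns nil on the first success, otherwise
// the error from the last attempt.
func openWithRetry(open func() error, attempts int, delay time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = open(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}

func main() {
	calls := 0
	// Fake open that fails twice and then succeeds.
	open := func() error {
		calls++
		if calls < 3 {
			return errNotFound
		}
		return nil
	}

	err := openWithRetry(open, 3, 10*time.Millisecond)
	fmt.Println(err, calls) // <nil> 3
}
```

A fixed number of attempts only helps if the channel comes back within `attempts * delay`; a channel that stays unavailable for longer would still need a service restart.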

leehinman avatar Mar 29 '22 14:03 leehinman

I'm having a similar issue. On some workstations I see this in the Filebeat logs: WinEventLog[winlog-windows.sysmon_operational-ad941318-fa56-47cc-98e6-4546b62c4995] EventHandles returned error The specified channel could not be found.

kowalczyk-p avatar Sep 29 '22 10:09 kowalczyk-p

@kowalczyk-p What version of Filebeat?

Do you have Sysmon installed? If you don't then this is the expected behavior.

andrewkroh avatar Sep 29 '22 14:09 andrewkroh

@andrewkroh Yes. I'm currently trying to migrate from Windows Event Forwarding to Elastic Agent. From the hosts with the above error, I receive events generated by Sysmon via Windows Event Forwarding.

kowalczyk-p avatar Sep 30 '22 06:09 kowalczyk-p

Elastic Agent version is 8.2.0, I assume Filebeat version is the same.

kowalczyk-p avatar Sep 30 '22 09:09 kowalczyk-p