
Bug: PowershellSensor as Satellite Service - error_during_execution

Open Skons opened this issue 1 year ago • 47 comments

Describe the bug Powershell sensor does not work as satellite service. It works when Test Command/Script is executed, and it works as a hass.agent sensor.

To Reproduce Steps to reproduce the behavior:

  1. Go to Satellite Service
  2. Click on Sensors
  3. Click on Add New
  4. Select PowerShellSensor
  5. Add this as command $(get-date).ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ss')
  6. Click Store Sensor
  7. Click Send & Active Sensors

Expected behavior The date as state in Home Assistant

Screenshots image image

Misc info (please complete the following information):

  • Windows build (ideally screenshot/info of winver.exe output): Win10 22H2
  • Windows' UI language: EN
  • HASS.Agent version: 2022.14.0

Please check what's applicable (multiple answers possible):

  • [x] Installed via installer
  • [ ] Installed manually
  • [ ] Problem occurs in HASS.Agent
  • [x] Problem occurs in Satellite Service

Skons avatar Aug 15 '23 21:08 Skons

Hello, I cannot reproduce this: image

Snippet from satellite sensor config:

  {
    "Type": "PowershellSensor",
    "Id": "67ce197d-bcb0-4a61-a604-a6f276aacd32",
    "FriendlyName": "",
    "UpdateInterval": 30,
    "Query": "$(get-date).ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ss')",
    "Scope": null,
    "WindowName": "",
    "Category": "",
    "Counter": "",
    "Instance": "",
    "Name": "AMADEO-PC-satellite_powershellsensor",
    "ApplyRounding": false,
    "Round": null
  },

amadeo-alex avatar Aug 17 '23 17:08 amadeo-alex

this is mine:

[
  {
    "Type": "PowershellSensor",
    "Id": "3a0ef5bd-22bb-475d-9df2-d7ae48bb5430",
    "UpdateInterval": 30,
    "Query": "$(get-date).ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ss')",
    "Scope": null,
    "WindowName": "",
    "Category": "",
    "Counter": "",
    "Instance": "",
    "Name": "[snip]-satellite_powershellsensor"
  }
]

I noticed I was missing ApplyRounding and Round. I stopped the service, edited the JSON, and started the service again. No success. After Send and Activate, the custom config was gone.

The satellite service is 2022.20.0.0. I am running the client from a different disk (D:) than the satellite service (C:).

I also tried changing the command to $env:computername; that did not work either.

Skons avatar Aug 31 '23 19:08 Skons

I get the same error code when executing a PowerShell script that gets the GPU power consumption via some string extraction out of nvidia-smi.exe. This works perfectly fine from the Agent or when testing from the config, but Home Assistant shows the value as "error_during_execution". image image

Other Sensors, such as ProcessActive work without any problem.

Zarox666 avatar Sep 05 '23 22:09 Zarox666

Any ETA on this? Running the PowerShell sensor as a user instead of as a service works, but logging in again as the user every time is not the idea behind a server/service.

Zarox666 avatar Sep 10 '23 16:09 Zarox666

@Zarox666 It's hard to give an ETA for a fix PR as I'm not able to reproduce this issue. You may try to help with the investigation by posting the script that is not working for you.

@Skons Strange that it removes the rounding config options. Are you using the newest available version of HASS.Agent?

amadeo-alex avatar Sep 10 '23 18:09 amadeo-alex

Yes, it is stated in the first post: 2022.14.0.

If you want me to run a debug version or enable debug logging (if possible), I am happy to help.

Skons avatar Sep 10 '23 19:09 Skons

OK, now it's getting weird...

For me, the service-based sensor on my server started to work this morning (without any interaction from my end). At that time the expected value changed from 0 to 5 and later back to 0. The Service sensor changed from unknown to 5 and later correctly to 0. image

However, the service-based sensor on my PC still reports unknown instead of the correct value of 0. image

Here is the script that is used by both sensors. (In essence it reads lines out of a .ini file and fills a path variable with that. Then it counts how many running processes exist that have their .exe file somewhere under that path and returns that number.)

  #$DebugPreference = "Continue"

  #Include

  #Variables
  $MinerInstances = 0
  Get-Content $PSScriptRoot\mining.ini | Foreach-Object {
      $var = $_.Split('=')
      New-Variable -Name $var[0] -Value $var[1] -ErrorAction SilentlyContinue
  }

  Write-Debug "#####################################"
  Write-Debug "MineManagerPath= $MineManagerPath"
  Write-Debug "#####################################"

  Get-ChildItem -Path $MineManagerPath -Filter *.exe -Recurse -File -Name | ForEach-Object {
      $ProcessName = [System.IO.Path]::GetFileNameWithoutExtension($_)
      if (Get-Process -Name $ProcessName -ErrorAction SilentlyContinue) {
          $Process = Get-Process -Name $ProcessName
          $ProcessPath = $Process.Path
          $ProcessID = $Process.Id
          Write-Debug "Found running Process $ProcessName out of $ProcessPath with ProcessID $ProcessID"
          if ($ProcessPath -like $MineManagerPath+"*" ) {
              Write-Debug "Mining-Process found!"
              $MinerInstances++
          }
      }
  }

  return $MinerInstances

The other script that has the same issue retrieves GPU power consumption via string extraction from an NVIDIA .exe file. It's a lot more complex.

Let me set up a Test-Sensor with a power-shell Script that just returns 0 and see how that fares.

EDIT: And yes, I am fully aware that it now shows "unknown" instead of the original "error_during_execution". I have no explanation for this. I can even pinpoint when this happened: image

Zarox666 avatar Sep 11 '23 08:09 Zarox666

OK,

Successful repro, but only on my PC, not on my server (more on that later). image

The names appended with _service are Service sensors in HASS; the others are normal sensors. All 4 sensors are PowerShell sensors executing a PowerShell script containing just one line: return 0

So this makes it clear that it's not due to the logic inside the PowerShell script; other factors must be at play here. My server is a Windows Server 2022 Version 21H2. My PC is a Windows 10 Version 22H2. The HASS.Agent version on both is 2022.14.0.

However, we need to depart from the notion that it never works. (Any number, including 0, is a valid value.) (Times where ZaroxOmega goes black are due to my PC being switched off.) (The problem is the yellow "error_during_execution" and grey "unknown" blocks.) Last week: image

Zarox666 avatar Sep 11 '23 08:09 Zarox666

I might have made some progress.

My service log file on my PC is full of these messages:

2023-09-11 14:32:12.851 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)
2023-09-11 14:32:12.851 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)
2023-09-11 14:32:13.613 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)
2023-09-11 14:32:13.613 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)
2023-09-11 14:32:13.613 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)
2023-09-11 14:32:14.364 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)
2023-09-11 14:32:14.364 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)
2023-09-11 14:32:14.364 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)

I do not see these messages in the service log on my server, where the PowerShell sensor works.

Digging into the error message and searching for System.Text.Encoding.GetEncoding, I found that "1" is indeed not a valid code page. See: https://learn.microsoft.com/en-us/dotnet/api/system.text.encoding.getencoding?view=net-7.0 and https://learn.microsoft.com/en-us/dotnet/api/system.text.encoding?view=net-7.0 However, it's not clear to me why this would behave differently between the two systems. 1 is not a valid code page for either of them, so I can only assume that HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput might be behaving differently between the two otherwise "identical?" scenarios.

Zarox666 avatar Sep 11 '23 12:09 Zarox666

It could be that the .Net version makes a difference. Mine is 4.8.09032. Nice find!

Edit: Same here

2023-09-11 17:13:22.842 +02:00 [FTL] 1 is not a supported code page. (Parameter 'codepage')
System.ArgumentException: 1 is not a supported code page. (Parameter 'codepage')
   at System.Text.Encoding.GetEncoding(Int32 codepage)
   at HASS.Agent.Shared.Managers.PowershellManager.ExecuteWithOutput(String command, TimeSpan timeout, String& output, String& errors)

Skons avatar Sep 11 '23 15:09 Skons

Good find, thank you both! I'll take a look at what may be causing this.

amadeo-alex avatar Sep 11 '23 16:09 amadeo-alex

From what I've been able to research, it looks like this might be some kind of incompatibility between the .net/.netframework and the keyboard/system languages which are installed on a given system. @Zarox666 could you please check the keyboard/system languages on both of your systems and look for discrepancies? In the meantime I'll create a fallback logic to UTF-16 and put it as part of https://github.com/LAB02-Research/HASS.Agent.Staging/pull/14 which should remediate the issue.

amadeo-alex avatar Sep 11 '23 17:09 amadeo-alex

These are mine: image But as a user I do not have any problem running this. As SYSTEM it is a problem. Maybe there is something else going on with regard to the SYSTEM "user"?

Skons avatar Sep 11 '23 18:09 Skons

Fix added to the PR and merged to the bulk PR https://github.com/LAB02-Research/HASS.Agent.Staging/pull/24. Unfortunately for now the only reliable way to get it is to build the project yourself :\ (I don't recommend accepting exe files/binaries from me or any other random internet user)

amadeo-alex avatar Sep 11 '23 18:09 amadeo-alex

There is progress, but it's not what I expected: image

Skons avatar Sep 11 '23 19:09 Skons

What the... I'm quite certain that is neither English nor Dutch :D Can I please have those ?Chinese? characters?

amadeo-alex avatar Sep 11 '23 19:09 amadeo-alex

Yeah, using Unicode/UTF-16 causes "normal" characters to become Chinese, and using UTF-8 causes the Chinese characters to be broken. How can one not love text encoding...
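Both symptoms described above are consistent with an encoding mismatch between whatever writes the output bytes and whatever decodes them. A minimal Python sketch of that effect (illustrative only: HASS.Agent itself is C#, and the sample strings are assumptions chosen for the demonstration):

```python
# Illustrative only: reproduces the two kinds of garbling described in the thread.

# Case 1: single-byte (ASCII) output read as UTF-16.
# Every pair of ASCII bytes is fused into one 16-bit code unit, which lands
# far outside the ASCII range, mostly in CJK-looking blocks.
ascii_bytes = "2023-09-11T19:00".encode("ascii")  # 16 bytes
read_as_utf16 = ascii_bytes.decode("utf-16-le")   # 8 unrelated characters

# Case 2: UTF-16 output read as UTF-8.
# The UTF-16 bytes of this character are not a valid UTF-8 sequence, so the
# decoder substitutes the U+FFFD replacement character.
utf16_bytes = "说".encode("utf-16-le")
read_as_utf8 = utf16_bytes.decode("utf-8", errors="replace")

print(read_as_utf16)
print(read_as_utf8)
```

Neither decode loses bytes in case 1 (re-encoding as UTF-16 returns the original ASCII), which is why the output looks like valid but meaningless text rather than an error.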

amadeo-alex avatar Sep 11 '23 20:09 amadeo-alex

image In the hass agent, the value is correct

Skons avatar Sep 11 '23 20:09 Skons

Just a side note: I saw that the implementation works by executing powershell.exe. Are you aware that you could host PowerShell natively from C#? https://learn.microsoft.com/en-us/powershell/scripting/developer/hosting/host01-sample?view=powershell-7.3

And second: if you keep using the command line, did you know the command executed by powershell.exe can be Base64 encoded? https://devblogs.microsoft.com/scripting/powertip-encode-string-and-execute-with-powershell/
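For reference, the Base64 form accepted by powershell.exe -EncodedCommand is the Base64 encoding of the command text as UTF-16LE bytes. A small Python sketch of that encoding step (the helper name is hypothetical, and this is not how HASS.Agent currently builds its command line):

```python
import base64


def encode_powershell_command(command: str) -> str:
    """Return the Base64 string that powershell.exe -EncodedCommand expects.

    The command text is encoded as UTF-16LE first, then Base64.
    """
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")


command = "$(get-date).ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ss')"
encoded = encode_powershell_command(command)

# Example invocation (not executed here):
#   powershell.exe -NoProfile -EncodedCommand <encoded>
print(encoded)
```

This only sidesteps quoting and escaping problems on the way in; it does not by itself change how the output of the command is encoded.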

Skons avatar Sep 11 '23 20:09 Skons

image In the hass agent, the value is correct

Ok, now I think I'm missing something: if I force the sensor to use UTF-16, both the HASS.Agent preview and HA are malformed for me: image image

Script: $(get-date).ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ss')

amadeo-alex avatar Sep 11 '23 20:09 amadeo-alex

I am unable to store the sensor with the latest version. The sensors.json is not created:

2023-09-11 22:47:52.639 +02:00 [INF] [SENSORSMANAGER] Processing 1 received sensor(s), deleting 0 sensor(s) ..
2023-09-11 22:47:52.663 +02:00 [FTL] [SENSORSMANAGER] Error while storing: PerformanceCounter not found
System.Exception: PerformanceCounter not found
   at HASS.Agent.Shared.HomeAssistant.Sensors.PerformanceCounterSensor..ctor(String categoryName, String counterName, String instanceName, Boolean applyRounding, Nullable`1 round, Nullable`1 updateInterval, String name, String friendlyName, String id) in C:\Users\Kevin\source\repos\HASS.Agent.Staging-bulk-s1\src\HASS.Agent.Staging\HASS.Agent.Shared\HomeAssistant\Sensors\PerformanceCounterSensor.cs:line 34
   at HASS.Agent.Satellite.Service.Settings.StoredSensors.ConvertConfiguredToAbstractSingleValue(ConfiguredSensor sensor) in C:\Users\Kevin\source\repos\HASS.Agent.Staging-bulk-s1\src\HASS.Agent.Staging\HASS.Agent.Satellite.Service\Settings\StoredSensors.cs:line 144
   at HASS.Agent.Satellite.Service.Sensors.SensorsManager.StoreAsync(List`1 sensors, List`1 toBeDeletedSensors) in C:\Users\Kevin\source\repos\HASS.Agent.Staging-bulk-s1\src\HASS.Agent.Staging\HASS.Agent.Satellite.Service\Sensors\SensorsManager.cs:line 266

Skons avatar Sep 11 '23 20:09 Skons

You're using this branch? https://github.com/LAB02-Research/HASS.Agent.Staging/pull/14

amadeo-alex avatar Sep 11 '23 20:09 amadeo-alex

Ok, I've changed it to fall back to UTF-8 since I need to drop from the PC and don't want to leave a borked PR hanging around. This should fix the issue partially, as all "standard" characters should work. Everything else, like Chinese/Polish characters, will not display correctly.

I tried to fiddle with the settings but no matter what/how I configure the process which is launched, the "normal" part of the output is malformed when utf-16 is used. I'll try to tackle this from other side tomorrow. Main issue stays the same, on some systems, for some reason when "CultureInfo.CurrentCulture.TextInfo.OEMCodePage" is called from the service context, it returns 1 instead of proper code page.
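The fallback idea can be sketched as follows: use the reported OEM code page when it maps to a real encoding, and fall back to UTF-8 otherwise. This is a Python illustration under stated assumptions, not the C# code from the PR; the function name and the small code page table are hypothetical.

```python
# Illustrative subset of Windows code pages mapped to Python codec names.
# A real implementation would ask the runtime instead of using a fixed table.
KNOWN_CODE_PAGES = {
    437: "cp437",
    852: "cp852",
    1250: "cp1250",
    1252: "cp1252",
}


def resolve_output_encoding(oem_code_page: int, fallback: str = "utf-8") -> str:
    """Return a usable codec name, or the fallback for unknown code pages."""
    return KNOWN_CODE_PAGES.get(oem_code_page, fallback)


# A value of 1, as seen in the service context, is not a real code page.
print(resolve_output_encoding(1))
print(resolve_output_encoding(437))
```

As noted in the comment above, such a fallback only guarantees the ASCII range; any character that PowerShell emits in a different encoding will still be decoded incorrectly.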

amadeo-alex avatar Sep 11 '23 21:09 amadeo-alex

You're using this branch? LAB02-Research/HASS.Agent.Staging#14

Yes

For what it's worth, I also found this regarding the OEMCodePage: https://stackoverflow.com/questions/1812104/relation-between-net-encoding-and-characterset

Skons avatar Sep 11 '23 21:09 Skons

As requested:

There are indeed differences in the language settings. But as Skons has pointed out, this affects only the Service.

Server image PC image

.NET Runtime on the Server is 6.0.3 while on the PC it is 6.0.21

Now let me read up on the remainder of the Thread :-)

Zarox666 avatar Sep 11 '23 21:09 Zarox666

Some thoughts.....

Is it smart to call OEMCodePage? AFAIK the OEM in this context would be the hardware manufacturer, and that can vary widely from a laptop to a desktop to a virtual machine. Instead of explicitly setting the code page, wouldn't it be better to just let the system use its defaults? https://learn.microsoft.com/en-us/dotnet/api/system.globalization.textinfo?view=net-7.0 (See "Remarks")

Sorry if these thoughts are silly... I am just an Escalation Engineer, not a Programmer :-)

Zarox666 avatar Sep 11 '23 21:09 Zarox666

I have made a little progress with the investigation. The main culprit here is that even if the process running PowerShell has its input/output set to encoding X or Y, PowerShell will internally still use the system encoding. This causes issues because PowerShell will output, for example, "asdäöüąęć" in the current system encoding, whatever it might be, and the process stream reader will interpret it as the X or Y encoding that was configured before running it.

I've found a workaround where "injecting" for example: [Console]::OutputEncoding = [System.Text.Encoding]::Unicode before the script/command itself works properly with Unicode/UTF-16 encoding set for the process. This, however, is not an optimal solution. The main issue persists: for some reason "CultureInfo.CurrentCulture.TextInfo.OEMCodePage" returns an invalid value of "1" on some system/keyboard/locale configurations...
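The workaround amounts to making both sides agree: the script is told to emit UTF-16LE (which is what [System.Text.Encoding]::Unicode means in .NET), and the reader decodes with the same encoding. A rough Python sketch, where the helper names are hypothetical and the bytes are simulated rather than captured from a real process:

```python
# The prefix is the PowerShell statement quoted in the comment above.
OUTPUT_ENCODING_PREFIX = "[Console]::OutputEncoding = [System.Text.Encoding]::Unicode"


def with_forced_output_encoding(script: str) -> str:
    """Prepend the encoding statement so the script's output is UTF-16LE."""
    return f"{OUTPUT_ENCODING_PREFIX}; {script}"


def decode_output(raw: bytes) -> str:
    """Decode captured stdout with the encoding the prefix selects."""
    return raw.decode("utf-16-le")


wrapped = with_forced_output_encoding("$env:computername")
print(wrapped)

# Simulated process output: the sample text from the thread as UTF-16LE bytes.
simulated_stdout = "asdäöüąęć".encode("utf-16-le")
print(decode_output(simulated_stdout))
```

The cost of this approach is that every user script is silently modified before it runs, which is presumably part of why it is described as not optimal.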

amadeo-alex avatar Sep 13 '23 18:09 amadeo-alex

This is going to look like a personal notebook but maybe it'll be useful for someone someday :D ~~As @Zarox666 suggested, not setting the input/output/error encoding causes the script/command to work properly, with the caveat that some characters come out as "?": asdśćżń1äöü23说/説 becomes asdśćżń1äöü23?/? I'll take a wild guess that it has something to do with the number of bytes required to encode a given character.~~ Installing an additional display language borked this just because, so, fun!

What is interesting is that restoring the code setting "StandardOutputEncoding" to OEMCodePage changes nothing in the output.

I'll rewind the git history of this file trying to find a good reasoning to remove this part of the code but if not, the safest way will be to set the StandardOutputEncoding when it is possible (OEMCodePage returns an actual value) and not touch it otherwise.

amadeo-alex avatar Sep 13 '23 18:09 amadeo-alex

It was added as part of https://lab02research.youtrack.cloud/issue/hassagent-164 due to https://github.com/LAB02-Research/HASS.Agent/issues/201 so it definitely stays. Ignoring the encoding when "OEMCodePage fails" is also not a good solution as it'll end up with "Issue: powershell sensor works in HASS.Agent but not in service" sooner or later.

amadeo-alex avatar Sep 13 '23 19:09 amadeo-alex

@Skons @Zarox666 could you please try this branch - https://github.com/amadeo-alex/HASS.Agent.Staging/tree/test-powershell-sensor-encoding - on the affected machines and then post the logs? (this branch will create the log entries each time the sensor is updated so don't run it too long)

Expected output is something similar to this:

2023-09-13 22:42:46.730 +02:00 [ERR] ENCODING-TEST: currentCulture  {"ANSICodePage":1252,"OEMCodePage":437,"MacCodePage":10000,"EBCDICCodePage":37,"LCID":9,"CultureName":"en","IsReadOnly":false,"ListSeparator":",","IsRightToLeft":false}
2023-09-13 22:42:46.730 +02:00 [ERR] ENCODING-TEST: currentUICulture  {"ANSICodePage":1252,"OEMCodePage":437,"MacCodePage":10000,"EBCDICCodePage":37,"LCID":9,"CultureName":"en","IsReadOnly":false,"ListSeparator":",","IsRightToLeft":false}
2023-09-13 22:42:46.730 +02:00 [ERR] ENCODING-TEST: currentInstalledUICulture  {"ANSICodePage":1250,"OEMCodePage":852,"MacCodePage":10029,"EBCDICCodePage":20880,"LCID":1045,"CultureName":"pl-PL","IsReadOnly":true,"ListSeparator":";","IsRightToLeft":false}
2023-09-13 22:42:46.730 +02:00 [ERR] ENCODING-TEST: invariantCulture  {"ANSICodePage":1250,"OEMCodePage":852,"MacCodePage":10029,"EBCDICCodePage":20880,"LCID":1045,"CultureName":"pl-PL","IsReadOnly":true,"ListSeparator":";","IsRightToLeft":false}

In case something goes wrong (which we expect) there'll be additional ENCODING-TEST: first/second/third/fourth fail
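For anyone collecting these logs, the entries can be checked mechanically. The sketch below assumes the line format shown in the expected output above (an ENCODING-TEST label followed by a JSON object); the function names are hypothetical and it treats an OEMCodePage of 1 as the invalid case discussed in this thread.

```python
import json
import re

# Matches "ENCODING-TEST: <label>  {json}" at the end of a log line.
LINE_PATTERN = re.compile(r"ENCODING-TEST: (\w+)\s+(\{.*\})\s*$")


def parse_encoding_test_line(line: str):
    """Return (label, culture_info_dict), or None if the line has no JSON payload."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return None
    label, payload = match.groups()
    return label, json.loads(payload)


def has_invalid_oem_code_page(line: str) -> bool:
    """True when the entry reports the bogus OEM code page value 1."""
    parsed = parse_encoding_test_line(line)
    return parsed is not None and parsed[1].get("OEMCodePage") == 1


sample = (
    '2023-09-13 22:42:46.730 +02:00 [ERR] ENCODING-TEST: currentCulture  '
    '{"ANSICodePage":1252,"OEMCodePage":437,"MacCodePage":10000,'
    '"EBCDICCodePage":37,"LCID":9,"CultureName":"en","IsReadOnly":false,'
    '"ListSeparator":",","IsRightToLeft":false}'
)
print(parse_encoding_test_line(sample))
print(has_invalid_oem_code_page(sample))
```

Lines such as the "first/second/third/fourth fail" entries carry no JSON payload, so the parser returns None for them instead of raising.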

amadeo-alex avatar Sep 13 '23 20:09 amadeo-alex