Generating the Telegraf config using PowerShell on Windows results in a vague error
Relevant telegraf.conf
Complete
Logs from Telegraf
2022-08-03T17:48:40Z E! [telegraf] Error running agent: Error loading config file telegraf.conf: Error parsing data: line 1: invalid TOML syntax
System info
Windows 10 21H2, Powershell 5.1.19041.1682 (the default), Telegraf 1.23.3
Docker
No response
Steps to reproduce
- .\telegraf.exe config > telegraf.conf
- .\telegraf.exe --config telegraf.conf
Expected behavior
Output like:
2022-08-03T17:49:16Z I! Starting Telegraf 1.23.3
etc etc etc etc
Actual behavior
See the logs from Telegraf section, a vague error.
Additional info
The error is caused by the default output encoding of PowerShell 5 being UTF-16 LE BOM instead of UTF-8. In PowerShell 7.2.5 this has been changed to UTF-8. Since PowerShell 5 is going to be the default for a while, maybe we should add a PowerShell-specific command to the readme?
.\telegraf.exe config | Out-File -FilePath telegraf.conf -Encoding utf8
The above results in a config with encoding UTF-8-BOM. To get true UTF-8, the encoding needs to be set to oem.
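For anyone who wants to check which of these encodings a generated config ended up with, here is a minimal Go sketch that looks at the first bytes of the data for a byte order mark. The function name `detectBOM` and the sample bytes are made up for illustration and are not part of Telegraf; UTF-32 is deliberately not handled.

```go
package main

import (
	"bytes"
	"fmt"
)

// detectBOM is a hypothetical helper (not a Telegraf function) that reports
// which byte order mark, if any, the data starts with.
// UTF-32 BOMs are not distinguished in this sketch.
func detectBOM(data []byte) string {
	switch {
	case bytes.HasPrefix(data, []byte{0xEF, 0xBB, 0xBF}):
		return "UTF-8 BOM"
	case bytes.HasPrefix(data, []byte{0xFF, 0xFE}):
		return "UTF-16 LE BOM"
	case bytes.HasPrefix(data, []byte{0xFE, 0xFF}):
		return "UTF-16 BE BOM"
	default:
		return "none"
	}
}

func main() {
	// "# T" encoded as UTF-16 LE with a BOM: two bytes per character.
	utf16le := []byte{0xFF, 0xFE, '#', 0x00, ' ', 0x00, 'T', 0x00}
	fmt.Println(detectBOM(utf16le)) // prints "UTF-16 LE BOM"

	// The same text as plain UTF-8 without a BOM.
	fmt.Println(detectBOM([]byte("# T"))) // prints "none"
}
```

To check a real file, the bytes could come from `os.ReadFile("telegraf.conf")` instead of the in-memory samples.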
Additionally some weird characters show up when using the default command:
## Valid time units are "ns", "us" (or "┬Ás"), "ms", "s".
## Max string error size
## This collect thread counts metrics.
@R290,
Is this the first time you have noticed this, or has it always been this way? I want to understand whether we made a change that caused this or whether no one had noticed before today.
In either case, the next step for us would be to update the docs at the very least.
It's been this way since I started using Telegraf on Windows about a year ago. So this would have been around release v1.18 or v1.19.
This issue is actually a duplicate of earlier issues: #4880 #1378 #6662
Should we close this one?
Hi,
Wow, thanks for finding the various history on this.
As you mention, PowerShell 7, and even PowerShell 6, use BOM-less UTF-8 by default. The issue occurs when using PowerShell 5.
This is not an issue when using Command Prompt or the Git-Bash shell.
In Telegraf, the config subcommand simply calls fmt.Print[f]; nothing special is happening. This is specific to PowerShell's odd choice of encoding. While the workaround requires a little more work from the user, I prefer that over adding special cases and detection for a particular PowerShell version and encoding to Telegraf.
I think that an update to the docs is the best way forward and have put up #11662. Could you comment on whether that is sufficient documentation?
Thanks!
ha - I see we put up PRs within minutes of each other.
The reason I did not go down that road of adding support for UTF-16LE encoding is the TOML spec specifies the requirement for a valid UTF-8 encoded document. I am unsure what dragons we are opening ourselves up to if we start accepting the additional encoding.
Seems we both think it's an important issue!
I actually wrote something similar before I went down the encoding rabbit hole: https://github.com/R290/telegraf/commit/475051d99acbc6c6cffb108280ae83b6fbaa2ef7. Your version seems to do a bit more explaining, whereas mine is rather brief, so let's go with yours!
I understand your concerns about accepting additional encoding. In this case it would just be UTF16-LE, the Windows standard. I'll add some test cases for UTF8 with an actual error on line 1 and UTF16LE with an actual error on line 1 and will leave the PR at that.
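To make the idea concrete, a rough sketch of what converting a UTF-16 LE config to UTF-8 before parsing could look like, using only the Go standard library. This is not the code from either PR and not how Telegraf loads configs; the function name `utf16LEToUTF8` and the sample input are assumptions for illustration.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf16"
)

// utf16LEToUTF8 is a hypothetical helper (not Telegraf code). If data starts
// with a UTF-16 LE BOM, it returns the content re-encoded as UTF-8 without a
// BOM. Data without that BOM is returned unchanged.
func utf16LEToUTF8(data []byte) ([]byte, error) {
	bom := []byte{0xFF, 0xFE}
	if !bytes.HasPrefix(data, bom) {
		return data, nil
	}
	body := data[len(bom):]
	if len(body)%2 != 0 {
		return nil, errors.New("UTF-16 data has an odd number of bytes")
	}
	// Read the payload as little-endian 16-bit code units.
	units := make([]uint16, len(body)/2)
	for i := range units {
		units[i] = binary.LittleEndian.Uint16(body[2*i:])
	}
	// utf16.Decode handles surrogate pairs; the string conversion yields UTF-8.
	return []byte(string(utf16.Decode(units))), nil
}

func main() {
	// "[a]" as UTF-16 LE with a BOM.
	input := []byte{0xFF, 0xFE, '[', 0x00, 'a', 0x00, ']', 0x00}
	out, err := utf16LEToUTF8(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", out) // prints "[a]"
}
```

Only the BOM-prefixed case is converted here, so existing UTF-8 configs would pass through untouched. Whether the parser should accept anything other than UTF-8 at all is the open question.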
Again, thank you for reporting this and digging up the history. I will go merge my PR and close yours in a bit.