
Use proper UTF-8 to UTF-16 conversion?

Open rgl opened this issue 6 years ago • 5 comments

Can https://github.com/masterzen/winrm/blob/1d17eaf15943ca3554cdebb3b1b10aaa543a0b7e/powershell.go#L10-L23 be changed to use a proper UTF-8 to UTF-16 (the native windows encoding) conversion?

rgl avatar Oct 23 '19 17:10 rgl

@rgl,

I'm not really versed in Windows encodings (or UTF-16). From what I understand, this code builds a pure UCS-2 (wide char) string in which the high byte is always 0. That will indeed fail for any character above 127, which is unfortunate.

I think this can be fixed with this:

 wideCmd := utf16.Encode([]rune(psCmd))

Hopefully the result will have the right endianness for the receiving machine.

Would you mind testing this? I know very little about anything related to PowerShell.

masterzen avatar Oct 24 '19 08:10 masterzen

Windows uses UTF-16LE and utf16.Encode is UTF-16BE. I will submit a PR soon.

rgl avatar Oct 24 '19 17:10 rgl

Oh, and I was mistaken: utf16.Encode is really UTF-16LE! We just need to convert the result into a []byte.

rgl avatar Oct 25 '19 18:10 rgl

@rgl I'm lost: your PR implements a BE->LE conversion, but your last comment here seems to imply that it isn't needed. Can you elaborate?

masterzen avatar Oct 27 '19 18:10 masterzen

Sorry for the confusion... the PR does not really convert from BE to LE.

Let me clarify:

  1. utf16.Encode converts the string (passed as a []rune) to a []uint16 of UTF-16 code units.
  2. encodeUtf16Le converts that []uint16 to a []byte encoded as UTF-16LE, so the entire conversion goes from string to a []byte encoded as UTF-16LE.
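
The two steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `utf16LEBytes` and `encodedCommand` are hypothetical names, and the Base64 step assumes the result is handed to `powershell.exe -EncodedCommand`, which expects a Base64-encoded UTF-16LE string. Note that the []uint16 values from step 1 are plain integers; a byte order only comes into play when they are written out as bytes in step 2.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// utf16LEBytes is a hypothetical stand-in for the PR's encodeUtf16Le step,
// combined with the utf16.Encode call that precedes it.
func utf16LEBytes(s string) []byte {
	units := utf16.Encode([]rune(s)) // step 1: string -> UTF-16 code units
	out := make([]byte, 2*len(units))
	for i, u := range units {
		// step 2: write each code unit low byte first (little endian)
		binary.LittleEndian.PutUint16(out[2*i:], u)
	}
	return out
}

// encodedCommand Base64-encodes the UTF-16LE bytes (assumed use case:
// the -EncodedCommand argument of powershell.exe).
func encodedCommand(psCmd string) string {
	return base64.StdEncoding.EncodeToString(utf16LEBytes(psCmd))
}

func main() {
	fmt.Printf("% x\n", utf16LEBytes("é"))
	fmt.Println("powershell.exe -EncodedCommand " + encodedCommand("dir"))
}
```

Characters outside the Basic Multilingual Plane come out of utf16.Encode as surrogate pairs, so they occupy four bytes in the result rather than two.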

rgl avatar Oct 27 '19 22:10 rgl