fsharp icon indicating copy to clipboard operation
fsharp copied to clipboard

Preserve console encoding when using --utf8output switch without swapping console streams

Open majocha opened this issue 1 year ago • 16 comments

Previously the Console.Out stream was remembered and finally restored in case of using --utf8output switch in order to restore the original console encoding.

With this PR, just the original encoding is preserved and then restored, without swapping the whole Console.Out stream. This cooperates better with redirected console.

Tests added to ensure both fsc and fsi restore encoding correctly.

majocha avatar Sep 17 '24 17:09 majocha

:warning: Release notes required, but author opted out

[!WARNING] Author opted out of release notes, check is disabled for this pull request. cc @dotnet/fsharp-team-msft

github-actions[bot] avatar Sep 17 '24 17:09 github-actions[bot]

IIRC the HostedCompiler was relying on stdout redirection, should only be used for some of the legacy test suites (FsharpQA I think). But FSharpQA still runs here, so that keeps working after your change.

T-Gro avatar Sep 17 '24 21:09 T-Gro

It also looks as if unused binding analysis didn't find the unused:
let setOutputStreams execute =

I wonder why.

KevinRansom avatar Sep 18 '24 07:09 KevinRansom

It also looks as if unused binding analysis didn't find the unused: let setOutputStreams execute =

I wonder why.

Might be related to https://github.com/dotnet/fsharp/issues/13849

edgarfgp avatar Sep 18 '24 08:09 edgarfgp

The scenario is this:

  • The host application is using a non-UTF8 Encoding, E.g UTF-7 or UTF-32
  • The host executes the compilation and passes in the '--utf8output' flag
  • This flag changes the Encoding of the Console.Out to UTF which is not what the host app was set to.
  • The purpose of the savedOut is to ensure that these changes caused by our compiler do not adversely impact the hosts.

Thanks @KevinRansom, this is important because it seems in the course of recent changes the test for this got lost.

Also, I'm wondering, maybe just restoring the encoding (if the Out is still the same during dispose) would be less intrusive.

majocha avatar Sep 18 '24 08:09 majocha

I'm frankly confused about this but I suspect that setting OutputEncoding affects the underlying system level stuff and has no effect when console is redirected?

majocha avatar Sep 18 '24 10:09 majocha

Anything interesting from the git blame? (I can also take a look later in the day)

psfinaki avatar Sep 18 '24 10:09 psfinaki

@majocha --- the redirection is irrelevant, since there is a scenario where someone writes a .net app that does work and calls into the compiler directly. For example fantomas, or fsc.exe for that matter, although neither of those apps mess with Encoding.

KevinRansom avatar Sep 18 '24 14:09 KevinRansom

I'll write a test to see what the situation is right now. I understand that from the consumer's pov it's important that fsc writes using the correct encoding according to the given switch and that console encoding is restored afterwards.

majocha avatar Sep 18 '24 15:09 majocha

It works, but yes a test case in the repo would be useful to ensure that it doesn't get inadvertently ripped out in the future.

KevinRansom avatar Sep 18 '24 15:09 KevinRansom

I added some tests for this.

majocha avatar Sep 20 '24 13:09 majocha

Thanks a lot! This week we have some internal stuff going on so our capacities are lower, hopefully we'll get to all the PRs early next week. You're doing amazing job (and inspiring me to do some related job).

psfinaki avatar Sep 20 '24 16:09 psfinaki

No worries, I can imagine this week was quite demanding :)

majocha avatar Sep 20 '24 16:09 majocha

I noticed there was code already that restored the Console Encoding in fsi and fsc. I reshufflled it a bit and it seems it works without swapping the whole console streams.

To clarify, my intention is to avoid mutating the Console.Out and Console.Error by fsc , because in my parallel testing experiment there are multithreaded TextWriters installed there so we can capture output from many tests at once.

majocha avatar Sep 20 '24 16:09 majocha

@KevinRansom please rereview.

psfinaki avatar Oct 14 '24 17:10 psfinaki

  • The host application is using a non-UTF8 Encoding, E.g UTF-7 or UTF-32

UTF-32 does not seem to be supported on my machine at all, but I tried UTF7 with Console.OutputEncoding <- Text.Encoding.GetEncoding(65000) and it gives a runtime exception in dotnet now: image

This made me think, what if the user somehow does have UTF7 set? Will we be able to restore from UTF8 to UTF7 without issues?

I made a small program to test it

open System
open System.Runtime.InteropServices

[<DllImport("kernel32.dll", SetLastError = true)>]
extern bool SetConsoleOutputCP(uint32 wCodePageID)

[<EntryPoint>]
let main argv =
    if SetConsoleOutputCP(65000u) then
        printfn $"Original output encoding is {Console.OutputEncoding.EncodingName}"
        let encoding = Console.OutputEncoding
        Console.OutputEncoding <- Text.Encoding.UTF8
        printfn $"set to {Console.OutputEncoding}"
        Console.OutputEncoding <- encoding
        printfn $"set back to {Console.OutputEncoding.EncodingName}"
    else
        printfn "Failed to set console output code page."
    0

and thankfully it does work. It seems any code page saved from the console can be restored, even if it cannot be created in dotnet.

Maybe this could be added to the tests as Windows only.

majocha avatar Oct 19 '24 19:10 majocha