RFC FS-1015 - print and println functions

Open albert-du opened this issue 2 years ago • 12 comments

implements RFC FS-1125, fsharp/fslang-suggestions#1092

Adds print and println functions with examples.

albert-du avatar Jul 29 '22 22:07 albert-du

Since these functions would likely be the first functions beginners encounter, the XML documentation should either avoid use of technical jargon such as stdout or explain what that is. Also, should tests be included with these two functions?

Happypig375 avatar Jul 31 '22 08:07 Happypig375

I changed the xml docs to remove references to stdout

albert-du avatar Aug 01 '22 06:08 albert-du

Since these functions would likely be the first functions beginners encounter, the XML documentation should either avoid use of technical jargon such as stdout or explain what that is. Also, should tests be included with these two functions?

Note that printf and printfn already do mention stdout, while eprintf and eprintfn mention stderr—perhaps it's better to maintain consistency with these and use stdout for the new functions as well instead of switching to console.

brianrourkeboll avatar Aug 01 '22 19:08 brianrourkeboll

Since these functions would likely be the first functions beginners encounter, the XML documentation should either avoid use of technical jargon such as stdout or explain what that is. Also, should tests be included with these two functions?

Note that printf and printfn already do mention stdout, while eprintf and eprintfn mention stderr—perhaps it's better to maintain consistency with these and use stdout for the new functions as well instead of switching to console.

In that case, an explanation of what stdout is would be desirable.

Happypig375 avatar Aug 02 '22 06:08 Happypig375

@albert-du, I believe it is customary to mention the RFC also in the title of the PR (something like: RFC 1015: implementing print and println), this way, once it gets merged, it will be directly clear from the log what was implemented. Great work btw!

abelbraaksma avatar Aug 03 '22 23:08 abelbraaksma

@abelbraaksma Thank you!

albert-du avatar Aug 04 '22 06:08 albert-du

I adjusted the wording of the XML docs to add a remark describing stdout. It may need more depth, but it should suffice as a basic explanation.

albert-du avatar Aug 04 '22 07:08 albert-du

Remarks won't be visible in Visual Studio tooltips though.

Happypig375 avatar Aug 04 '22 07:08 Happypig375

Remarks won't be visible in Visual Studio tooltips though.

Did not know that, @Happypig375 any idea how it should be done?

albert-du avatar Aug 04 '22 07:08 albert-du

Probably as part of summary.

Happypig375 avatar Aug 04 '22 07:08 Happypig375

That looks good to me.

One thought is that maybe in some future PR we can unify it with printf, and have the latter accept both a plain string and a format (via when guards with type matches?).

Cc @KevinRansom @dsyme

vzarytovskii avatar Aug 08 '22 18:08 vzarytovskii

@vzarytovskii Currently let a = printf infers a function taking PrintfFormat and I don't see it changing.

Happypig375 avatar Aug 09 '22 06:08 Happypig375

@KevinRansom One problem of that would be being easier to fall into the trap of globalization variances. image

Happypig375 avatar Aug 14 '22 07:08 Happypig375

@KevinRansom One problem of that would be being easier to fall into the trap of globalization variances. image

This is what I see; it looks like it is obeying international rules pretty much as I would expect. Am I missing something?

image

And this is fsi

C:\Program Files\Microsoft Visual Studio\2022\Community>fsi

Microsoft (R) F# Interactive version 12.0.4.0 for F# 6.0
Copyright (c) Microsoft Corporation. All Rights Reserved.

For help type #help;;

> open System
- open System.Globalization
-
- let println<'T> (v:'T) =
-     Console.Out.WriteLine(v)
-
-
- let date = new DateTime(2000, 1, 2)
- let number = 12345.6789
-
- System.Threading.Thread.CurrentThread.CurrentCulture <- new CultureInfo("de-DE")
- println "FSharp - DE"
- println date                                        // 02.01.2000 00:00:00
- println number                                      // 12.345,68 ?
-
- println "FSharp - US"
- System.Threading.Thread.CurrentThread.CurrentCulture <- new CultureInfo("us-US")
- println date                                        // 02.01.2000 00:00:00
- println number                                      // 12.345,68 ?
- ;;
FSharp - DE
02.01.2000 00:00:00
12345,6789
FSharp - US
01/02/2000 00:00:00
12345.6789
val println: v: 'T -> unit
val date: System.DateTime = 01/02/2000 00:00:00
val number: float = 12345.6789
val it: unit = ()
>

KevinRansom avatar Aug 14 '22 10:08 KevinRansom

@KevinRansom That's what I mean: globalization will be implicit and will be unexpected for beginners, as that is not how corresponding code is written.

Happypig375 avatar Aug 14 '22 10:08 Happypig375

@Happypig375, as long as internally the string version is used, everything will be output without globalisation, as that uses CultureInfo.InvariantCulture. In other words, that can be used to fix these overloads if needed.

This principle of least surprise and idempotency, regardless of where your code runs, is, I believe, one of the underpinnings of F#. So, if we use @KevinRansom’s suggestion, I think that’s the way to go.

This would also give better predictability, as we won’t rely on any changes to Console.WriteLine.

abelbraaksma avatar Aug 14 '22 12:08 abelbraaksma

I may have lost the thread somehow. Are we suggesting that the globalization mechanisms built into the CLR are wrong and should be avoided?

KevinRansom avatar Aug 14 '22 19:08 KevinRansom

@KevinRansom That's what I mean: globalization will be implicit and will be unexpected for beginners, as that is not how corresponding code is written.

I am actually a bit confused, isn't local culture something we actually want to use for ordinary print(ln)?

vzarytovskii avatar Aug 14 '22 19:08 vzarytovskii

Are we suggesting that the globalization mechanisms built into the CLR are wrong and should be avoided?

@KevinRansom

No to: wrong in CLR. They are what they are :). Yes to: in code where we convert datatype X to a string, there's a long-lasting tradition in FSharp.Core to use InvariantCulture for all types that support a culture (i.e., implement IFormattable).

In other words, I'm saying that 1.23.ToString() prints 1,23 on my machine, but string 1.23 prints 1.23 (dot vs comma). We should stick with this approach. Calling println 1.23 should behave the same as println (string 1.23).

So, in the implementation this would be something like string x |> Console.WriteLine.

Here's the funny bit: it looks like string interpolation slightly deviated from that plan already, which is surprising (it may even be a bug, can't find it in the docs). See:

> open System.Globalization;;
> open System.Threading;;
> Thread.CurrentThread.CurrentCulture <- new CultureInfo("de-DE");;
val it: unit = ()

> 1.23.ToString();;
val it: string = "1,23"

> string 1.25;;
val it: string = "1.25"

> sprintf "%f" 1.25;;
val it: string = "1.250000"

> sprintf $"{1.25}";;
val it: string = "1,25"   // oh oh!!

> sprintf $"%f{1.25}";;
val it: string = "1.250000"

abelbraaksma avatar Aug 14 '22 19:08 abelbraaksma
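A minimal sketch of the two behaviours being compared in this comment, for anyone who wants to try them side by side. The helper names `printlnInvariant` and `printlnLocal` are hypothetical and are not part of the PR; the sketch also assumes that de-DE culture data is available on the machine.

```fsharp
open System
open System.Globalization
open System.Threading

// Hypothetical helper: convert with FSharp.Core's `string` (InvariantCulture
// for IFormattable values), then write the resulting text.
let inline printlnInvariant x = Console.Out.WriteLine(string x)

// Hypothetical helper: hand the boxed value to WriteLine, which formats it
// with the writer's format provider (the current culture for the console).
let printlnLocal (x: 'T) = Console.Out.WriteLine(box x)

// Assumption: de-DE culture data is installed.
Thread.CurrentThread.CurrentCulture <- CultureInfo "de-DE"

printlnInvariant 1.25   // 1.25
printlnLocal 1.25       // 1,25
```

The first helper gives the same text wherever the code runs; the second follows whatever culture the current thread happens to have.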

Found it. The difference is subtle. But without type-specifier, whatever is inside the holes, is considered an obj and just ToString() is called on it (this is implied, but not explicitly stated in the RFC). You can see this when you cast it as IFormattable or FormattableString:

> ($"The speed of light is {speedOfLight:N3} km/s." : FormattableString).GetArguments();;
val it: obj[] = [|299792.458|]

I don't think that should apply here, so my vote is still for "same behavior as string" for this new function.

abelbraaksma avatar Aug 14 '22 19:08 abelbraaksma
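The FormattableString observation can be explored a little further with a sketch like the one below. It only uses the standard `System.FormattableString` members; the value of `speedOfLight` is taken from the output shown above, and the binding names are illustrative.

```fsharp
open System
open System.Globalization

let speedOfLight = 299792.458

// Annotating the interpolated string as FormattableString keeps the format
// and the boxed arguments separate instead of producing a string right away.
let fs : FormattableString = $"The speed of light is {speedOfLight:N3} km/s."

// The hole content is stored as obj, as described above.
printfn "%A" (fs.GetArguments())   // [|299792.458|]

// A culture is only applied when the value is finally rendered.
let invariantText = fs.ToString(CultureInfo.InvariantCulture)
let currentText = fs.ToString()    // uses the current culture

printfn "%s" invariantText   // The speed of light is 299,792.458 km/s.
printfn "%s" currentText     // depends on the machine's culture
```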

@abelbraaksma print is not converting anything to a string. It merely displays values to the console, using the normal CLR/OS mapping. The original proposal is to constrain the function print to only displaying strings.

Here is the signature: [<CompiledName("Print")>] val print: text: string -> unit

That is a very restrictive API. The name print is most definitely the "good one". We should use it for the best printing api we can design, if it is super restrictive or constraining and we decide to do something more expansive then we will curse what we have.

I would absolutely expect to be able to print any type using it without adornment.

I.e.

let x = DateTime.Today
print x;;

Producing this output in fsi would enrage me every single time it happened:

  print x;;
  ------^

stdin(11,7): error FS0001: This expression was expected to have type
    'string'    
but here has type
    'DateTime'    

> 

I would be very unhappy to have to do this to print out a simple value, especially in an api whose purpose is to make print more accessible, which is fairly heavyweight on the concept count.

let x = DateTime.Today
println $"{x}"

8/14/2022 12:00:00 AM
val x: DateTime = 8/14/2022 12:00:00 AM
val it: unit = ()

Also: the string interpolation provides exactly the same output:

let x = DateTime.Today

Thread.CurrentThread.CurrentCulture <- new CultureInfo("de-DE");;
println ""
print "DE: println   : "; println $"{x}"
print "DE: WriteLine : "; Console.WriteLine(x)

Thread.CurrentThread.CurrentCulture <- new CultureInfo("us-US");;
println ""
print "US: println   : "; println $"{x}"
print "US: WriteLine : "; Console.WriteLine(x)

Produces:

val x: System.DateTime = 14.08.2022 00:00:00
val it: unit = ()

> 
DE: println   : 14.08.2022 00:00:00
DE: WriteLine : 14.08.2022 00:00:00
val it: unit = ()

> 
US: println   : 8/14/2022 12:00:00 AM
US: WriteLine : 8/14/2022 12:00:00 AM
val it: unit = ()

I assert that apis like:

    let print<'T> (v:'T) =
        Console.Out.Write(v)

    let println<'T> (v:'T) =
        Console.Out.WriteLine(v)

Would be more useful especially in scripting and learning environments and would also not trigger me, every time the compiler complained.

Although I have to say, last night I got triggered a bunch by printline x complaining; somehow the ln abbreviation, which I actually prefer, kept running away from my fingers.

Anyway, those are my thoughts on the API.

KevinRansom avatar Aug 15 '22 04:08 KevinRansom

Also: the string interpolation provides exactly the same output:

@KevinRansom I know. That was my point above. As soon as you'd use $"%f{x}", it is not the same anymore, nor is printfn "%f" x. They use InvariantCulture.

That is a very restrictive API. The name print is most definitely the "good one". We should use it for the best printing api we can design,

and

I would be very unhappy to have to do this to print out a simple value

I agree, I didn't intend to suggest otherwise. Sorry if my message was confusing.

It merely displays values to the console, using the normal CLR/OS mapping

If we consider print and println merely a shortcut to Console.WriteLine, then yes.

My confusion is that print and printfn appear very similar to me: turn something into a string and echo it to the console. It's mostly a semantic discussion. I have no problem with print behaving differently from printfn (the latter uses InvariantCulture).

Considering that we are already confused by this situation, and that you yourself assumed above that string interpolation always takes culture into account (it doesn't: it depends on whether you use typed or untyped string interpolation), I guess we can go either way.

There's certainly something to be said for "print is short for Console.Write". It's clear. And yes, we would then have the surprise that output differs between locations, but that's also how Console.Write works.

abelbraaksma avatar Aug 15 '22 11:08 abelbraaksma

@abelbraaksma

Okay mate, thanks for sticking with this and helping get me educated: Given this code in a code review:

open System
open System.Globalization

CultureInfo.CurrentCulture <- CultureInfo.GetCultureInfo("nl-NL")

// Not defined in the original snippet; value inferred from the output below.
let speedOfLight = 299792.458

let a = $"The speed of light is {speedOfLight:N2} km/s."   // .NET format specifier
let b = $"The speed of light is {speedOfLight} km/s."      // untyped hole
let c = $"The speed of light is %f{speedOfLight} km/s."    // typed hole

Console.WriteLine("a: " + a)
Console.WriteLine("b: " + b)
Console.WriteLine("c: " + c)

I don't know who is going to predict the output:

a: The speed of light is 299.792,46 km/s.
b: The speed of light is 299792,458 km/s.
c: The speed of light is 299792.458000 km/s.

This was a rabbit hole, I now wish I had never jumped in. I suppose we can be grateful that %f at least matches printf/printfn which was obviously its intent. I guess coding guidelines that demand .Net framework style formatting is the way to go. Although I see a bug farm for any developer working on software that is globalized, perhaps even a warning ... oops you used printf style formats, it may not work the way you expect in other locales.

KevinRansom avatar Aug 16 '22 00:08 KevinRansom

I guess coding guidelines that demand .Net framework style formatting is the way to go

We (at my current company) actually have a coding guideline for the opposite. Use F# type-safe formatting (i.e., use $"Price: %f{x}", not $"Price: {x}"). If we need localization, we require programmers to be explicit about it, i.e. by using ToString() overloads. In 99% of the code I've seen, auto-localization leads to bugs, so (even in the .NET Framework days) we opted for a policy of diligently using InvariantCulture and specific formatting strings as much as possible, unless specific cultures (i.e. for user-facing screens or something) were warranted.

This was a rabbit hole, I now wish I had never jumped in.

Yeah, it's a tricky mess. I've fallen for the trap in your "code review" example many times. Just as often I asked a programmer (from USA): "how do you think this will render on my machine, you think your test will pass"? Since I'm in NL, it often won't.

oops you used printf style formats, it may not work the way you expect in other locales.

The real question is: what do people expect? Most people are surprised that auto-internationalization is even a thing. If you write an English website, you don't want the numbers and dates to be localized in the native language of the user. And yes, this happens a lot: accidentally localized pages :).

abelbraaksma avatar Aug 16 '22 02:08 abelbraaksma
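The guideline described in this comment might look roughly like the sketch below: invariant, typed formatting by default, and a culture named explicitly wherever localized output is really wanted. The names `price`, `logLine`, `dutch` and `display` are illustrative only, and the sketch assumes nl-NL culture data is available.

```fsharp
open System
open System.Globalization

let price = 1234.5

// Default: typed interpolation, which formats like sprintf "%f" and
// therefore does not depend on the machine's culture.
let logLine = $"Price: %f{price}"

// Localized on purpose: the culture is named at the call site.
// Assumption: nl-NL culture data is installed.
let dutch = CultureInfo.GetCultureInfo "nl-NL"
let display = price.ToString("N2", dutch)

printfn "%s" logLine   // Price: 1234.500000
printfn "%s" display   // 1.234,50
```

A reviewer can then see at a glance which strings are meant for machines or logs and which are meant for a user in a specific locale.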

One other big pool of bugs with these pitfalls: serialization. We just uncovered major bugs where some programmers hand-crafted JSON serialization. Little did they know that the string interpolation would make their output different, depending on where the code ran. It wasn't machine-parsable anymore.

abelbraaksma avatar Aug 16 '22 02:08 abelbraaksma
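A small sketch of how such a hand-crafted JSON bug can arise. This is an illustration under assumptions, not the code from the incident mentioned above: the field name and the `jsonUntyped` / `jsonTyped` helpers are made up, and de-DE culture data is assumed to be available.

```fsharp
open System
open System.Globalization
open System.Threading

let price = 1234.5

// Untyped hole: the value goes through ToString(), so the decimal
// separator follows the current culture.
let jsonUntyped () = $"""{{ "price": {price} }}"""

// Typed hole: %f formats with the invariant culture, so the text is the
// same wherever the code runs.
let jsonTyped () = $"""{{ "price": %f{price} }}"""

// Assumption: de-DE culture data is installed.
Thread.CurrentThread.CurrentCulture <- CultureInfo "de-DE"

printfn "%s" (jsonUntyped ())   // { "price": 1234,5 }  (no longer valid JSON)
printfn "%s" (jsonTyped ())     // { "price": 1234.500000 }
```

In real code a JSON library would normally take care of this; the sketch only shows why the interpolated version changes with the locale.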

@albert-du, I see you are updating this, and the RFC. But I'm not sure if there's consensus. @KevinRansom, what do you think? @dsyme, TLDR: we ended up discussing print: 'T -> unit and println: 'T -> unit vs the current implementation of print: string -> unit and println: string -> unit.

The bottom line of that discussion is: if we go with the 'T approach, would we follow the serialization semantics of string and/or printfn "%A" (or printfn "%O"), or would we follow the semantics of typed (similar to any of printfn "%i" | printfn "%f" etc) or untyped string interpolation (similar to printfn "%O" most of the time)?

Differences abound. I.e., do we want "None" and "Some 42" to be output, or "" and "Some(42)" (the latter is the default of Console.Write)? And do we want to use InvariantCulture (as string and printfn do) or not (as only untyped string interpolation does)?

W.r.t. Invariant Culture: if we do not go that route, the output will differ depending on where you are in the world, just as it does with Console.Write by itself. Most existing F# functions use Invariant Culture now (printfn, sprintf, typed string interpolation and string).

We could vote. Choices:

  1. (vote 👍) As with string >> Console.Write. This is the same as #2 below, but with InvariantCulture.
  2. (vote 😄) As with Console.Write (x: obj), i.e. null and None both become "" and Some 42 becomes "Some(42)". Output differs per locale. This is the same as untyped string interpolation: (fun x -> $"{x}") >> Console.Write.
  3. (vote ❤️) As with sprintf "%A" >> Console.Write, i.e. null -> "<null>", None -> "None" and Some 42 -> "Some 42". Has Invariant Culture. Floats are output using their simplest form: 42.1 -> "42.1".
  4. (vote 🚀) As with sprintf "%f" >> Console.Write, i.e. type-directed. Has Invariant Culture. Differs slightly per individual type, i.e. floats are output with 6 decimal places.
  5. (vote 😕) Stick with the current implementation, i.e., require a string. This is let print (x: string) = Console.Write x.

abelbraaksma avatar Aug 16 '22 11:08 abelbraaksma
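To make the five choices easier to compare, here is a rough sketch that returns the text each one would display instead of writing it. The `option1` to `option5` names are hypothetical, and each function only approximates the behaviour described in the list.

```fsharp
open System

// 1. FSharp.Core's `string`: invariant culture for IFormattable values.
let inline option1 x = string x

// 2. Roughly what Console.Write (x: obj) displays: nothing for null,
//    otherwise ToString() with the current culture.
let option2 (x: obj) = if isNull x then "" else x.ToString()

// 3. Structured formatting with %A.
let option3 x = sprintf "%A" x

// 4. Type-directed: one format specifier per type; only float is shown.
let option4 (x: float) = sprintf "%f" x

// 5. The approved RFC shape: strings only.
let option5 (text: string) = text

printfn "%s" (option1 (Some 42))            // Some(42)
printfn "[%s]" (option1 (None: int option)) // []
printfn "[%s]" (option2 (box None))         // []
printfn "%s" (option3 (Some 42))            // Some 42
printfn "%s" (option3 (None: int option))   // None
printfn "%s" (option3 42.1)                 // 42.1
printfn "%s" (option4 42.1)                 // 42.100000
printfn "%s" (option5 "plain text")         // plain text
```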

Please note that ideally design discussion on the RFC would happen on the RFC discussion thread: https://github.com/fsharp/fslang-design/discussions/675

Still, let's continue here for now

dsyme avatar Aug 16 '22 13:08 dsyme

The approved RFC was for print: string -> unit. The discussion here led to a proposed change to

val print: 'T -> unit

I don't think we can realistically make this change. A generic print raises a huge number of issues - most of them touched on above - thank you to @abelbraaksma for the detailed knowledge on this

  • localization
  • correspondence with %A
  • whether multi-line output is used by default and line-width if it is
  • printing of null values
  • printing of option values

Plus there's the problem that the generic functions are just less safe - e.g. you can easily end up printing an unapplied function value, which results in nothing useful, just some garbage ToString of a closure type.

I honestly think it's print: string -> unit or nothing. The user must then clarify the intent by various means.
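The safety concern about unapplied function values can be seen in a sketch like this one. `printlnAny` stands in for a hypothetical generic variant and `printlnText` for the string-only shape; neither is the actual FSharp.Core implementation.

```fsharp
open System

// Hypothetical generic variant: accepts any value.
let printlnAny (v: 'T) = Console.Out.WriteLine(box v)

// String-only shape, as in the approved RFC signature.
let printlnText (text: string) = Console.Out.WriteLine(text)

let greet (name: string) = "Hello, " + name

printlnAny (greet "F#")    // Hello, F#
printlnAny greet           // compiles, but prints the ToString of a closure
printlnText (greet "F#")   // Hello, F#
// printlnText greet       // rejected by the compiler: a string was expected
```

With the string-only signature, forgetting an argument is caught at compile time rather than showing up as odd output.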


As an aside on localization, please follow up at https://github.com/fsharp/fslang-suggestions/issues/897

The design intent of FSharp.Core functionality has always been "use invariant culture unless explicitly specified otherwise".

  • This rule is kept by %d, %f and friends which always use InvariantCulture

  • This rule is kept by string x which always uses InvariantCulture for IFormattable values.

  • This rule is generally kept by %A. The intent here is to always use invariant culture - this is reasonable as its main use is in debugging output for structured data. However, a long-standing omission is that %A uses .ToString() for .NET IFormattable types that are not well known to FSharp.Core. This is noted in the documentation but means there is a discrepancy between string and %A for things like System.DateTime. I actually think we should fix this and make %A always use InvariantCulture for IFormattable values, just like string does. The spec of %A has changed a little from time to time and since it is human-facing output I think it is correct to make it more consistently invariant culture.

  • This rule is broken by %O and unadorned string interpolation $"...{x}...", which always use obj.ToString(), which in turn will generally use .NET localization. It is definitely arguable that this should have used invariant culture but I believe the breaking change is too significant to change this. It also means %O and unadorned string interpolation are the primary way to implicitly get localized output.

For visual outputs from F# Interactive, the user can specify fsi.FormatProvider but the default is InvariantCulture. An argument could be made that the default should be localized, but it isn't and I don't think we should change that now.

Some of these issues are captured here: https://github.com/fsharp/fslang-suggestions/issues/897

Finally, I'm surprised it's never been suggested that %f, %A and friends take an option to pick up the current localization according to standard .NET rules, e.g. %$d, %$A (or some other character). There could also be a corresponding option to suppress it for %O, getting back to invariant, e.g. %-O or $"....%-O{x}....."

I'd encourage people to review https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/plaintext-formatting and contribute to it. There should really be a specific separate section on locales - there are mentions in the doc but a separate section should cover the above.

dsyme avatar Aug 16 '22 14:08 dsyme
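The "kept" and "broken" rules listed in this comment can be checked with a short script along these lines. It is only a sketch: it assumes de-DE culture data is available, and the binding names are illustrative.

```fsharp
open System
open System.Globalization
open System.Threading

// Assumption: de-DE culture data is installed.
Thread.CurrentThread.CurrentCulture <- CultureInfo "de-DE"

let date = DateTime(2000, 1, 2)

// Rule kept: %f and `string` ignore the current culture.
let typedFloat = sprintf "%f" 1.25
let stringFloat = string 1.25
let stringDate = string date

// Rule broken: %O and an unadorned hole call ToString(), which is localized.
let objFloat = sprintf "%O" 1.25
let unadornedDate = $"{date}"

printfn "%s" typedFloat      // 1.250000
printfn "%s" stringFloat     // 1.25
printfn "%s" stringDate      // 01/02/2000 00:00:00
printfn "%s" objFloat        // 1,25
printfn "%s" unadornedDate   // 02.01.2000 00:00:00
```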

I reverted the implementation back to the original rfc spec

albert-du avatar Aug 16 '22 20:08 albert-du

@albert-du This discussion has raised important questions. I'll do my best to capture these as unresolved questions in the text of the RFC.

I think it's best to put the implementation on hold until we resolve those design questions.

In private conversation Kevin is pretty adamant that the "use print to any data" scenario is under-represented in the discussion and that we shouldn't introduce a print taking just a string if we don't address this scenario - which would require solving the much larger and much harder question of "should print be generic and if so what is its specification". This is so fundamental that it will be hard to resolve it, and unfortunately we may need to put this RFC on ice.

This is partly my fault, because in the design process I didn't adequately predict or list out the scope of the unresolved issues in the RFC.

dsyme avatar Aug 16 '22 21:08 dsyme