FSharpPlus icon indicating copy to clipboard operation
FSharpPlus copied to clipboard

Int32.TryParse and F#+ tryParse give different results

Open abelbraaksma opened this issue 2 years ago • 6 comments

Description

Basically (tested with culture on en-US).

> let value: int option = tryParse "123,45";;
val value: int option = Some 12345

> Int32.TryParse "123,45";;
val it: bool * int = (false, 0)

In other words, integers cannot have thousand-separators with Int32.TryParse, but can with tryParse.

Conversely, it doesn't allow decimals, which is good:

> let value: int option = tryParse "123.45";;
val value: int option = None

> Int32.TryParse "123.45";;
val it: bool * int = (false, 0)

Likewise there's a similar difference with e-notation:

> let value: int option = tryParse "12345e4";;
val value: int option = Some 123450000

> Int32.TryParse "12345e4";;
val it: bool * int = (false, 0)

Repro steps

See above.

Expected behavior

From the description of the method, you'd expect the same behavior as the .NET functions, but they allow certain illegal values, or at least ambiguous, considering how .NET parses numbers.

Actual behavior

See above. This is caused by NumberStyles.Any, which is not used by Int32.TryParse or Decimal.TryParse. These use NumberStyles.Integer and NumberStyles.Decimal respectively (see source of .NET), which limit the allowed range.

Known workarounds

Create your own types or call Int32.TryParse directly, or create your own helper function.

Related information

Most recent F#+ and tested with .NET 6, but assuming it is the same on .NET 7.

abelbraaksma avatar Mar 27 '23 14:03 abelbraaksma

Related: regardless of whether we'd fix this issue or not, perhaps we can update the documentation to specify that parse and tryParse uses InvariantCulture. Currently, it not being mentioned, may lead people to believe it works the same as XXX.Parse/TryParse from the BCL, that is, that it changes behavior depending on the current culture.

abelbraaksma avatar Mar 27 '23 14:03 abelbraaksma

Yes, I think it makes sense to have it documented as using invariant culture.

wallymathieu avatar Mar 28 '23 17:03 wallymathieu

As always with parsers there's the question of how smart my parser should be, where do you draw the line?

Our current parsers uses NumberStyles.Any which makes them very flexible, you could argue they should be smarter and make them able to parse "thirtyseven" as 37, or make them very strict and expect "clean" numbers, without any , or . separator, or redundant + signs.

The problem is, parsers are kind of the inverse of printers, but it's not bijective as there are many ways to print and many ways to parse.

We can only implement one way to print, however making parsers so strict that they only can parse that specific printing way could be a nice property but a pain to work with in real world scenarios.

So we can come up with this property: print >> parse = id (and we could add FsChecks for it) which doesn't imply parse >> print = id.

How different styles we allow? That's an interesting discussion (thanks for kicking it off), I think we should think in practical cases.

Then as you say we have the compatibility problem, if we eventually decide to make them more strict, but we can always implement a parseStrict something like ParseExact in order to satisfy cases where the developer is more interested in validating than getting results as far as practical.

gusty avatar Apr 02 '23 16:04 gusty

Our current parsers uses NumberStyles.Any

While I agree with everything you say, the problem with Any is that invalid numbers are parsed, which I don't think is part of what most people consider valid input.

Since this is about calling the parse methods of .NET, I think, apart from forcing Invariant Culture, that we should stay as close as possible to what .NET itself implements, to stick with the "principle of least suprise".

Certainly, creating invalid parsing functions will not have been the goal here. But changing this leads to a backwards compat issue. I'd vote for your proposal with parseStrict.

Note that this is not the same as ParseExact, as the standard Parse functions on number types are quite strict already. The only big problem with these is the absence of invariant culture in the default behavior.

abelbraaksma avatar Apr 17 '23 20:04 abelbraaksma

OK, let's go with parseStrict, then the question remains for V2, I mean we can do breaking changes then but still which should be the default ? We have some time to think about it.

gusty avatar Apr 18 '23 05:04 gusty

I would set the defaults to what I wrote before:

we should stay as close as possible to what .NET itself implements, to stick with the "principle of least suprise".

not sure if parseStrict is the ideal choice, but I can't think of anything better right now.

abelbraaksma avatar Jul 03 '23 12:07 abelbraaksma