fsharp icon indicating copy to clipboard operation
fsharp copied to clipboard

Seq.empty renders as `"EmptyEnumerable"`

Open bartelink opened this issue 1 year ago • 7 comments

Seq.empty<MyType> renders as the string "EmptyEnumerable" when fed through AspNetCore 8 ObjectResult

The same does not happen for Seq.choose and other such functions when they yield an empty sequence, as they yield a different internal impl type wrapping their output, which in my context winds up producing [] as required

Expected behavior

render as []

Actual behavior

renders as "EmptyEnumerable"

Known workarounds

use Array.empty<MyType> instead

Related information

  • happen to be using JSON.NET and a random mix of middlewares and don't have time to root cause deeper, but it surprised me
  • MyType has a JsonConverter registered
  • Did not yet try with System.Linq.Enumerable.Empty<MyType>()

bartelink avatar Oct 10 '24 07:10 bartelink

I realize this is horrendously incomplete, but I didn't find the condition to be easy to search for, and saying nothing is not ideal either.

If it was me fixing as a quick win without root causing / understanding the required interfaces/moving parts more deeply, I'd make Seq.empty delegate to Array.empty (which is my actual workaround for now), but obviously there's plenty to be won from an optimal implementation too.

Here's hoping I eventually get to the point of circling back with a trimmed down repro that identifies whether this is a JSON.NET problem, a JsonConverter+JSON.NET problem, or only solvable by implementing a relevant interface on EmptyEnumerable

bartelink avatar Oct 10 '24 08:10 bartelink

Interesting:

> let es: seq<int> = [];;
val es: int seq = []

> let es2: seq<int> = Seq.empty;;
val es2: int seq

> es;;
val it: int seq = []

> es2;;
val it: int seq = seq []

Martin521 avatar Oct 10 '24 08:10 Martin521

Yeah, it'd definitely be a more widespread problem if it was something fundamental. There are a lot of potential things in play with ObjectResult, JSON.NET and JsonConverters in the picture - JSON.NET covers a heck of a lot of scenarios and has plenty idiosyncracies so not ruling anything out. The biggest thing I can do to narrow it down is use Enumerable.Empty instead in my context and see whether it happens to work correctly, but this unfortunately hasn't made it to the top of the TODO list atm....

bartelink avatar Oct 10 '24 08:10 bartelink

I don't think it is in aspnetcore.

> let es: seq<int> = [];;
val es: int seq = []

> let es2: seq<int> = Seq.empty;;
val es2: int seq

> printfn $"{es}";;
[]
val it: unit = ()

> printfn $"{es2}";;
Microsoft.FSharp.Collections.IEnumerator+EmptyEnumerable`1[System.Int32]
val it: unit = ()

But tbh I find EmptyEnumerable more correct than [].

Martin521 avatar Oct 10 '24 10:10 Martin521

I'm not ruling anything in or out In my context, [] is much more 'correct' than a sequence mapping to a string JSON value The specific reasons why fsi will render in a given manner are definitely something that's interesting and relevant, but for me isn't an open and shut answer either.

> printfn $"{System.Linq.Enumerable.Empty<unit>()}"
System.Linq.EmptyPartition`1[Microsoft.FSharp.Core.Unit]

(But I have yet to test how that renders either - All I know for a fact is that Array.empty renders equivalent to the wrapper that Seq.choose yields)

bartelink avatar Oct 10 '24 10:10 bartelink

F# type checking infers es in the above example to be of type int list and es2 to be of type ÈmptyEnumerable. I don't think there is anything wrong with it. I guess that in the case that you mention above, using Seq.choose, there is a reason why type checking infers a list or array.

Martin521 avatar Oct 10 '24 16:10 Martin521

Yeah, it's not entirely/only about the static type from the perspective of the compiler though - ultimately ObjectResult is not strongly typed (the data is an obj), so it's about the interfaces and other aspects of the wrapper type; that's what dictates how JSON.NET and/or AspNetCore's wiring ends up rendering the result. (Seq.empty casts the internalt wrapper type to seq<'t>, but that turns out not to be enough)

bartelink avatar Oct 10 '24 16:10 bartelink

I've tried it with TypedResults.Ok(Seq.empty) and it fails with the exception.

System.NotSupportedException: F# discriminated union serialization is not supported. Consider authoring a custom converter for the type. The unsupported member type is located on type 'Microsoft.FSharp.Collections.IEnumerator+EmptyEnumerable`1[System.Object]'.

While I also expect it to be serialized to [], so my suggestion is to change single case DU to Array.empty in FSharp.Core implementation

Lanayx avatar Nov 26 '24 16:11 Lanayx

For F# DUs which also implement IEnumrable, S.T.J could use IEnumerableDefaultConverter perharps?

Right now, F# DUs are not supported, no matter what the interface. (even though Lists and Options are also DUs, but they are being special cased by the runtime type detection)

https://github.com/dotnet/runtime/blob/aba819956355c47a9dddbcba8a1aca591f176166/src/libraries/System.Text.Json/src/System/Text/Json/Serialization/Converters/FSharp/FSharpTypeConverterFactory.cs#L69

T-Gro avatar Nov 29 '24 12:11 T-Gro

It is a leaky abstraction due to loss of type (obj). So even though the compile-time type is just seq<T> and that should work, the loss of types means that reflection kicks in and the runtime type (implementation detail) is being picked.

T-Gro avatar Nov 29 '24 12:11 T-Gro

The point here though is that independent of STJ vs any other serialization mechanism being wired into the default rendering of an ObjectResult or similar abstractions in ASP.NET, mainstream expectations of how to render an obj that happens to be enumerable are a pretty loose contract. And that contract happens to be satisfied by Array.empty due to the particulars of it's implementation, whereas for Seq.empty, it is not. The casting vote for me would be whether Enumerable.Empty<'t> renders correctly as [] vs throwing and/or rendering as the string "EmptyEnumerable"

bartelink avatar Nov 29 '24 12:11 bartelink

Currently, Seq.empty is rendering "correctly" as EmptyEnumerable from the perspective of STJ (it chooses to take a type name as its representation), since it's a single-case DU:

// seqcore.fs:

[<NoEquality; NoComparison>]
type EmptyEnumerable<'T> =

    | EmptyEnumerable

    interface IEnumerable<'T> with
        member _.GetEnumerator() = Empty<'T>()

    interface IEnumerable with
        member _.GetEnumerator() = (Empty<'T>() :> IEnumerator)

// seq.fs:

let empty<'T> = (EmptyEnumerable :> seq<'T>)

Changing implementation of Seq.empty would be a breaking change.

vzarytovskii avatar Nov 29 '24 12:11 vzarytovskii

I get that. But for me STJ and its specfic handling is a red herring here

The point is more that you don't expect as an ASP.NET user to get different results for [||]/[]/Seq.empty/Array.empty.

This manifested for me as a very strange bug where I had a Controller body that was doing a Seq.choose and working, but the special case returning Seq.empty acted differently

My hackaround was to use Array.empty. List.empty would have worked too. The reason I logged this is that I'd be surprised if such coincidences would happen if Seq.empty's internal impl e.g. delegated to Enumerable.Empty.

Changing implementation of Seq.empty would be a breaking change.

As in making it return a different impl type, or giving it different semantics? I accept that one shouldn't just change the impl for a laugh given the subtle effects that any such change in such a core type would have. But by the same token, I bet Enumerable.Empty has had internal change and churn e.g. as part of perf tuning over time.

bartelink avatar Nov 29 '24 13:11 bartelink

The reason I logged this is that I'd be surprised if such coincidences would happen if Seq.empty's internal impl e.g. delegated to Enumerable.Empty.

Yeah, it probably gets confused as it's not get upcasted to IEnumerable<_>. I wondering what would happen if you pass it as Seq.empty :> IEnumerable<_>?

As in making it return a different impl type, or giving it different semantics? I accept that one shouldn't just change the impl for a laugh given the subtle effects that any such change in such a core type would have. But by the same token, I bet Enumerable.Empty has had internal change and churn e.g. as part of perf tuning over time.

Mostly first. Different return type will make it backwards incompatible (mostly talking about binary compatibility).

vzarytovskii avatar Nov 29 '24 13:11 vzarytovskii

Yeah, it probably gets confused as it's not get upcasted to IEnumerable<>. I wondering what would happen if you pass it as Seq.empty :> IEnumerable<>?

ASP.NET Has an internal ObjectResult that carries it as a 'boxed' obj. I'm sure stuff works fine if you let type inference have the result type end up as Task<'T>.

Different return type will make it backwards incompatible (mostly talking about binary compatibility).

OK but in this case the static result type is seq<'t>, and the proposal is not to change that - it's about the internal type that's yielded and the additional interfaces/characteristics that become apparent via other mechanisms. I believe the impl type in question is internal so there won't be a dependency on that, but it's easy to imagine other random dependencies like expectations of other related interfaces being implemented as you suggest.

bartelink avatar Nov 29 '24 13:11 bartelink

Somebody produces an object that has seq<_> / IEnumerable<_> as runtime type. If this sequence is empty, than the correct ToString() representation is EmptyEnumerable. Why should it render as [] or [| |]? It is not a list nor array. So, the issue is not in how an empty seq is rendered, but that somebody somewhere made it a (runtime!) seq rather than a list or array. (Or, if we are not talking about ToString(), but about JSON, it is an issue of the json library.)

Martin521 avatar Nov 29 '24 13:11 Martin521

You can go make different issues if you like. My problem that made me log this is that:

[<HttpGet>
member _.Thing(): Task<IActionResult> = task {
    if ... then return source |> Seq.choose f |> ObjectResult
    else return Seq.empty |> ObjectResult }

Causes a suprising and annoying result, regardless of who did what right or wrong (i.e. it works fine for all the test cases until you fall into the null handling, and then you get the strangest outcome.

And that I suspect that replacing Seq.empty with System.Linq.Enumerable.Empty() will fix it. Yes there are probably plenty other ASP.NET IActionResult impls that won't expose this problem, but in general that's a tiny impl detail off in some utils lib that I don't want to have front of mind.

That's an annoying real world usage thing which can be fixed either in ASP.NET or in FSharp.Core. And I don't see a PR that special cases EmptyEnumerable going into the ASP.NET acceptance test suite happening.

I won't consider this issue closed until such time as the above Just Works as it should.

bartelink avatar Nov 29 '24 13:11 bartelink

I really think this has to do with something (S.T.J. most likely, being used by aspnetcore) using a runtime detection. And when it sees an F# DU at runtime (even though the compile-time type is just seq), it simply calls .ToString().

Whereas if it gets any other seq, which just happens not to be a DU (which should be an implementation detail, but the implementation leaks by runtime inspection via reflection!), it uses a different codepath to render it.

If anyone can confirm that aspnetcore uses S.T.J to produce ObjectResult's string representation, and that it fallbacks to .ToString() when it cannot find a converter - we can prepare a specific fix for it. (and the fix will help any other F# DU which implements IEnumerable, not just a single one)

T-Gro avatar Nov 29 '24 14:11 T-Gro

In my case (mentioned in the OP, the rendering is via Newtonsoft.Json). In Lanayx's case, it's STJ.

The default OOTB has been STJ for a while, i.e. you depend on and opt into a backcompat package to use NSJ

There's a decent chance that in BOTH cases you end up in the same situation: Array.empty just works, Seq.empty just fails.

It may well be that the model binding/IActionResult impls stuff could be more generic and/or less flaky

But, the bottom line is still for me that if Seq.empty did not internally happen to be implemented as an interface on a DU, we would not be having this conversation

bartelink avatar Nov 29 '24 14:11 bartelink

I really think this has to do with something (S.T.J. most likely, being used by aspnetcore) using a runtime detection. And when it sees an F# DU at runtime (even though the compile-time type is just seq), it simply calls .ToString().

In the above example it is not related to a DU, but already happens in Seq.empty |> ObjectResult. ObjectResult is not generic and stores the empty sequence as obj. So, the issue is clearly not in FSharp.Core, but in STJ.

STJ is well known to not well support F# DUs. That's why packages like FSharp.SystemTextJson exist.

Unfortunately, the discussion there is not going well.

Martin521 avatar Nov 29 '24 15:11 Martin521

But, the bottom line is still for me that if Seq.empty did not internally happen to be implemented as an interface on a DU, we would not be having this conversation

I think a reasonable boundary is: A language agnostic library doing runtime inspection should not dictate internal implementation details of a standard library. Seq.empty meets all contracts of IEnumerable<> and is fully serializable (in one direction at least) just like any other serializable.

The fact that specific libraries are choosing to detect its's metadata (attributes on a reflection-inspected runtime type, which is even internal!) and handle it differently then any other IEnumerable is something fixable and we can support any such fix if desired.

T-Gro avatar Nov 29 '24 15:11 T-Gro

Those are benefits of changing SDU to array in F# Core:

  • Fixes related bugs in all current and future serialization libraries
  • Improves serialization performance (since serializer can detect array and use .Length instead of trying to enumerate collection)
  • Improves .ToArray performance (which is a very often operation)
  • Improves new users experience, they won't have "surprising bugs"
  • using DU for a collection type doesn't make sense anyway, so it's better for future supportability

Downsides:

  • Exceptions in code that used private reflection
  • Exceptions in code that used binary serializer and stored that somewhere (deprecated since .NET 8)

So in my mind trying to protect the buggy code shouldn't be the goal, while increased stability in performance should be. After all it's Liskov substitution principle - once you expose interface, you are free to change implementation details.

Lanayx avatar Nov 29 '24 16:11 Lanayx

So in my mind trying to protect the buggy code shouldn't be the goal

You mean one in stj and co? Agree.

EmptyEnumerable IS enumerable. As much enumerable as List, Array, or any other which implements IEnumerable<_>

vzarytovskii avatar Nov 29 '24 16:11 vzarytovskii

NSJ has been serializing JSON results in ASP.NET for a long time Can we leave STJ out of it please. (And FSharp.STJ is whether or not that fixes matters again is not relevant either)

There are many ways I can fix it; the point is out of the box for Newtonsoft it has a surprising trap.

I'm not saying that anyone broke any law But I did not get to fall into the pit of success If the impl happened to be more direct than an interace on a DU, then I would have

I don't care how the problem is fixed, but know that the problem from my perspective is "F# has a funny internal wart that happens to cause a real problem for a real usage". I was seeking to imply "and we should probably remove that unless there's a really good reason for it, or Enumerable.Empty has the same problem". I am not interested in any legal discussion; I fully understand that Seq.empty returns seq<'t> and that it is entitled to do so as it sees fit.

I am also not trying to say that web frameworks that use obj internally are the last word. But they are common.

bartelink avatar Nov 29 '24 17:11 bartelink

"F# has a funny internal wart that happens to cause a real problem for a real usage".

If you look at the generated IL, it's just a plain empty class implementing the IEnumerable interface. The whole reason this happens is that some libraries are voluntarily choosing to look for [CompilationMapping(SourceConstructFlags.SumType)] attribute on the class, which is just metadata. I don't think there is anything else funny about it for a .NET library.

"and we should probably remove that unless there's a really good reason for it, or Enumerable.Empty has the same problem".

A change has to justify itself, not the other way around. @Lanayx mentioned some other arguments - if they can be demonstrated and they indeed bring a notice-able benefit, I would personally be more open to the change. For example a demonstratable better perf in common scenarios (examples: plain creation+return, LINQ's .ToArray(), Seq.toArray, Seq.toList, serialization via a chosen tool, iteration).

Those might all be good arguments and will justify the change. Pleasing a library which is effectively doing runtime detection of an attribute does not make it as a reason in my opinion.

T-Gro avatar Nov 29 '24 17:11 T-Gro

For example a demonstratable better perf in common scenarios (examples: plain creation+return, LINQ's .ToArray(), Seq.toArray, Seq.toList, serialization via a chosen tool, iteration).

To be fair, it's hard to imagine improvements can be noticeable, especially at runtime when JIT kicks in. Will be curious too see what are those potential improvements as well. Also need to think of there will be any implications in sigdata/inlining when we build versus one fslib, and run with another version. Probably not, but will need verifying.

Regarding "funny internal wart" - this is definitely not what's happening. Codegen is nothing special, just certain libraries serialize things in certain way.

vzarytovskii avatar Nov 29 '24 18:11 vzarytovskii

Regarding "funny internal wart" - this is definitely not what's happening. Codegen is nothing special, just certain libraries serialize things in certain way.

I get that the codegen is nothing special, and that the IEnumerable impl is fine. My only point is that Enumeable.Empty is unlikely to fall prey to this. The 'funny internal wart' is the observed effect - two different impls that dont work out of the box. And the one and only problem is "F# surprisingly OOTB has a wacky thing that happens when you try to use it for normal programming returning an empty sequence".

And the perf has nothing to do with it. I was just suggesting that Enumerable.Empty impl has had perf treatments over time even though its ostensibly a very simple thing that doesnt justify messing with. I personally would settle for Seq.empty delegating to Array.empty and calling it a day myself! (Mainly kidding - I know there are plenty reasons why that wouldn't Just Work)

bartelink avatar Nov 29 '24 18:11 bartelink

Regarding "funny internal wart" - this is definitely not what's happening. Codegen is nothing special, just certain libraries serialize things in certain way.

I get that the codegen is nothing special, and that the IEnumerable impl is fine. My only point is that Enumeable.Empty is unlikely to fall prey to this. The 'funny internal wart' is the observed effect - two different impls that dont work out of the box. And the one and only problem is "F# surprisingly OOTB has a wacky thing that happens when you try to use it for normal programming returning an empty sequence".

And the perf has nothing to do with it. I was just suggesting that Enumerable.Empty impl has had perf treatments over time even though its ostensibly a very simple thing that doesnt justify messing with. I personally would settle for Seq.empty delegating to Array.empty and calling it a day myself! (Mainly kidding - I know there are plenty reasons why that wouldn't Just Work)

I guess we're going in circles here. I understand that serialisation is unexpected for something out-of-the-box in F#. I understand that either we or the library will have to do something. I'm just trying to say that we should really think good about what we want to do, and think about potential side-effects and implications for existing and old tooling.

vzarytovskii avatar Nov 29 '24 18:11 vzarytovskii

Serialization benchmark using System.Text.Json.JsonSerializer.Serialize, almost double memory win with array

let seqEmpty = Seq.empty<int>
let arrEmpty = Array.empty :> seq<int>

[<Benchmark>]
member this.SeqEmpty() =
    for _ in 1..1000 do
        System.Text.Json.JsonSerializer.Serialize(seqEmpty) |> ignore

[<Benchmark>]
member this.ArrayEmpty() =
    for _ in 1..1000 do
        System.Text.Json.JsonSerializer.Serialize(arrEmpty) |> ignore
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4460/23H2/2023Update/SunValley3)
AMD Ryzen 5 5600H with Radeon Graphics, 1 CPU, 12 logical and 6 physical cores
.NET SDK 9.0.100
  [Host]     : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2 DEBUG
  DefaultJob : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2


| Method     | Mean     | Error    | StdDev   | Gen0   | Allocated |
|----------- |---------:|---------:|---------:|-------:|----------:|
| SeqEmpty   | 90.78 us | 1.681 us | 1.651 us | 6.5918 |  54.69 KB |
| ArrayEmpty | 89.08 us | 1.742 us | 1.544 us | 3.7842 |  31.25 KB |

Interestingly System.Text.Json.JsonSerializer.Serialize(Seq.empty) gives "[]"

Lanayx avatar Dec 02 '24 03:12 Lanayx

Suprisingly Seq.toArray works better for Seq.empty, but I believe it could be fixed for this special case

let seqEmpty = Seq.empty<int>
let arrEmpty = Array.empty :> seq<int>

[<Benchmark>]
member this.SeqEmpty() =
    for _ in 1..1000 do
        seqEmpty |> Seq.toArray |> ignore

[<Benchmark>]
member this.ArrayEmpty() =
    for _ in 1..1000 do
        arrEmpty |> Seq.toArray |> ignore
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4460/23H2/2023Update/SunValley3)
AMD Ryzen 5 5600H with Radeon Graphics, 1 CPU, 12 logical and 6 physical cores
.NET SDK 9.0.100
  [Host]     : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2 DEBUG
  DefaultJob : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2


| Method     | Mean      | Error     | StdDev    | Gen0   | Allocated |
|----------- |----------:|----------:|----------:|-------:|----------:|
| SeqEmpty   |  5.944 us | 0.1010 us | 0.0789 us | 2.8687 |  23.44 KB |
| ArrayEmpty | 61.335 us | 0.9962 us | 0.9318 us | 2.8076 |  23.44 KB |

UPDATED: Indeed, the fixed version is much faster with zero allocations, current implemention array.Clone() is extremely slow

module My =
    let toArray (source: seq<'T>) =
        match source with
        | :? ('T array) as res -> res.ToArray()
        | _ -> source |> Seq.toArray
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4460/23H2/2023Update/SunValley3)
AMD Ryzen 5 5600H with Radeon Graphics, 1 CPU, 12 logical and 6 physical cores
.NET SDK 9.0.100
  [Host]     : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2 DEBUG
  DefaultJob : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2


| Method          | Mean      | Error     | StdDev    | Gen0   | Allocated |
|---------------- |----------:|----------:|----------:|-------:|----------:|
| SeqToArray      |  6.263 us | 0.0795 us | 0.0664 us | 2.8687 |   24000 B |
| ArrayToArray    | 61.131 us | 1.2160 us | 1.0780 us | 2.8076 |   24000 B |
| FixedArrToArray |  4.865 us | 0.0883 us | 0.0826 us |      - |         - |

Lanayx avatar Dec 02 '24 03:12 Lanayx