fslang-suggestions icon indicating copy to clipboard operation
fslang-suggestions copied to clipboard

Add dedicated method for ranges translation in CE

Open gsomix opened this issue 4 years ago • 8 comments

Addition of InlineIfLambda attribute in F# 6 opened the way for high-performance computation expressions (imagine one for zero allocation ZString builder). But we can do better. I propose we change translation rules to allow optimisations of for-loops in CE.

Consider following code using list.fs builder:

listc { for i = 0 to 10 do i*i }

According to F# spec compiler uses following rules to translate this expression:

T(for x = e1 to e2 do ce, V, C, q) = T(for x in e1 .. e2 do ce, V, C, q) (*)
T(for x in e do ce, V, C, q) = T(ce, V ⊕ {x}, λv.C(b.For(src(e), fun x -> v)), q) (**)

where e1 .. e2 (*) is range operator from standard library that creates sequence of values and For (**) is defined as:

member inline b.For(
    sequence: seq<'TElement>, 
    [<InlineIfLambda>] body: 'TElement -> ListBuilderCode<'T>) 
    : ListBuilderCode<'T> =
    b.Using (sequence.GetEnumerator(), 
        (fun e -> b.While((fun () -> e.MoveNext()), (fun sm -> (body e.Current).Invoke &sm))))

So simple for-loop allocates sequence and calls Using method that might be undesirable in high-performance code. I propose to add new translation rule:

T(for x in e1 .. e2 do ce, V, C, q) = T(for x in range(e1, e2) do ce, V, C, q)

where range denotes b.Range(e1, e2) if builder b contains Range method. Otherwise, range(e1, e2) denotes e1 .. e2.

It allows to implement same optimisation compiler does for loops where

for x in 1..10 do
    printfn $"{x}"

compiles into effective while loop.

Let's draft it! First add Range method:

member inline b.Range(e1: int, e1: int) = Range(Index(e1), Index(e2))

Then, add For overload:

member inline b.For(
    range: Range, 
    [<InlineIfLambda>] body: int -> ListBuilderCode<'T>)  =
    ListBuilderCode<_>(fun sm -> 
        for i = range.Start to range.End do
            (body i).Invoke &sm)

The existing way of approaching this problem in F# is override range operator (..). Surprisingly CE uses operator from the context!

Pros and Cons

The advantage of making this adjustment to F# is allowing more high-performance scenarios for CE.

The disadvantages of making this adjustment to F# are increasing complexity of translations rules and compiler.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): L

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • [x] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • [x] I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • [x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • [x] This is not a breaking change to the F# language design
  • [x] I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the :+1: emoji on this issue. These counts are used to generally order the suggestions by engagement.

gsomix avatar Feb 14 '22 17:02 gsomix

What about ranges with steps?

Happypig375 avatar Feb 15 '22 02:02 Happypig375

Here's an example of that form: Fantomas AST

And the specific AST nodes for the ForEach (in a modified JSON form for collapsibility):

Minimized AST for step-range for-each expressions
{
    "ForEach": {
        "forDebugPoint": "yes",
        "seqExprOnly": false,
        "isFromSource": true,
        "pat": {
            "Named": {
                "ident": "i",
                "isThisVal": false,
                "accessibility": null
            }
        },
        "enumExpr": {
            "IndexRange": {
                "expr1": {
                    "IndexRange": {
                        "expr1": {
                            "Const": {
                                "Int32": 0
                            }
                        },
                        "expr2": {
                            "Const": {
                                "Int32": 2
                            }
                        }
                    }
                },
                "expr2": {
                    "Const": {
                        "Int32": 100
                    }
                }
            }
        },
        "bodyExpr": {
            "YieldOrReturn": {
                "flags": [
                    true,
                    false
                ],
                "expr": {
                    "Ident": "i"
                }
            }
        }
    }
}

It seems like a ForEach member could be exposed that is provided the start/step/end as well for customized iterations?

baronfel avatar Feb 15 '22 15:02 baronfel

I think that effectively iterating a collection or a span or range would be a very neat feature for CEs.

I made a few array pool based builders and For is the only place they still allocate I believe. Such feature is definitely a big improvement for perf-oriented CEs

En3Tho avatar Jun 19 '22 09:06 En3Tho

I did a very quick spike on this on this branch: baronfel/fsharp - range-ce-member. the Range member + For member overload approach seems to work well (though as noted above we'd need a solution for steps).

I did a bit of manual testing with the following builder and CE:

builder + CE with the new members
open System
open System.Collections.Generic
open Microsoft.FSharp.Quotations

[<Struct; NoEquality; NoComparison>]
type ArrayCollector<'T> =
    [<DefaultValue(false)>]
    val mutable ResizeArray: ResizeArray<'T>

    [<DefaultValue(false)>]
    val mutable First: 'T

    [<DefaultValue(false)>]
    val mutable Second: 'T

    [<DefaultValue(false)>]
    val mutable Count: int

    member this.Add (value: 'T) = 
        match this.Count with 
        | 0 -> 
            this.Count <- 1
            this.First <- value
        | 1 -> 
            this.Count <- 2
            this.Second <- value
        | 2 ->
            let ra = ResizeArray<'T>()
            ra.Add(this.First)
            ra.Add(this.Second)
            ra.Add(value)
            this.Count <- 3
            this.ResizeArray <- ra
        | _ -> 
            this.ResizeArray.Add(value)

    member this.AddMany (values: seq<'T>) =
        if this.Count > 2 then
            this.ResizeArray.AddRange(values)
        else
            // cook a faster iterator for lists and arrays
            match values with 
            | :? ('T[]) as valuesAsArray -> 
                for v in valuesAsArray do
                   this.Add v
            | :? ('T list) as valuesAsList -> 
                for v in valuesAsList do
                   this.Add v
            | _ ->
                for v in values do
                   this.Add v

    member this.AddManyAndClose (values: seq<'T>) =
        this.AddMany(values)
        this.Close()

    member this.Close() =
        match this.Count with 
        | 0 -> Array.Empty<'T>()
        | 1 -> 
            let res = [| this.First |]
            this.Count <- 0
            this.First <- Unchecked.defaultof<_>
            res
        | 2 -> 
            let res = [| this.First; this.Second |]
            this.Count <- 0
            this.First <- Unchecked.defaultof<_>
            this.Second <- Unchecked.defaultof<_>
            res           
        | _ ->
            let res = this.ResizeArray.ToArray()
            this <- ArrayCollector<'T>()
            res

[<Struct; NoEquality; NoComparison>]
type ListBuilderCollector<'T> =
    [<DefaultValue(false)>]
    val mutable Collector: ArrayCollector<'T>

    member sm.Yield(value: 'T) = sm.Collector.Add(value)

    member sm.ToList() = sm.Collector.Close()

type ListBuilderCode<'T> = delegate of byref<ListBuilderCollector<'T>> -> unit

type ListBuilderViaCollector() =

    member inline _.Delay([<InlineIfLambda>] f: unit -> ListBuilderCode<'T>) : ListBuilderCode<'T> =
        ListBuilderCode<_>(fun sm -> (f ()).Invoke &sm)

    member inline _.Zero() : ListBuilderCode<'T> = ListBuilderCode<_>(fun _sm -> ())

    member inline _.Combine
        (
            [<InlineIfLambda>] part1: ListBuilderCode<'T>,
            [<InlineIfLambda>] part2: ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            part1.Invoke &sm
            part2.Invoke &sm)

    member inline _.While
        (
            [<InlineIfLambda>] condition: unit -> bool,
            [<InlineIfLambda>] body: ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            while condition () do
                body.Invoke &sm)

    member inline _.TryWith
        (
            [<InlineIfLambda>] body: ListBuilderCode<'T>,
            [<InlineIfLambda>] handler: exn -> ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            try
                body.Invoke &sm
            with
            | exn -> (handler exn).Invoke &sm)

    member inline _.TryFinally
        (
            [<InlineIfLambda>] body: ListBuilderCode<'T>,
            compensation: unit -> unit
        ) : ListBuilderCode<'T> =
        ListBuilderCode<_> (fun sm ->
            try
                body.Invoke &sm
            with
            | _ ->
                compensation ()
                reraise ()

            compensation ())

    member inline b.Using
        (
            disp: #IDisposable,
            [<InlineIfLambda>] body: #IDisposable -> ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        // A using statement is just a try/finally with the finally block disposing if non-null.
        b.TryFinally(
            (fun sm -> (body disp).Invoke &sm),
            (fun () ->
                if not (isNull (box disp)) then
                    disp.Dispose())
        )

    member inline b.For
        (
            sequence: seq<'TElement>,
            [<InlineIfLambda>] body: 'TElement -> ListBuilderCode<'T>
        ) : ListBuilderCode<'T> =
        b.Using(
            sequence.GetEnumerator(),
            (fun e -> b.While((fun () -> e.MoveNext()), (fun sm -> (body e.Current).Invoke &sm)))
        )
    
    member inline b.For(
        range: Range, 
        [<InlineIfLambda>] body: int -> ListBuilderCode<'T>)  =
        ListBuilderCode<_>(fun sm -> 
            for i = range.Start.Value to range.End.Value do
                (body i).Invoke &sm)

    member inline _.Yield(v: 'T) : ListBuilderCode<'T> =
        ListBuilderCode<_>(fun sm -> sm.Yield v)

    member inline b.YieldFrom(source: IEnumerable<'T>) : ListBuilderCode<'T> =
        b.For(source, (fun value -> b.Yield(value)))

    member inline _.Range(start:int, finish: int) = System.Range(Index.op_Implicit start, Index.op_Implicit finish)

    // member _.Quote(y) = y

    // member inline _.Run(code: Expr<ListBuilderCode<'T>>) = 
    //     printfn "%A" code

    member inline _.Run([<InlineIfLambda>] code: ListBuilderCode<'T>) : 'T [] =
        let mutable sm = ListBuilderCollector<'T>()
        code.Invoke &sm
        sm.ToList()


let b = ListBuilderViaCollector()

// b {
//     for x in 0..10 do 
//         printfn $"{x}"
// }

//desugars to

let bs: unit[] = 
    b.Run(
        b.Delay(fun () -> 
            b.For(
                b.Range(0, 10), 
                fun i -> 
                    printfn $"%d{i}"
                    b.Zero()
            )
        )
    )

I had to manually desugar the CE (using the commented-out quote member) to get something I could put into sharplab to see the results of.

Here's what the desugaring ends up compiling to:

range@162 = new Range(0, 10);
range@162-1 = @_.range@162;
int num = [email protected];
int value = [email protected];
if (value >= num)
{
    while (true)
    {
        object[] array = new object[1];
        array[0] = num;
        PrintfFormat<Unit, TextWriter, Unit, Unit> format = new PrintfFormat<Unit, TextWriter, Unit, Unit, int>("%d%P()", array, null);
        PrintfModule.PrintFormatLineToTextWriter(Console.Out, format);
        num++;
        if (num == value + 1)
        {
            break;
        }
    }
}
bs@196 = [email protected]();

which is a nice, compact while loop, which I think was the intent. If there's interest I'm happy to continue exploring/flesh out this design.

baronfel avatar Jun 19 '22 16:06 baronfel