fslang-suggestions icon indicating copy to clipboard operation
fslang-suggestions copied to clipboard

Support `and!` for task based CE

Open Lanayx opened this issue 1 year ago • 18 comments

I propose we add support and! for task based CE

The use case is the following - we need to load several pieces of data that are not interdependent to combine them and use further on in program. Rather than doing those calls sequentially one after another it is beneficial to perform those calls concurrently and continue when all the pieces are loaded - performance optimization. Actually, this is even covered in documentation although not implemented https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/computation-expressions#and

So basically it would be possible to simplify this code

task {
    let tOne= getData1()
    let tTwo= getData2()
    let! results = Task.WhenAll([| tOne; tTwo|])
    let one = tOne.Result
    let two = tOne.Result
    ...
}

into this code

task {
    let! one = getData1()
    and! two = getData2()
    ...
}

More examples

The existing way of approaching this problem in F# is

  1. the first snippet from above
  2. starting all tasks first and them awaiting them one by one
task {
    let tOne = getData1()
    let tTwo = getData2()
    let! one = tOne
    let! two = tTwo
    ...
}

Pros and Cons

The advantages of making this adjustment to F# are clear and concise way to concurrently await multiple tasks

The disadvantages of making this adjustment to F# are i don't see disadvantages

Extra information

Estimated cost (XS, S, M, L, XL, XXL): S

Related suggestions: (put links to related suggestions here)

Affidavit (please submit!)

Please tick these items by placing a cross in the box:

  • [x] This is not a question (e.g. like one you might ask on StackOverflow) and I have searched StackOverflow for discussions of this issue
  • [x] This is a language change and not purely a tooling change (e.g. compiler bug, editor support, warning/error messages, new warning, non-breaking optimisation) belonging to the compiler and tooling repository
  • [x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it
  • [x] I have searched both open and closed suggestions on this site and believe this is not a duplicate

Please tick all that apply:

  • [x] This is not a breaking change to the F# language design
  • [x] I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the :+1: emoji on this issue. These counts are used to generally order the suggestions by engagement.

Lanayx avatar May 21 '24 15:05 Lanayx

More discussion (for async, but similar ideas apply):

  • #983
  • https://github.com/dotnet/fsharp/discussions/11043
  • https://github.com/TheAngryByrd/IcedTasks/pull/7

brianrourkeboll avatar May 21 '24 19:05 brianrourkeboll

Why not just write:

task {
    let! one = getData1()
    let! two = getData2()
    ...
}

There's no need to first assign a task to a variable and then let-bang bind it. You can do that in one go.

clear and concise way to deal with launching parallel tasks

Maybe start with use-case like that? It's not quite clear from your main description, right now ;). But before you update anything, iirc, this has already been decided a while ago. Let me see if I can find the original suggestion.

EDIT, I see that @brianrourkeboll beat me to it. Specifically, see this comment: https://github.com/dotnet/fsharp/discussions/11043#discussioncomment-345815

abelbraaksma avatar May 21 '24 19:05 abelbraaksma

@brianrourkeboll I specifically didn't mentioned async, since it's much harder problem to deal with and and! doesn't fit very natural there, while for task it does

Lanayx avatar May 21 '24 19:05 Lanayx

@abelbraaksma The use case is very common - to collect data from several sources before doing some business logic with the combined object, while those calls are not dependent one on another. So basically a performance optimization, no need to wait for callbacks sequentially

Lanayx avatar May 21 '24 19:05 Lanayx

Your example originally hot-started the tasks before the let-bang. After doing that, they cannot be parallelized anymore, depending how they were started (they may have started a child task, though, but that's not under your control). With Async (always cold-started) scheduling parallel tasks is a breeze, but with tasks (as used in F# through task {...}) it is not, because of them already potentially been started.

Not saying that it cannot be done, but there are a whole bunch of intricacies involved, and that's on top of the ones for the (far more natural) async (but that one is available in IcedTasks, it's a cool lib!).

abelbraaksma avatar May 21 '24 19:05 abelbraaksma

The use case is very common - to collect data from several sources before doing some business logic with the combined object,

Yes, it is indeed a common use case. I was merely saying that it was not clear from your example (there's nothing in there that's parallel and the text doesn't mention it). Using tasks without awaiting them is (probably) not what you are after. And hot tasks are not meant for parallelization (they are already started, after all).

The common use case has been recognized in many ways, and recently (.NET 9) has added a parallel for-each, for instance. I'm not saying we shouldn't improve (I'd love to have more parallelism support), but I don't think this approach is feasible given the current design of task (but I'm happy to be proven wrong though).

abelbraaksma avatar May 21 '24 19:05 abelbraaksma

@abelbraaksma Thanks for clarifying that, I've updated the description, probably the word parallel made a confusion

Lanayx avatar May 21 '24 19:05 Lanayx

Tx, but actually, reading "parallel" was the first time I understood what you were after (or maybe I still misunderstand, not sure now). Your example doesn't show a use-case (it's just not something you should do). Ideally, your code (and the description and intro of that code) would explain your use-case. There's nothing in your current code that requires and to be needed.

The better that example shows what you want (and why it doesn't currently work), the more people can upvote it. They won't upvote what they won't get.

abelbraaksma avatar May 21 '24 20:05 abelbraaksma

Updated the description again, hopefully it makes sense now :)

Lanayx avatar May 21 '24 20:05 Lanayx

Much better! ❤️

abelbraaksma avatar May 22 '24 18:05 abelbraaksma

Custom extension will work, however built-in implementation is beneficial:

type TaskBuilder with
    member this.MergeSources(task1: Task<'a>, task2: Task<'b>) =
        task {
            let! _ = Task.WhenAll(task1, task2)
            return (task1.Result, task2.Result)
        }

Lanayx avatar May 26 '24 00:05 Lanayx

yeah, that'll work a little bit. But keep in mind that F# tasks are hot. That means that task2 (or task1) may already have finished before calling Task.WhenAll and they cannot be scheduled on a different thread anymore. This may or may not be what you want. It won't introduce parallelism. But if it's already there, it'll work as expected.

From your original description, I think you want Async instead, so you can actually control when and how they are run, and control that they are run in parallel.

abelbraaksma avatar May 26 '24 17:05 abelbraaksma

@abelbraaksma Actually I didn't want to control where and how tasks run (only that they can be started independently), because tasks are hot and as user I only care about their completion, since I can safely assume they are already started. That's the reason why it's so natural to do this for tasks and not for asyncs.

Lanayx avatar May 26 '24 23:05 Lanayx

That's the reason why it's so natural to do this for tasks and not for asyncs.

Maybe I am missing something here, or just totally misunderstood your suggestion. Once a task starts running, you don't want to force them to run in parallel (which your original code does). But now you way that you do not want to control how tasks run, except that that's precisely what your proposed code is doing?

Sorry if I misunderstood. I'm just really confused. Either you await tasks, or you don't, but you (typically) do not want to have them run together, unless you know that parallel running is safe, and that really depends on what your code is doing.

only that they can be started independently

I don't understand this line. They are started when they are encountered by the JIT. This is why, for instance, a List<task> is such a bad idea (each task would start immediately, running concurrently), and why, in the general case, you do not want to have let myFunc (t1: task<_>) (t2: task<_>), unless you do this for delayed tasks (again, because they would run concurrently).

Do you, by any chance, just mean to use Array.Parallel or Async.Parallel? These are meant to do what you appear to describe.

Perhaps, for illustration, could you show a practical use-case? It may help me (or others) help understand what is suggested here. I don't want to dismiss a good suggestion on the grounds of me not understanding it ;).

abelbraaksma avatar May 27 '24 17:05 abelbraaksma

@abelbraaksma sorry to confuse you again, this feels so simple to me that I can't explain it in simple words. Maybe the code will speak clearer Case1:

open System.IO
let joinTwoFiles fileName1 fileName2 =
    task {
      let! content1 = File.ReadAllLinesAsync fileName1
      and! content2 = File.ReadAllLinesAsync fileName2
      return content1 + content2
    }

Case2:

let getBestPrice provider1 provider2 =
    task {
      let! prices1 = httpClient.GetFromJsonAsync<Price>(provider1)
      and! prices2 = httpClient.GetFromJsonAsync<Price>(provider2)
      return Math.Max(prices1.Value, prices2.Value)
    }

Case3:

let loadFullConfiguration getDbConfig getFileConfig =
    task {
      let! dbConfig = getDbConfig()
      and! fileConfig= getFileConfig()
      return { DbConfig = dbConfig; FileConfig = fileConfig }
    }

As you see, I'm not specifying how or where to start tasks, just calling methods returning them and awaiting the results independently (concurrently)

Lanayx avatar May 27 '24 22:05 Lanayx

Thanks. It does make your intent clearer. Imo, this should be governed by a user extension, as this is rather scary to open up. Besides, there are other suggestions with and! and tasks/async that appear to want to use it for parallelism. It can get really confusing really quickly.

But I do see the merit in the applicative nature of this proposal. It is more of a natural fit than parallelism, I guess. But people might expect this to run in parallel, or out of order, potentially breaking their app (which is why I think a user extension is the better solution: each project has its own specific use-cases for this).

But I may be completely wrong about where this should land, of course. Let others please chime in on what their ideas are, after all, this is community driven :).

PS: Just a tip: I'd advise you to update your original post regularly. People won't scroll down to read everything. Specifically your last post really shows what you actually want and may lead more people to vote for this: just replace your current examples with these for clarity.

abelbraaksma avatar Jun 01 '24 16:06 abelbraaksma

FWIW: IcedTasks does have this which you can use as a polyfill

TheAngryByrd avatar Jun 07 '24 13:06 TheAngryByrd

@abelbraaksma I think the ergonomics of having no default implementation for and! are really really awful. I would prefer a less good implementation that actually exists than not having access to one at all. This also goes for how result ce doesn't have an and! to gather up a list of errors. I'm not attached to a specific implementation, but it absolutely should exist. I don't think each project has specific use-cases for this and I think you're vastly overestimating the competency with building CE's of the average F# user.

Most consumers will not use and! it at all because it has no implementation. You can tell by going to any number of projects. In a some cases people will use @TheAngryByrd 's IcedTasks, because the implementation actually exists. I don't think people value IcedTasks enough precisely because the lack of a default implementation means people never have an opportunity to see what value it could provide.

voronoipotato avatar Jul 24 '24 18:07 voronoipotato

This proposal is what most people expect from an and! and it was also part of motivating examples when and! support is being described.

The actual implementation of task{} is duck-typed for TaskLike types, so a mere Task.WhenAll call is not sufficient. The implementation in IcedTasks seems the best fitting so far.

T-Gro avatar Jan 20 '25 15:01 T-Gro

@T-Gro given that API is already present and only TaskLike implementation is missing, does this need an RFC?

Lanayx avatar Apr 01 '25 00:04 Lanayx

For this suggestion, what matters more are the specific implementation details (e.g. to make it work for task-like and not just System.Threading.Task). It is fine to delay an RFC until (if) those questions arise from tackling the implementation.

After it is implemented, we should also have text with user guidance and examples - e.g. as diff for new text going into docs and use that for a mini RFC PR.

T-Gro avatar Apr 01 '25 09:04 T-Gro

Just to add, I have found edges with and! support that causes the "FS3511] This state machine is not statically compilable" warning. May not hit it here, but it might need to be addressed.

TheAngryByrd avatar Apr 01 '25 16:04 TheAngryByrd

I think it may need to be addressed but in the backlog sense, similar to how other CE operators have this issue at times. Important, but not urgent. I think it would deliver a lot of value even if there are scenarios where you can't use it.

voronoipotato avatar Apr 10 '25 15:04 voronoipotato

Closing the suggestion as it is now available in .NET 10 Preview 5

Lanayx avatar Jun 12 '25 20:06 Lanayx

I learned about this just a few days ago. I know it might be too late now, but I still want to express my concerns here.

Almost the same arguments expressed by myself and @dsyme as in the similar proposal for async apply here as well; we are moving towards implicit parallelism and, more importantly, a change in semantics. This goes against the spirit of the original RFC. Note that the very first two words of that RFC are “more efficient” and for good reasons. That is exactly how and! should be understood: as a more efficient way to achieve the same result as with let! bindings, without changing semantics.

Regarding arguments about pragmatism, ergonomics, fewer keystrokes, and so on: there is a straightforward and equally pragmatic solution, namely to add a separate CE for that, call it ptask or whatever. Think about it: what actual value does it add to mix these operations into the same CE?

We don’t really need to mix these operations with regular monadic computations, do we? And note that the original RFC itself provides an example of exactly this approach, using a separate CE, that is the way it should be done.

My additional concern is that by introducing this change, we may be starting down a slippery slope. In this case, the semantic change is relatively subtle, but it would set a precedent, effectively giving a green light to further departures from the original design. Remember that whatever makes it into the core is generally treated as best practice. That is why I could also agree with @abelbraaksma’s proposal to do this in a separate library. But as I said, even adding it into the core would be fine as long as it is not added into the task CE itself.

Having said that, note that it's possible (and makes sense) to add an and! keyword in the same task CE, which although might end up running tasks in parallel in most environments (due to the hot-async property of them) it will still short circuit as expected, because as a proper derived applicative it will retain the sequential nature of the original monad. So, this would make sense because it won't mix different semantics in the same CE.

gusty avatar Sep 19 '25 16:09 gusty

@gusty I don't see any semantics change, can you explain? and! in task CE doesn't run anything in parallel, rather it provides more efficient way to achive the same result as with let! bindings. The fact that tasks that it receives could already be started (actually they are in most cases) doesn't imply any parallel semantics to and!

Lanayx avatar Sep 19 '25 19:09 Lanayx

Thanks @Lanayx if that's the end goal (and forgive me if I misread the original intention) then it should be ok, actually it sounds like the case I mention in my last paragraph.

However the actual implementation goes beyond a simple optimization and changes a bit the semantics. I just did a quick test, let me share the results here:

open System.Threading
open System.Threading.Tasks

type TaskBuilder with
    member this.MergeSources(task1: Task<'a>, task2: Task<'b>) =
        task {
            let! _ = Task.WhenAll(task1, task2)
            return (task1.Result, task2.Result)
        }

exception TestException of string

let createTask isFailed delay value =
    if not isFailed && delay = 0 then Task.FromResult value
    else
        let tcs = TaskCompletionSource<_> ()
        if delay = 0 then tcs.SetException (TestException (sprintf "Ouch, can't create: %A" value ))
        else (Task.Delay delay).ContinueWith (fun _ ->
            if isFailed then tcs.SetException (TestException (sprintf "Ouch, can't create: %A" value )) else tcs.SetResult value) |> ignore
        tcs.Task

let a = task {
    let! x = createTask true  20 1
    let! y = createTask true  10 2
    return x + y
}

let d = task {
    let! x = createTask true  20 1
    and! y = createTask true  10 2
    return x + y
}

// a.Exception.InnerExceptions is seq [TestException "Ouch, can't create: 1"]
// d.Exception.InnerExceptions is seq [TestException "Ouch, can't create: 2"]

as you can see results change a bit and no matter how one can justify them, it departs from a simple optimization.

As a side note, in the F#+ library there are 2 definitions for task applicatives, one is the derived from the original monad with code optimized which is based on Task.lift2 , and it has extensive testing to prove nothing changes, then there is the non-monadic-derived one, which is based in Task.map2 so it allows them to finish in different order and collects all errors.

So if we add those implementations to the above code it gives you

...
open FSharpPlus

let a = task {
    let! x = createTask true 20 1
    let! y = createTask true  10 2
    return x + y
}

let b = applicative { // based on Task.lift2
    let! x = createTask true 20 1
    and! y = createTask true 10 2
    return x + y
}

let c = applicative' { // based on Task.map2
    let! x = createTask true 20 1
    and! y = createTask true 10 2
    return x + y
}

let d = task {
    let! x = createTask true 20 1
    and! y = createTask true 10 2
    return x + y
}

// a.Exception.InnerExceptions is seq [TestException "Ouch, can't create: 1"]  
// b.Exception.InnerExceptions is seq [TestException "Ouch, can't create: 1"] exactly the same as a 
// c.Exception.InnerExceptions is seq [TestException "Ouch, can't create: 1"; TestException "Ouch, can't create: 2"]
// d.Exception.InnerExceptions is seq [TestException "Ouch, can't create: 2"]

And if you dive into the source code and the tests code, you'll find there's a lot of attention to details, in order not to give surprising results, although the code is not the most optimal as there's still more room for improvements.

gusty avatar Sep 19 '25 21:09 gusty

@gusty thanks a lot for the examples, they make your comment much clearer, but are you only concerned about this single case where both branches fail with exception? I agree, that I didn't pay enough attention to this case, so it works like it works now "by accident", so if we wanted, we could change this behavior. And I actually I can't reproduce it, here is my setup

open System.Threading.Tasks

let runTask (delay: int) msg =
    task {
        do! Task.Delay delay
        failwith msg
        return 1
    }

[<EntryPoint>]
let main argv =
    (task {
        let! x = runTask 5000 "task 1"
        and! y = runTask 1 "task 2"
        return x + y
    }).Wait()
    0

And output:

Unhandled exception. System.AggregateException: One or more errors occurred. (task 1)
 ---> System.Exception: task 1

Lanayx avatar Sep 19 '25 22:09 Lanayx

@Lanayx you are right, I used your code (with the extension) to test it, but now I tested with the latest 10 preview and it runs as expected.

I still didn't had a look at the implementation, but at first sight it looks like it does the right thing.

I just edited my repro in order to reflect the implementation used to test. And yes, exceptions are useful to show the way both task are shortcircuit-ing.

gusty avatar Sep 24 '25 07:09 gusty