swift-async-algorithms icon indicating copy to clipboard operation
swift-async-algorithms copied to clipboard

Remove `AsyncIterator: Sendable` requirement from merge

Open FranzBusch opened this issue 3 years ago • 4 comments
trafficstars

Motivation

Currently a lot of the operator implementations in here that consume other AsyncSequences require the AsyncIterator to be Sendable. This is mostly due to the fact that we are calling makeAsyncIterator on the upstream AsyncSequence and then pass that iterator around to various newly spawned Tasks. This has two downsides:

  1. It only allows users to use operators like merge if their AsyncSequence.AsyncIterator is Sendable
  2. In merge we are creating new Tasks for every new demand. Creating Tasks is not cheap.

My main goal of this PR was to remove the Sendable constraint from merge.

Modification

This PR overhauls the complete inner workings of the AsyncMerge2Sequence. It does a couple of things:

  1. The main change is that instead of creating new Tasks for every demand, we are creating one Task when the AsyncIterator is created. This task has as child task for every upstream sequence.
  2. When calling next we are signalling the child tasks to demand from the upstream
  3. A new state machine that is synchronizing the various concurrent operations that can happen
  4. Handling cancellation since we are creating a bunch of continuations.

Result

In the end, this PR swaps the implementation of AsyncMerge2Sequence and drops the Sendable constraint and passes all tests. Furthermore, on my local performance testing I saw up 50% speed increase in throughput.

Open points

  1. I need to make this sequence re-throwing but before going down that rabbit whole I wanna get buy-in on the implementation.
  2. We should discuss and document if merge and other operators are hot or cold, i.e. if they only request if they got downstream demand
  3. I need to switch AsyncMerge3Sequence over to the same iplementation

FranzBusch avatar Aug 10 '22 11:08 FranzBusch

Switching this from CheckedContinuation to UnsafeContinuation is improving the performance by a lot:

Current main: 1x baseline This PR with Checked: 1.5x This PR with Unsafe: 3.5x

FranzBusch avatar Aug 10 '22 12:08 FranzBusch

I'm out on vacation this week, but a few key points - merge (like all other algorithms) should only request the next value upon need/demand, I had a similar but stale branch of collapsing the tasks to gain some perf but the problem became the collapse versus the rethrow (long/short of it was that the pseudo cancellation of a throw put a gnarly wrench in the works for the collapse part). All that being said; I think it can be done - just hadn't gotten around to it just yet, there are a number of related algorithms that are effectively a composition of merge that we could improve so this is definitely something we should dig into.

phausler avatar Aug 10 '22 13:08 phausler

@phausler Yeah no rush on this! Enjoy your vacation and we can talk afterwards. I just wanted to put this up for discussion. I agree that re throwing is kinda coming the way but I am seeing up to 3.5 times performance improvements with this. However, performance is no the main driver, it really is the Sendable annotation. Let's discuss this once you are back!

FranzBusch avatar Aug 10 '22 15:08 FranzBusch

One required item for merging this: it needs to definitely handle rethrowing cases (and we should validate that is true)

phausler avatar Aug 30 '22 17:08 phausler

@phausler I just updated the PR again. To summarise what changed from your initial review:

  • My implementation supports rethrowing now
  • I split up the implementation into separate files to aid with readability
  • I swapped merge3 to also make use of the new state machine
  • I renamed the internal upstreamThrew case to upstreamFailure
  • I added more tests validating various edge cases
  • Aligned precondition messages

I replied to some of your feedback inline but I haven't changed the following:

  • Use ManagedCriticalState: I can't adopt this since I need to be able to manually lock and unlock in the case when I need to suspend on next()
  • Remove preconditions: I went through the code again and checked these preconditions and I wasn't able to come up with a test to break them. There might still be an edge case I haven't thought about, but I would rather let this crash then to transition into an unknown state.
  • Line breaks: I find them easier to read but if you want me to use a single line I am happy to change (maybe it would be good to get swift format setup in this repo?)

FranzBusch avatar Sep 09 '22 14:09 FranzBusch

@swift-ci please test

phausler avatar Sep 13 '22 20:09 phausler

@phausler Just rebased this PR. Feel free to merge it if you are happy with it!

FranzBusch avatar Oct 05 '22 13:10 FranzBusch