parallel
parallel copied to clipboard
Generalize
Generalize most functions from Strategy a to a new Strategy' a b. This didn't actually require any changes to terms, but it gives a much better fusion story. Strategy' is a profunctor (Kleisli Eval) as well as a semigroupoid under dot. Moreover, the extra parametricity makes it easier to see that strategies are implemented correctly—there's no way to forget to apply a Strategy' if you need its result.
Closes #27
I'm not convinced by this. Strategies are supposed to do evaluation only, and it's pretty clear that a strategy that returns a different type is doing more than evaluation!
Moreover, a change this fundamental would break all the Straegies examples in my book. That doesn't necessarily mean we shouldn't ever evolve the API if we have good reason to, I'm just not seeing a strong motivation here.
I'm sympathetic to your claim that strategies are only supposed to do evaluation. However, I think you closed off that possibility when you made Strategy a type synonym rather than a newtype. More to the point, I don't personally think that's a really good justification for a restriction that makes (manually) fusing operations so hard. That seems awfully hard to justify when I can, today, without doing anything strange, write a Strategy that does more than evaluation! Could you explain how this change "breaks" examples in your book? I believe they should all work precisely as they did before.
A conservative approach: move Control.Parallel.Strategies to Control.Parallel.Strategies.FillInTheBlank, then add a module of monomorphizing functions as Control.Parallel.Strategies.
This is a very old library with some historical decisions that we would probably make differently today. Strategy being a type synonym rather than a newtype is one of those (I think newtype probably didn't even exist when the library was first written).
Perhaps this change wouldn't break code, but it breaks the docs - the types in the docs now don't match the types in the book, which is likely to cause confusion. Especially because it changes things in a seemingly fundamental way - now a Strategy doesn't have to return the original value, it can return something different. I'm not even sure I understand the impact of that change on how to write parallel code - the point of a Strategy returning the original object is so that the sparks don't get garbage-collected. How does that work if you return a different value?
That should work just fine with spark#, as long as you eventually use the result.
Strategy' a b is "a way to compute a b from an a without any mutation". It's really a lot like a -> b, but it can have explicit computational staging and parallelism. I think it's actually easier to avoid mistakes. Arbitrary functions can be stuffed into the Strategy' a b, so an entire computation can be written using a single withStrategy call. There's no need to exit the strategic context and risk re-entering with the wrong copy of a value.
I wonder if it might be helpful for us to discuss this question in real-time. Do you think you'll have time to do so in the next week or two?
I think I understand your point that there are useful strategies you can write with this that you couldn't before, and that some things work out more nicely. However, there are also more bad strategies that you can write with this that you couldn't before, e.g. strategies that spark things and then return (). When we changed the form of strategies from a -> () functions to a -> a functions it was to make it possible to return the sparked closures so that the sparks don't get GC'd, see the paper for details.
The whole question of correctness is messy because it's clearly possible to screw up with both the current formulation and your generalized version. However, I do think that the a -> a formulation best captures the spirit of what strategies are intended to be: functions that traverse and evaluate/spark data structures only. (having said that I personally think that strategies are not the best way to express parallelism because they require the programmer to understand too much about evaluation order, but that's orthogonal to this discussion)
I'd be OK with adding another module for your generalised API so long as we can keep the existing Control.Strategies API the same, with the same types. Would that be OK? Happy to discuss this in person next week if you want.
I'm totally okay with using a separate module for the underlying more-polymorphic version. Unfortunately, it's quite easy to goof up regardless:
do
strat a
return a
is an ever-present concern. Perhaps linear types will help eventually.
I'm not convinced by this. Strategies are supposed to do evaluation only, and it's pretty clear that a strategy that returns a different type is doing more than evaluation!
Moreover, a change this fundamental would break all the Straegies examples in my book. That doesn't necessarily mean we shouldn't ever evolve the API if we have good reason to, I'm just not seeing a strong motivation here.
I agree with Simon here.
Do you perhaps have an example where this would be benefitial?