distributed-process
Improve efficiency of local message passing
I see no reason why we should copy data that's being passed between two processes on the same (local) node. So I've started experimenting with skipping some of the Network.Transport (NT) overhead by sending directly via the node controller using sendCtrlMsg instead.
Initial commit is here. With this branch installed locally, the distributed-process-platform tests still pass, but I have seen the test run bomb out with exit code 130 once, which is a bit worrying.
Following on from that commit, I'd like to see if we can skip the serialization step and just enqueue the data directly, instead of creating a new Message in which to pass it. This shouldn't be too hard, but the matching operations will still need the type fingerprint in order to handle selective receive, so I might create a type class that encapsulates the fingerprint and access to the payload, which we can use in the matching code. We can then have an instance for Message that calls decode, and another for Serializable a that just returns the enclosed data directly.
Note that sending must have the same strictness properties when sending locally as when sending remotely, so I think serialization will be necessary, although deserialization can be skipped (as long as we document the assumption that decode . encode should be the identity).
> Note that sending must have the same strictness properties when sending locally as when sending remotely
Naive question, but can we not fake that somehow without forcing serialization? It seems a massive overhead. I could not find anything suggestive of strictness constraints in "Towards Haskell ...", so I'm assuming this has to do with the strictness properties of ByteString, which stores its contents as strict Word8 arrays; or does it rather arise from the call to Data.Binary.encode?
How do the strictness properties of sending affect the semantics here? Presumably we need to ensure that send is evaluated strictly (so that the programmer knows this operation will not be deferred) but is serialization really the only way to guarantee that? There must be something we could do to provide the same semantics without actually converting the passed data structure to bytes.
Serializing most (but not necessarily all) data structures will force them to be fully evaluated (unless the corresponding 'encode' function skips part of the data structure). So, in order to maintain the same semantics between sending messages remotely and locally, you'd have to do the same thing. If you want to maintain this semantics precisely, I don't see that you have an option not to serialize; how else are you going to get the same semantics? You could add an NFData constraint and use deepseq, but now you're changing the CH API, and moreover, there is no guarantee that deepseq will have the same effect as encode.
Whether or not preserving this semantics is truly important is a separate issue.
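To make the forcing point concrete, here is a small standalone illustration (my example, not from this thread): encode walks the whole structure it serializes, whereas handing over a reference forces nothing up front.

    import Control.Exception (SomeException, evaluate, try)
    import Data.Binary (encode)
    import qualified Data.ByteString.Lazy as BSL
    import Data.Int (Int64)

    main :: IO ()
    main = do
      let payload = (1 :: Int, undefined :: Int)
      -- "Local send by reference": nothing is forced up front, so touching
      -- only the first component succeeds.
      r1 <- try (evaluate (fst payload)) :: IO (Either SomeException Int)
      print r1   -- Right 1
      -- "Remote send": forcing the encoding walks the whole structure and
      -- hits the bottom in the second component.
      r2 <- try (evaluate (BSL.length (encode payload))) :: IO (Either SomeException Int64)
      print r2   -- Left Prelude.undefined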
> Serializing most (but not necessarily all) data structures will force them to be fully evaluated
Of course. Sorry, that's my ML-addled brain not fully thinking in terms of laziness yet - I'll get there eventually.
> You could add an NFData constraint and use deepseq, but now you're changing the CH API, and moreover, there is no guarantee that deepseq will have the same effect as encode.
That doesn't sound like the way to go, unless you tie that to identity. Even then, I really don't like the idea of changing the top-level API in this way. It feels way too invasive and creates a bigger conceptual overhead for people to grok before they can work productively.
> Whether or not preserving this semantics is truly important is a separate issue.
Indeed. When you're sending locally, I can't see why this matters much in practice. If we simply pass a pointer to the data structure being sent, then it will be fully evaluated when the consumer forces it. Evaluation could presumably have a space/time effect in the consuming process's code. We do know that no unexpected side effects will take place, though, don't we?
In my opinion this is a reasonable trade-off that could simply be documented in the API so that it is apparent to consumers. This would force the programmer to decide whether or not they should forcefully evaluate the whole data structure before sending, or suck up the consequences otherwise.
Another alternative, I suppose, would be to provide sendLocal as a different/alternative primitive. Personally I don't really like that idea, as the location transparency of Erlang's ! operator is a huge benefit to the programmer and I'd be reluctant to lose that.
... Or ... we could provide an alternative sendLocal that does serialize, for those who wish to do this deliberately. I'm struggling to imagine who would choose to use such a primitive or why, which seems to indicate that, from a developer's perspective, serializing local sends by default is not the intuitive behaviour one would expect.
> Another alternative, I suppose, would be to provide sendLocal as a different/alternative primitive. Personally I don't really like that idea, as the location transparency of Erlang's ! operator is a huge benefit to the programmer and I'd be reluctant to lose that.
The thing is that if you don't serialize the data structure you don't have location transparency, because the program might alter its behaviour quite significantly depending on whether data structures get forced or not. Indeed, you already say this: "This would force the programmer to decide whether or not they should forcefully evaluate the whole data structure before sending" -- which is not something that's currently required, and something that's never required when the data is serialized (and we have no way around that). In other words, if you want location transparency, you need to do the same thing in both cases.
In serious applications it is important to control when data structures get forced; it seems rather dangerous to have this depend on network topology.
I honestly don't know what the right approach is here.
> In serious applications it is important to control when data structures get forced; it seems rather dangerous to have this depend on network topology.
> I honestly don't know what the right approach is here.
Hmn, this is an aspect of laziness that I hadn't fully considered. Because of this consideration, I guess that by default we should make sure that local sends go through serialization, though as @simonmar points out we don't actually have to deserialize. I'm guessing what he means by this is that we only have to call encode but we do not need to actually use the resulting ByteString - so we can just pass the original pointer we received because the call to encode will have walked the whole structure and forced any thunks to be evaluated.
Now, in terms of actually passing a Serializable a => a instead of a ByteString, I was experimenting and this is what I came up with so far.
    data LocalMessage :: * where
      LocalMessage :: forall a. Serializable a => a -> LocalMessage

    class Deliverable a where
      typeFingerprint :: a -> Fingerprint
      payload         :: Serializable b => a -> Maybe b

    instance Deliverable Message where
      typeFingerprint = messageFingerprint
      payload         = reify

    reify :: forall a. Serializable a => Message -> Maybe a
    reify m =
      case messageFingerprint m == fingerprint (undefined :: a) of
        True -> Just decoded
        _    -> Nothing
      where decoded :: a
            !decoded = decode (messageEncoding m)

    instance Deliverable LocalMessage where
      typeFingerprint (LocalMessage m) = fingerprint m
      payload = reifyLocal

    reifyLocal :: forall a. Serializable a => LocalMessage -> Maybe a
    reifyLocal (LocalMessage m) =
      case fingerprint m == fingerprint (undefined :: a) of
        True -> Just unpack
        _    -> Nothing
      where unpack :: a
            -- this binding does not typecheck: the existentially
            -- quantified m cannot be given the type a (see below)
            !unpack = m
Presumably I'd have to add a call to encode in reifyLocal - that's easy enough.
I can't seem to figure out how to make the existential work so that the Typeable comes into scope. My plan was to change node to initialize the CQueue to handle either Message or LocalMessage, and to use the Deliverable type class in the match implementations. Does that sound like a reasonable approach? And how can I make a concrete type of CQueue a when all I know about the type parameter is that it's either Message or LocalMessage, the latter being an existential? I'm afraid my understanding of existentials is coming up short here, which is frustrating, because I can see how to implement this otherwise.
You like your type classes, don't you :-) I would consider changing Message itself.
    data Message = EncodedMessage Fingerprint ByteString
                 | forall a. Typeable a => UnencodedMessage a
or something along those lines.
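For what it's worth, here is a sketch of how matching could work against that representation: pattern matching on UnencodedMessage brings the existential's Typeable dictionary into scope, so Data.Typeable's cast can play the role that the Fingerprint comparison plays for encoded messages. (unwrapMessage is an illustrative name, and the import path for Serializable/fingerprint is my assumption.)

    {-# LANGUAGE ExistentialQuantification #-}
    {-# LANGUAGE ScopedTypeVariables #-}
    import Data.Binary (decode)
    import Data.ByteString.Lazy (ByteString)
    import Data.Typeable (Typeable, cast)
    -- Serializable, Fingerprint and fingerprint as used earlier in this thread:
    import Control.Distributed.Process.Serializable (Serializable, Fingerprint, fingerprint)

    data Message = EncodedMessage Fingerprint ByteString
                 | forall a. Typeable a => UnencodedMessage a

    unwrapMessage :: forall a. Serializable a => Message -> Maybe a
    unwrapMessage (EncodedMessage fp enc)
      | fp == fingerprint (undefined :: a) = Just (decode enc)
      | otherwise                          = Nothing
    unwrapMessage (UnencodedMessage x)     = cast x  -- type-safe, no Fingerprint needed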
Oooh - that's cool. I had no idea that was even possible. Don't really know much about existentials yet, so working on CH is turning out to be a bit of a baptism by fire. :)
I'll have a go at using that construct instead - it's much nicer than what I was attempting.
Cheers!
> I'm guessing what he means by this is that we only have to call encode but we do not need to actually use the resulting ByteString
Yes, but remember that you will need to force that ByteString or still nothing happens :)
> Yes, but remember that you will need to force that ByteString or still nothing happens :)
Cripes yes, I would've forgotten that if you hadn't said something. Cheers! :D
And since this is a lazy bytestring, you will need to force the entire thing. Something like
    let encoded = encode a in length encoded `seq` UnencodedMessage a
or something like that. This is the hardest part about writing serious Haskell applications -- I don't want to tell you how many subtle laziness bugs I've had to fix in Cloud Haskell so far, or you might lose all confidence in me :)
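Wrapped up as a smart constructor, the suggestion might look like this (a sketch building on the Message variant above; unencodedMessage is an illustrative name, not an existing API):

    import Data.Binary (encode)
    import qualified Data.ByteString.Lazy as BSL

    -- Serialize purely for the forcing side effect, discard the bytes,
    -- and ship the original value by reference.
    unencodedMessage :: Serializable a => a -> Message
    unencodedMessage a =
      let encoded = encode a
      in BSL.length encoded `seq` UnencodedMessage a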
> I don't want to tell you how many subtle laziness bugs I've had to fix in Cloud Haskell so far, or you might lose all confidence in me :)
He he - I'm just hoping to make sure none of my contributions cause any major problems! I think as we increase the test coverage we'll get more confidence about semantics-affecting bugs, though whether we remain space/time efficient will require manual benchmarking, I guess.
Thanks for the pointer anyway - I'll go with your suggestion for now. Is there any way that, in a test case, using GHC APIs, we can determine whether or not something is fully evaluated on receipt?
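(One possible avenue, offered as an assumption rather than anything confirmed in this thread: the ghc-heap-view package exposes GHC.AssertNF, which inspects the heap representation directly.)

    import Control.DeepSeq (force)
    import Control.Exception (evaluate)
    import GHC.AssertNF (isNF)  -- from ghc-heap-view (assumed available)

    main :: IO ()
    main = do
      let xs = map (+ 1) [1 :: Int, 2, 3]
      isNF xs >>= print       -- False: xs is still an unevaluated thunk
      _ <- evaluate (force xs)
      isNF xs >>= print       -- True: the thunk was updated in place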
Note to self: need to update postAsMessage and other auxiliary capabilities in Node.hs before merging this.
Hmn, there is some unpleasantness to deal with when implementing this. The channel send operations would also have to be modified in order to maintain the stronger ordering semantics we want. This probably needs a bit more thought.