datproject-discussions
Investigating message abstraction layer implementation options
(NOTE This showcase is part 2b of Positioning, vision and future direction of the Dat Project)
Before reading on: These are just initial thoughts, any feedback is greatly appreciated!
Options
The currently preferred approach is to leave hypercore alone and write the message abstraction layer on top of it. Presumably the biggest concern here is:
- retain backwards compatibility, avoid breaking changes
I haven't worked with protocol-buffers, but looking at the definition of schema.proto in hypercore-protocol and holding that against the vert.x approach to messaging, I see the following options:
1. Frames only, with existing messages redefined as Frame Types
   1a. Only a subset of existing messages are actual Frame Types
2. Just add a single Message message to the existing ones defined
3. Don't touch hypercore-protocol or hypercore, implement on top of them
1. Everything is a Frame
In this setup:
- at root level there are only Frame messages
- existing messages are redefined as FrameType
- Frame.type indicates purpose, determines Body semantics
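To make the "Frame.type determines Body semantics" idea concrete, here is a minimal sketch of a receiver that picks a handler by frame type. The type numbering mirrors the numbering used in this proposal; the `createDispatcher` helper and the body fields (`start`, `length`, `index`) are purely illustrative assumptions, not an existing hypercore API.

```javascript
// Frame type codes, numbered as in this proposal (hypothetical constant names).
const FrameType = Object.freeze({
  FEED: 0, HANDSHAKE: 1, INFO: 2, HAVE: 3, UNHAVE: 4,
  WANT: 5, UNWANT: 6, REQUEST: 7, CANCEL: 8, DATA: 9
});

// Build a dispatcher: one handler per frame type, selected by frame.type.
function createDispatcher(handlers) {
  return function dispatch(frame) {
    const handler = handlers[frame.type];
    if (typeof handler !== 'function') {
      throw new Error('No handler for frame type ' + frame.type);
    }
    // Only the handler knows how to interpret frame.body.
    return handler(frame.header, frame.body);
  };
}

// Body fields below are illustrative assumptions.
const dispatch = createDispatcher({
  [FrameType.HAVE]: (header, body) => `have ${body.start}+${body.length}`,
  [FrameType.DATA]: (header, body) => `data #${body.index}`
});

console.log(dispatch({ type: FrameType.DATA, header: {}, body: { index: 7 } }));
```

A frame whose type has no registered handler is rejected explicitly, which is one place where backwards-compatibility handling could hook in.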
Impact (first impression):
- hypercore-protocol
  - update schema.proto (small)
  - update specification / design docs (medium)
- hypercore-messaging (tiny satellite module)
  - message creation (small)
  - message handling (medium)
- hypercore:
  - integrate hypercore-messaging (small)
  - refactor to deal with Frames (small)
  - API extension for dealing with header, body (small)
  - (optional) handle backwards-compatibility (medium)
- hyperdrive (for example):
  - refactor to messaging, incorporate hypercore API changes (medium)
  - (i.e. file / chunk logic creates File / Chunk messages of frame type Data)
Pros:
- messaging is natively supported, a core concept
- no backwards version-compatibility issues in hypercore-protocol. Two ways to avoid them:
  - have existing (old) message definitions at root level or import them, and just add Frame
  - have 2 .proto files and define a hypercore.proto.messaging package namespace in one
- easier to ensure / guarantee interoperability of decentralized apps
- steers implementers, broader community to best-practice approach regarding messaging
- (would be easy to write a bridge and plug into the polyglot vert.x ecosystem, gain access to the JVM)
Cons:
- not all existing messages may be good candidate frame types (see option 1a)
- backwards-compatibility still requires handling in downstream projects (best candidate is hypercore)
The schema.proto may look something like this:
// add a package name to distinguish this from the old format, which must still be supported for a while
package hypercore.proto.messaging;

// or keep original messages at root level, retain backwards compatibility with one .proto
// alternatively the old specification format can be imported

message Frame {
  // type=0, should be the first message sent on a channel
  message Feed { ... }
  // type=1, overall connection handshake. should be sent just after the feed message on the first channel only
  message Handshake { ... }
  // type=2, message indicating state changes etc.
  message Info { ... }
  // type=3, what do we have?
  message Have { ... }
  // type=4, what did we lose?
  message Unhave { ... }
  // type=5, what do we want? remote should start sending have messages in this range
  message Want { ... }
  // type=6, what don't we want anymore?
  message Unwant { ... }
  // type=7, ask for data
  message Request { ... }
  // type=8, cancel a request
  message Cancel { ... }
  // type=9, get some data
  message Data { ... }

  // enum values are upper-cased here because protobuf rejects enum values
  // that share a name with the nested messages above
  enum FrameType {
    FEED = 0; // the first message, also default enum value
    HANDSHAKE = 1;
    INFO = 2;
    HAVE = 3;
    UNHAVE = 4;
    WANT = 5;
    UNWANT = 6;
    REQUEST = 7;
    CANCEL = 8;
    DATA = 9;
  }

  required FrameType type = 1;

  // either define a single header format, or support multiple alternatives in 'oneof' construct, e.g.
  //
  // - DatDefaultHeaderFormat (default format holding only dat-supported attributes)
  // - KeyValueHeaderFormat (user-extensible map of header attributes)
  // - CustomHeaderFormat (e.g. community-contributed JsonSchemaHF, JsonLdHF, etc.)
  message Header { ... }

  // probably include some more Frame fields here

  // the body payload that depends on the frame type
  // (field numbers are illustrative only; protobuf field numbers must start at 1)
  oneof body {
    Feed feed = 10;
    Handshake handshake = 11;
    Info info = 12;
    Have have = 13;
    Unhave unhave = 14;
    Want want = 15;
    Unwant unwant = 16;
    Request request = 17;
    Cancel cancel = 18;
    Data data = 19; // maybe rename to Message, or Payload
  }
}
Notes:
- field changes wrt current messages may make sense (e.g. promoting to Frame level, removing)
- Data.value would be where the message body is (may be defined as type Any)
- Body payload field layouts must be unique for each frame type for oneof to work (presumably)
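As a rough illustration of the Frame layout (type, header, body), the sketch below round-trips a frame through a byte buffer. It uses JSON as a stand-in codec purely to stay self-contained; the real proposal would use the protobuf schema. The names `encodeFrame` and `decodeFrame` and the `content-type` header key are assumptions made for this example.

```javascript
// Stand-in codec: JSON instead of protobuf, only to illustrate the Frame layout.
function encodeFrame(type, header, body) {
  if (!Number.isInteger(type) || type < 0) {
    throw new TypeError('type must be a non-negative integer');
  }
  return Buffer.from(JSON.stringify({ type, header: header || {}, body }), 'utf8');
}

function decodeFrame(buf) {
  const frame = JSON.parse(buf.toString('utf8'));
  if (!Number.isInteger(frame.type)) {
    throw new Error('frame is missing a type');
  }
  return frame;
}

const DATA = 9; // frame type number taken from the schema sketch in this proposal

// The header key and the body shape are hypothetical.
const wire = encodeFrame(DATA, { 'content-type': 'text/plain' }, { value: 'hello' });
const frame = decodeFrame(wire);

console.log(frame.type, frame.header, frame.body.value);
```

The point is only that header and body travel together in one self-describing unit, so a receiver can read the header before it decides how to treat the body.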
1a - Some Frame Types, some message types
Option 1 may be a very naive design, as it assumes all current message types are natural candidate Frame Types, however:
- Some (or all) might be implemented as message types instead, using frame type Data
  - E.g. Handshake, Info, Cancel
- Maybe some (or all) are not suitable to serve as frame type
Looking at vert.x messaging, it has only 4 types:
- send to send a message to an address
- publish to publish a message to an address
- register to subscribe to the messages sent or published to an address
- unregister to unsubscribe from the messages sent or published to an address
Looking at this, vert.x slices it completely differently than the current hypercore-protocol. I need more time studying Dat's inner workings to say anything sensible here, so your feedback can help!
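For readers unfamiliar with this model, the four operations can be sketched as a tiny in-memory bus. This is not vert.x code and not its API; `MiniBus` is a hypothetical class that only mimics the semantics: publish delivers to every handler on an address, while send delivers to exactly one of them (round-robin here).

```javascript
// Hypothetical in-memory bus that mimics the four vert.x-style operations.
class MiniBus {
  constructor() {
    this.handlers = new Map(); // address -> array of handler functions
    this.cursor = new Map();   // address -> next index for round-robin send
  }

  register(address, handler) {
    if (!this.handlers.has(address)) this.handlers.set(address, []);
    this.handlers.get(address).push(handler);
  }

  unregister(address, handler) {
    const list = this.handlers.get(address) || [];
    const i = list.indexOf(handler);
    if (i !== -1) list.splice(i, 1);
  }

  // Deliver to every handler registered on the address.
  publish(address, message) {
    const list = this.handlers.get(address) || [];
    list.slice().forEach((h) => h(message));
    return list.length;
  }

  // Deliver to exactly one handler, rotating between them.
  send(address, message) {
    const list = this.handlers.get(address) || [];
    if (list.length === 0) return false;
    const i = (this.cursor.get(address) || 0) % list.length;
    this.cursor.set(address, i + 1);
    list[i](message);
    return true;
  }
}

const bus = new MiniBus();
const seenA = [];
const seenB = [];
const a = (m) => seenA.push(m);
const b = (m) => seenB.push(m);

bus.register('news', a);
bus.register('news', b);
bus.publish('news', 'p1'); // reaches both handlers
bus.send('news', 's1');    // reaches one handler
bus.send('news', 's2');    // reaches the other handler
bus.unregister('news', b);
bus.publish('news', 'p2'); // reaches only the remaining handler
```

Note that none of the four operations says anything about what is being exchanged; that is left entirely to the message itself, which is the main contrast with the ten message types above.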
First thoughts:
- having an address at frame level like vert.x may obviate the need for Handshake
- an address need not be directional (dat-url), it can be a topic to which you can pub / sub
- handshake information can be placed in any / every Frame by means of the Header
- self-contained frames make the protocol more robust, e.g. in handling broken pipes, network issues
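A minimal sketch of what "self-contained" could mean in practice: every frame carries its address plus the handshake-style information in its header, so a receiver can accept or reject it without remembering an earlier Handshake. All field names here (`protocolVersion`, `peerId`) and both helper functions are assumptions for illustration; the proposal does not fix a header format.

```javascript
// Hypothetical header fields; the proposal leaves the header format open.
function makeFrame(address, body, session) {
  return {
    address,
    header: { protocolVersion: session.protocolVersion, peerId: session.peerId },
    body
  };
}

// Stateless check: no earlier Handshake frame is needed to judge this frame.
function acceptFrame(frame, supportedVersion) {
  const header = frame.header || {};
  if (typeof frame.address !== 'string' || frame.address.length === 0) {
    return { ok: false, reason: 'missing address' };
  }
  if (header.protocolVersion !== supportedVersion) {
    return { ok: false, reason: 'unsupported protocol version' };
  }
  return { ok: true, peerId: header.peerId };
}

const session = { protocolVersion: 1, peerId: 'peer-a' };
const good = makeFrame('topic/updates', { text: 'hi' }, session);
const stale = makeFrame('topic/updates', { text: 'hi' }, { protocolVersion: 0, peerId: 'peer-b' });

console.log(acceptFrame(good, 1));
console.log(acceptFrame(stale, 1));
```

The trade-off is some repeated bytes per frame in exchange for not depending on connection state, which is what helps after a broken pipe or reconnect.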
[TODO What would be missing if adopting the vert.x way with only the 4 frame types?]
Impact:
- same as option 1, except additional effort downstream for each frame type removed / abstracted away
Pros / Cons:
- same as option 1
- simplified protocol, more flexibility
2 - Add a Message to the mix
In this option the schema stays as it is now, with the only addition being a Message message type.
The message would have a Header and Body and some other fields, just like Frame.
Pros / Cons / Impact:
- similar to option 1
- more moving parts, less consistency in protocol
- potentially more handling downstream, less interoperability
3 - Layered on top of hypercore
Currently this option is favoured by both @mafintosh and @joehand. But to me this seems to be the approach with the most downsides.
In this setup:
- both hypercore-protocol and hypercore remain untouched
- hypercore-messaging satellite module provides message creation + handling logic
  - module design virtually identical to the one described in option 1
- hypercore-messaging is incorporated by downstream modules
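A rough sketch of what this layering could look like. `MemoryFeed` is a synchronous stand-in for an append-only log and is NOT hypercore's real API; `Messaging` is a hypothetical shape for the hypercore-messaging module, using a JSON envelope only to keep the example self-contained.

```javascript
// Stand-in for an append-only log; NOT hypercore's real API.
class MemoryFeed {
  constructor() { this.blocks = []; }
  append(buf) { this.blocks.push(buf); return this.blocks.length - 1; }
  get(index) { return this.blocks[index]; }
  get length() { return this.blocks.length; }
}

// Hypothetical hypercore-messaging wrapper: the feed only ever sees opaque bytes.
class Messaging {
  constructor(feed) { this.feed = feed; }

  // Wrap header and body in an envelope and append it as one block.
  write(header, body) {
    const envelope = JSON.stringify({ header, body });
    return this.feed.append(Buffer.from(envelope, 'utf8'));
  }

  // Read a block back and unwrap the envelope.
  read(index) {
    const buf = this.feed.get(index);
    if (!buf) throw new RangeError('no block at index ' + index);
    return JSON.parse(buf.toString('utf8'));
  }
}

const feed = new MemoryFeed();
const messaging = new Messaging(feed);
const index = messaging.write({ topic: 'greetings' }, { text: 'hello' });
const message = messaging.read(index);

console.log(index, message.header.topic, message.body.text);
```

The feed stays unaware of messages, so any downstream module that reads the feed without the wrapper just sees bytes. That is the freedom this option offers, and also where the duplicated handling logic listed under the cons comes from.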
Pros:
- freedom, use messaging or do not
- (I can't think of more pros)
Cons:
- messaging is not central concept of Dat, but an addon
- (some) handling logic must be duplicated in all downstream modules
- easier to make mistakes, incorporate messaging incorrectly
- freedom leads to fragmentation, incompatible application designs
- requires more effort, support to guide and steer the community
- adoption of messaging layer in the ecosystem may be much slower
--
Previous part: Design of message-based abstraction layer on top of hypercore
Next part: Optimizing traction and exposure