devp2p

ABNF packets

Open decanus opened this issue 5 years ago • 7 comments

At vac we've started defining packets using ABNF. It might make sense to do this in the devp2p specifications too: https://specs.vac.dev/specs/waku/waku.html#abnf-specification

decanus avatar Feb 28 '20 14:02 decanus

Yes! That's a very nice idea. We should use that for all specs. Maybe we could even define a nice RLP extension for ABNF and add it as a meta spec.

fjl avatar Mar 03 '20 14:03 fjl

@fjl would be fun to work on that, am happy to help. I made the demo PR mainly to show.

decanus avatar Mar 03 '20 17:03 decanus

Currently wondering whether it makes sense for RLP at all, or only for packets.

decanus avatar Mar 06 '20 22:03 decanus

If there is a neat way to describe RLP with ABNF, let's go for it. Right now we use [ x, y, ... ] notation for RLP lists, and the square brackets mean recursive encoding. This notation works most of the time, but cannot describe the cases where we want to concatenate multiple RLP-encoded values. I used notation like rlp_bytes(x) for that (see here), but it doesn't look nice.
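
To illustrate the gap described above, here is a minimal Python sketch of the difference between an RLP list `[x, y]` and the concatenation of two separately encoded values. The function names `rlp_encode` and `_length_prefix` are hypothetical and not taken from any devp2p implementation; the encoder only handles byte strings and nested lists.

```python
def _length_prefix(length: int, offset: int) -> bytes:
    """Build the RLP header for a payload of `length` bytes."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode a byte string or a (nested) list of byte strings."""
    if isinstance(item, bytes):
        # A single byte below 0x80 is its own encoding.
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _length_prefix(len(item), 0x80) + item
    if isinstance(item, list):
        # Lists are encoded recursively, then wrapped in a list header.
        payload = b"".join(rlp_encode(element) for element in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError("only bytes and lists are supported in this sketch")


x, y = b"cat", b"dog"

# [x, y] in the bracket notation: one list header wrapping both items.
as_list = rlp_encode([x, y])

# rlp_bytes(x) || rlp_bytes(y): two encodings back to back, no list header.
concatenated = rlp_encode(x) + rlp_encode(y)

print(as_list.hex())       # c88363617483646f67
print(concatenated.hex())  # 8363617483646f67
```

The only difference between the two outputs is the leading list header byte, which is exactly the case the bracket notation cannot express.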

fjl avatar Mar 08 '20 15:03 fjl

@fjl might make sense to first describe the packets and then attempt to do RLP later on. Would be a good first step imo.

decanus avatar Mar 08 '20 15:03 decanus

Yes, sounds good to me. Maybe pick one of the specs and convert it to ABNF so we can see what that looks like.

fjl avatar Mar 09 '20 19:03 fjl

I've looked at a bunch of ways to describe binary data over the last couple of weeks, and the option I liked the most is the notation used in the QUIC spec drafts: https://quicwg.org/base-drafts/draft-ietf-quic-transport.html#section-1.3

This notation works great for binary layout descriptions. I like it because it's very 'vertical', unlike the ABNF variants, which are closer to the 'concatenation formula' style we have now. Example:

Example Structure {
  One-bit Field (1),
  7-bit Field with Fixed Value (7) = 61,
  Field with Variable-Length Integer (i),
  Arbitrary-Length Field (..),
  Variable-Length Field (8..24),
  Field With Minimum Length (16..),
  Field With Maximum Length (..128),
  [Optional Field (64)],
  Repeated Field (8) ...,
}
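
As a rough illustration of how such a description maps to parsing code, here is a Python sketch that reads only the first three fields of the example structure. It assumes the `(i)` field uses QUIC's variable-length integer encoding, where the two most significant bits of the first byte select a length of 1, 2, 4, or 8 bytes. The function names are hypothetical.

```python
def decode_varint(data: bytes, offset: int = 0):
    """Decode a QUIC-style variable-length integer.

    Returns (value, next_offset).
    """
    first = data[offset]
    length = 1 << (first >> 6)   # 1, 2, 4 or 8 bytes in total
    if offset + length > len(data):
        raise ValueError("truncated variable-length integer")
    value = first & 0x3F         # remaining 6 bits of the first byte
    for byte in data[offset + 1 : offset + length]:
        value = (value << 8) | byte
    return value, offset + length


def parse_example_prefix(data: bytes):
    """Parse 'One-bit Field (1)', '7-bit Field (7) = 61' and the (i) field."""
    if not data:
        raise ValueError("empty input")
    one_bit = data[0] >> 7
    fixed = data[0] & 0x7F
    if fixed != 61:
        raise ValueError("7-bit field must have the fixed value 61")
    varint, offset = decode_varint(data, 1)
    # The remaining fields have context-dependent lengths and are left unparsed.
    return one_bit, fixed, varint, data[offset:]


packet = bytes([0x80 | 61]) + bytes.fromhex("7bbd") + b"rest"
print(parse_example_prefix(packet))  # (1, 61, 15293, b'rest')
```

Each line of the structure turns into one parsing step, which is what makes the 'vertical' layout convenient to implement from.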

While this works for many things, we still need to keep the formula-style notation for crypto pseudocode.

We also still need a sane way to describe RLP structures. The notation we currently use for RLP looks like this:

x = [list-elem, list-elem, [sublist-elem, ...]]

It looks very clean, but it is always a bit of a challenge to use because there is no good way to put type or size information into it. We also have an RLP notation with types, which we use in the eth protocol spec:

hello = [protocolVersion: P, networkId: P, td: P, bestHash: B_32, genesisHash: B_32, forkID]

But that one was never formally described anywhere and I don't even remember what all the letters mean.
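
For illustration only, here is a Python sketch of what a machine-checkable version of the typed notation could look like. Since the letters were never formally described, the meanings used here are assumptions: `P` is treated as a non-negative integer encoded as big-endian bytes without leading zeros, and `B_32` as a byte string of exactly 32 bytes. `forkID` carries no type in the notation, so it is passed through unchanged. The names `HELLO_SCHEMA`, `int_to_scalar`, and `to_rlp_items` are hypothetical.

```python
# Field names and type letters copied from the notation above.
# The interpretation of "P" and "B_32" is an assumption of this sketch.
HELLO_SCHEMA = [
    ("protocolVersion", "P"),
    ("networkId", "P"),
    ("td", "P"),
    ("bestHash", "B_32"),
    ("genesisHash", "B_32"),
    ("forkID", None),  # untyped in the notation, passed through as-is
]


def int_to_scalar(number: int) -> bytes:
    """Big-endian bytes without leading zeros; zero becomes the empty string."""
    if number < 0:
        raise ValueError("scalar must be non-negative")
    return number.to_bytes((number.bit_length() + 7) // 8, "big")


def to_rlp_items(schema, values: dict) -> list:
    """Check values against the schema and return a list ready for RLP encoding."""
    items = []
    for name, kind in schema:
        value = values[name]
        if kind == "P":
            items.append(int_to_scalar(value))
        elif kind == "B_32":
            if not isinstance(value, bytes) or len(value) != 32:
                raise ValueError(f"{name} must be exactly 32 bytes")
            items.append(value)
        else:
            items.append(value)
    return items


hello = to_rlp_items(HELLO_SCHEMA, {
    "protocolVersion": 64,
    "networkId": 1,
    "td": 0,
    "bestHash": bytes(32),
    "genesisHash": bytes(32),
    "forkID": [b"\x01\x02\x03\x04", b""],  # placeholder value
})
```

A schema like this keeps the compact field list while making the type and size constraints explicit, which is the information the untyped bracket notation cannot carry.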

fjl avatar Jun 04 '20 15:06 fjl