ABNF packets
At vac we've started defining packets using ABNF. It might make sense to do this in the devp2p specifications too: https://specs.vac.dev/specs/waku/waku.html#abnf-specification
Yes! That's a very nice idea. We should use that for all specs. Maybe we could even define a nice RLP extension for ABNF and add it as a meta spec.
@fjl that would be fun to work on, and I'm happy to help. I made the demo PR mainly to show what it could look like.
Currently wondering whether it makes sense for RLP at all, or only for packets.
If there is a neat way to describe RLP with ABNF, let's go for it. Right now we use [ x, y, ... ] notation for RLP lists, and the square brackets mean recursive encoding. This notation works most of the time, but cannot describe the cases where we want to concatenate multiple RLP-encoded values. I used notation like rlp_bytes(x) for that (see here), but it doesn't look nice.
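To make the difference between the two cases concrete, here is a minimal sketch of an RLP encoder in Python. The function names (`rlp_encode`, `_length_prefix`) are made up for this illustration and are not taken from any devp2p implementation. It shows that `[x, y]` produces one list item with its own header, while concatenating `rlp_bytes(x)` and `rlp_bytes(y)` produces two back-to-back items with no enclosing header.

```python
def _length_prefix(length: int, offset: int) -> bytes:
    """Build the RLP header for a payload of the given length."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode a byte string or a (nested) list of byte strings as RLP."""
    if isinstance(item, (bytes, bytearray)):
        data = bytes(item)
        # A single byte below 0x80 is its own encoding.
        if len(data) == 1 and data[0] < 0x80:
            return data
        return _length_prefix(len(data), 0x80) + data
    if isinstance(item, (list, tuple)):
        # Lists are encoded recursively: header + concatenated element encodings.
        payload = b"".join(rlp_encode(element) for element in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError(f"cannot RLP-encode {type(item).__name__}")


x, y = b"cat", b"dog"

# The [x, y] notation: one list item wrapping both elements.
as_list = rlp_encode([x, y])

# The rlp_bytes(x) || rlp_bytes(y) case: two items, no list header.
concatenated = rlp_encode(x) + rlp_encode(y)

print(as_list.hex())       # c88363617483646f67
print(concatenated.hex())  # 8363617483646f67
```

The only difference in the output is the leading list header byte, which is exactly what the bracket notation cannot leave out.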
@fjl might make sense to first describe the packets and then attempt to do RLP later on. Would be a good first step imo.
Yes, sounds good to me. Maybe pick one of the specs and convert it to ABNF so we can see what that looks like.
I've looked at a bunch of ways to describe binary data in the last couple of weeks, and the option I liked the most is the notation used in the QUIC spec drafts: https://quicwg.org/base-drafts/draft-ietf-quic-transport.html#section-1.3
This notation works great for binary layout descriptions. I like it because it's very 'vertical', unlike the ABNF variants, which are closer to the 'concatenation formula' style we have now. Example:
Example Structure {
  One-bit Field (1),
  7-bit Field with Fixed Value (7) = 61,
  Field with Variable-Length Integer (i),
  Arbitrary-Length Field (..),
  Variable-Length Field (8..24),
  Field With Minimum Length (16..),
  Field With Maximum Length (..128),
  [Optional Field (64)],
  Repeated Field (8) ...,
}
While this works for many things, we still need to keep the formula-style notation for crypto pseudocode.
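As a rough illustration of how directly the QUIC-style layout maps to parsing code, here is a sketch in Python that reads only the first three fields of the example structure: the one-bit field, the 7-bit field fixed to 61, and the variable-length integer marked `(i)`. The variable-length integer rule (top two bits of the first byte select a total length of 1, 2, 4 or 8 bytes) is the one defined by the QUIC transport spec; the function names and the returned dictionary are assumptions made for this example.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a QUIC variable-length integer starting at buf[pos].

    Returns (value, next_position).
    """
    if pos >= len(buf):
        raise ValueError("buffer too short")
    first = buf[pos]
    length = 1 << (first >> 6)  # top two bits select 1, 2, 4 or 8 bytes
    if pos + length > len(buf):
        raise ValueError("truncated variable-length integer")
    value = first & 0x3F        # remaining six bits start the value
    for b in buf[pos + 1 : pos + length]:
        value = (value << 8) | b
    return value, pos + length


def parse_example_prefix(buf: bytes) -> dict:
    """Parse the first three fields of 'Example Structure' (hypothetical helper)."""
    if not buf:
        raise ValueError("buffer too short")
    one_bit = buf[0] >> 7       # One-bit Field (1), most significant bit first
    fixed = buf[0] & 0x7F       # 7-bit Field with Fixed Value (7) = 61
    if fixed != 61:
        raise ValueError(f"fixed field is {fixed}, expected 61")
    number, pos = read_varint(buf, 1)  # Field with Variable-Length Integer (i)
    return {"one_bit": one_bit, "number": number, "rest": buf[pos:]}


parsed = parse_example_prefix(bytes([0xBD, 0x25]) + b"tail")
print(parsed)  # {'one_bit': 1, 'number': 37, 'rest': b'tail'}
```

Each line of the layout description corresponds to one step in the parser, which is what the 'vertical' style buys over a concatenation formula.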
We also still need a sane way to describe RLP structures. The notation we currently use for RLP looks like this:
x = [list-elem, list-elem, [sublist-elem, ...]]
It looks very clean, but it is always a bit of a challenge to work with because there is no good way to put type/size information into this notation. We also have an RLP notation with types, which we use in the eth protocol spec:
hello = [protocolVersion: P, networkId: P, td: P, bestHash: B_32, genesisHash: B_32, forkID]
But that one was never formally described anywhere and I don't even remember what all the letters mean.
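For discussion, here is a sketch of what a machine-checkable version of the typed notation could look like in Python. The meanings of the letters are assumptions, since they are not formally defined: `P` is read here as a scalar (a non-negative integer encoded as big-endian bytes without leading zeros) and `B_32` as a byte string of exactly 32 bytes. The schema layout and the `check` helper are hypothetical, and `forkID` is left unchecked because the notation gives it no type.

```python
def is_scalar(value) -> bool:
    """Assumed meaning of 'P': big-endian integer bytes with no leading zero."""
    return isinstance(value, bytes) and (len(value) == 0 or value[0] != 0)


def is_bytes32(value) -> bool:
    """Assumed meaning of 'B_32': a byte string of exactly 32 bytes."""
    return isinstance(value, bytes) and len(value) == 32


# Hypothetical schema mirroring the hello example; None means "no type given".
HELLO_SCHEMA = [
    ("protocolVersion", is_scalar),
    ("networkId", is_scalar),
    ("td", is_scalar),
    ("bestHash", is_bytes32),
    ("genesisHash", is_bytes32),
    ("forkID", None),
]


def check(schema, values) -> list:
    """Return the names of fields in a decoded RLP list that violate the schema."""
    if len(values) != len(schema):
        return ["<wrong number of elements>"]
    errors = []
    for (name, predicate), value in zip(schema, values):
        if predicate is not None and not predicate(value):
            errors.append(name)
    return errors


good = [b"\x40", b"\x01", b"\x02\x00", bytes(32), bytes(32), [b"\x00" * 4, b""]]
bad = [b"\x40", b"\x01", b"\x00\x02", bytes(31), bytes(32), [b"\x00" * 4, b""]]

print(check(HELLO_SCHEMA, good))  # []
print(check(HELLO_SCHEMA, bad))   # ['td', 'bestHash']
```

Whatever notation ends up in the specs, being able to derive a check like this from it would be one way to test that the type/size information is actually unambiguous.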