ygot
ygot copied to clipboard
Should Binary be string instead of []byte?
[]byte is unhashable, which means it currently causes the code to not compile when it's the key of a list. However, the new union API had wanted to support this partially by implementing union methods for the pointer receiver of Binary. This creates an inconsistency, and this "partial" support is really the wrong support, because pointer receiver doesn't work for maps because it would hash the pointer value instead of the underlying bytes. I see two options looking forward:
- []byte is really just a string. So, change the definition of
Binarytotype Binary string. This solves the hashability problem. I'll need to look into what changes this entails, but I feel it should be a feasible change. - If we're not going to support
binarybeing part of YANGlists, then we should use the value receiver when makingBinaryimplement unions (as part of the new union API). This makes the behaviour consistent between single-keyed lists, multi-keyed lists, and union-keyed lists.
Is this similar to #426 ?
Yep, thanks for linking
From the ideas in the link as well as here it looks like using string to represent binary seems the most feasible.
I think it's worth considering which one of these is more suitable:
type Binary stringtype Binary struct { value string }(inspiration fromstrings.Builder)func (Binary) Set([]byte)func (Binary) Append([]byte)func (Binary) AppendByte(byte)func (Binary) Bytes() []bytefunc (Binary) Len() intfunc (Binary) Reset()
The problem with using string directly is that in Go,
- A string literal, absent byte-level escapes, always holds valid UTF-8 sequences.
- No guarantee is made in Go that characters in strings are normalized.
By forcing the user to work with []byte, we always remind them that they're working with bytes, and not UTF-8 strings, which at the expense of ease-of-use, avoids misuse due to the above two issues.
To put to rest using a fixed-sized array, i.e. type Binary [N]byte:
https://datatracker.ietf.org/doc/html/rfc7950#section-9.4.4
An implementation is not required to support a length value larger than 18446744073709551615.
So to make this more concrete, I think the potential downside to using a string as the underlying type for Binary is that they can only use this pattern to loop through:
const nihongo = "日本語"
for i, w := 0, 0; i < len(nihongo); i += w {
runeValue, width := utf8.DecodeRuneInString(nihongo[i:])
fmt.Printf("%#U starts at byte position %d\n", runeValue, i)
w = width
}
instead of the wrong range pattern:
const nihongo = "日本語"
for index, runeValue := range nihongo {
fmt.Printf("%#U starts at byte position %d\n", runeValue, index)
}
The question here is whether this change is worth the hashability gain we get, which allows us to have binary as list keys.
As I started to implement and fix the tests the list of requirements became clearer. The type must have the following properties:
Requirement List
- comparable (aka. hashable)
- stored/passed around by value (in order to take advantage of comparability)
- This can be violated, but if so, a special union type must be used to avoid misuse, since we want to avoid the pointer value being assigned to a union interface representing a list key.
- value is referrable by an interface (in order to be used directly in a union)
- This can be worked around by using a special union type, in the likes of
UnionUint32,UnionString, etc.
- This can be worked around by using a special union type, in the likes of
- methods (e.g.
Set) can modify the type - value can indicate non-presence
Based on these requirements, I have a detailed proposal below.
Potential Problems with Proposal:
s, being private, makes it harder to usecmp.Diffandcmp.Equal(needAllowExported), thoughreflect.DeepEqualis not affected.type BinaryType interfaceis used as the type for each simple binary field, requiring the user to call.Binary()before they can access the underlyingBinarytype, which can be annoying.
// BinaryType is used as the type for each simple "binary" leaf field. This is done in order to indicate
// non-presence for leaf fields.
type BinaryType interface {
Binary() Binary
Bytes() []byte
Len() int
}
type Union interface {
IsUnionType()
}
func (b Binary) IsUnionType() {
}
type S struct {
b BinaryType `path:"state/b"`
bs []Binary `path:"state/bs"`
u map[Union]string `path:"state/u"`
ub Union `path:"state/ub"`
}
// Binary is a string-based, hashable type used to represent the YANG binary type.
type Binary struct {
s string
}
func MakeBinary(bs []byte) Binary {
return Binary{string(bs)}
}
func (b Binary) Binary() Binary {
return b
}
func (b *Binary) Set(bytes []byte) {
if b != nil {
b.s = string(bytes)
}
}
func (b *Binary) Append(bytes []byte) {
if b != nil {
b.s += string(bytes)
}
}
func (b *Binary) AppendByte(char byte) {
if b != nil {
b.s += string(char)
}
}
// NOTE: I'm a value receiver!
func (b Binary) Bytes() []byte {
return []byte(b.s)
}
// NOTE: I'm a value receiver!
func (b Binary) Len() int {
return len(b.s)
}
func (b *Binary) Reset() {
if b != nil {
b.s = ""
}
}
https://go.dev/play/p/7ACc3_GPNtB (this example exercises this API)
If you define an Equal method for Binary as in https://go.dev/play/p/MPwL_KjXQOF then you can use cmp.Equal without any other arguments. cmp will use an Equal method for a type if one is available.
I would also suggest that we rename MakeBinary to NewBinary since I think this is the usual pattern for "create me an X" methods.
I like this design though, it seems to work nicely and give us a lot of flexibility.
Thanks for the tip on Equal!
The reason I'm using MakeBinary is that it doesn't return a pointer. New in Go usually returns a pointer (e.g. new(uint32), whereas make() is used to instantiate things that's not a pointer.
Hi @wenovus, is there a plan to commit this change in the near future?
I am also interested in whether we can help move this issue to a PR and help testing.
I hope to take a look next month. Does that work for you?
Thanks @wenovus , looking forward to it.