ion-docs icon indicating copy to clipboard operation
ion-docs copied to clipboard

Clarify status of NOP Padding in Binary-encoded sorted structs

Open PeytonT opened this issue 4 years ago • 2 comments

Related to #107.

From http://amzn.github.io/ion-docs/docs/binary.html#13-struct

When L is 1, the struct has at least one symbol/value pair, the length field exists, and the field name integers are sorted in increasing order.

NOP Padding in struct values requires additional consideration of the field name element. If the “value” of a struct field is the NOP pad, then the field name is ignored. This means that it is not possible to encode padding in a struct value that is less than two bytes. Implementations should use symbol ID zero as the field name to emphasize the lack of meaning of the field name.

Should the requirement that the field name integers of a sorted struct be sorted apply before or after ignoring the field name of a field with a NOP pad "value"? The fact that the first clause refers to "field name integers" and the second clause refers to "the field name" suggests that the instruction in the second clause applies after symbol resolution. This suggests that the field name integers of NOP pad "values" are still required to be sorted in increasing order.

This can in general be achieved by ignoring the "should" guidance and using appropriately sized field name integers. Perhaps that is why this is a "should" and not a "must"?

PeytonT avatar Mar 10 '20 05:03 PeytonT

I don't think there's a reason to have a special case for NOP padding here. In other words, the fields of NOP pad "values" must still be sorted properly.

tgregg avatar Mar 11 '20 19:03 tgregg

That seems reasonable, and is the impression I get from the spec. But it has the bizarre consequence that writers "should" not insert NOP padding anywhere other than the beginning of an ordered struct, since to do otherwise would require using field name integers other than zero, which they "should" not do.

Since the bizarre consequence is the consequence of this statement, which doesn't seem to have any other purpose, perhaps it can simply be removed?

Implementations should use symbol ID zero as the field name to emphasize the lack of meaning of the field name.

It's a "should" guidance, so writers are already free to ignore it and readers must not assume it, and it doesn't seem to confer any optimization potential in well-behaved systems.

PeytonT avatar Mar 12 '20 02:03 PeytonT