gc Reserve 0 byte in all declarations and accesses

We've learned the lesson from both positive and negative examples that future proposals often need to add things to existing constructs. Where we've had the foresight to do so, a 0 byte has been extremely helpful.

I'd like to propose that all declarations in this proposal (i.e. arrays and structs) have an additional 0 byte in the binary encoding for future extensibility. We may also consider the question of whether instructions that access them should ever require extensibility, and thus also need a 0 byte.
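
To make the idea concrete, here is a minimal sketch of how a decoder might treat such a reserved byte. The layout is an assumption made only for illustration (reserved byte first, then an LEB field count, then one-byte type and mutability codes per field); it is not the proposal's actual binary format, and `read_struct_decl` is a hypothetical name.

```python
from typing import List, Tuple


def read_leb_u32(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_struct_decl(buf: bytes, pos: int) -> Tuple[List[Tuple[int, int]], int]:
    """Hypothetical layout: reserved byte, field count, (type, mutability) pairs."""
    reserved = buf[pos]
    pos += 1
    if reserved != 0:
        # A future extension could give nonzero values a meaning;
        # a decoder that predates it can only reject them.
        raise ValueError(f"unsupported reserved byte: {reserved:#x}")
    count, pos = read_leb_u32(buf, pos)
    fields = []
    for _ in range(count):
        # Assumption: one-byte type code followed by one-byte mutability flag.
        fields.append((buf[pos], buf[pos + 1]))
        pos += 2
    return fields, pos
```

Under this sketch, today's decoders accept only a 0 in that position, so a later proposal could assign meaning to other values without changing the type code.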

Jun 02 '21 15:06 titzer

The alternative would be to plan to use different type codes or operation codes for future extended versions. As far as I can see, the main drawback of this alternative is that according to our current conventions, that would mean we would have to come up with new names for the extended versions, while using a reserved zero byte would not require new names. But that consideration is mostly cosmetic. Does reserving zero bytes have other advantages over extension via new opcodes?

Jun 02 '21 17:06 tlively

What effect would this have on code size?

Jun 02 '21 17:06 fgmccabe

Reserving zero bytes would be a strict increase in code size, of course. However, I think it would be small, on the order of a couple percent for a module that uses GC, even heavily. E.g. a struct declaration has at minimum an LEB followed by N (type, mutability) field declarations. For, say, a struct with 4 fields, it will add 1 byte to a 9+ byte encoding. But empirically modules are 90% code. It would increase the size of a struct.get instruction (minimum 4 bytes) by 1 byte. We'd have to measure for gc-heavy programs what proportion of instructions those represent.
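
The per-item arithmetic above can be checked with a few lines. This is only a back-of-the-envelope sketch: it assumes a one-byte LEB field count and one byte each for a field's type and mutability (which is where the 9-byte minimum for 4 fields comes from), and it takes the 4-byte struct.get minimum as given.

```python
def struct_decl_size(num_fields: int) -> int:
    """Assumed minimum size: 1-byte LEB count plus 2 bytes per field."""
    return 1 + 2 * num_fields


STRUCT_GET_MIN = 4  # minimum struct.get size quoted in the comment above

decl = struct_decl_size(4)
# Size before, size after adding one reserved byte, and relative growth.
print(decl, decl + 1, f"{1 / decl:.1%}")
print(STRUCT_GET_MIN, STRUCT_GET_MIN + 1, f"{1 / STRUCT_GET_MIN:.1%}")
```

The relative growth of an individual declaration or accessor is noticeable, so the module-level cost depends on how much of a module those items make up, which is why it has to be measured.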

I think zero bytes have the advantage that they keep the opcode space and declaration encoding space better organized. In decoders, their effect depends on exactly how the code is organized. For (in-place) interpreters, it is a tradeoff between having more entries / handler versions in a dispatch table versus having a branch or skipping the byte.
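
A rough sketch of that tradeoff, under loud assumptions: the opcode numbers are made up, operands are single bytes rather than LEBs, any opcode prefix is omitted, and `struct.get_ext` is an imagined future variant rather than a real instruction.

```python
# Hypothetical opcode numbers, chosen only for illustration.
OP_STRUCT_GET = 0x03
OP_STRUCT_GET_EXT = 0x43  # imagined future "extended" variant


def handle_with_reserved_byte(code: bytes, pc: int):
    """Style A: one handler; the reserved byte costs a check (or a skip)."""
    flags = code[pc + 1]
    if flags != 0:
        raise ValueError("unknown extension flags")
    type_index, field_index = code[pc + 2], code[pc + 3]
    return ("struct.get", type_index, field_index), pc + 4


def handle_plain(code: bytes, pc: int):
    """Style B: no reserved byte; operands follow the opcode directly."""
    return ("struct.get", code[pc + 1], code[pc + 2]), pc + 3


def handle_extended(code: bytes, pc: int):
    """Style B: the future variant gets its own opcode and handler."""
    return ("struct.get_ext", code[pc + 1], code[pc + 2], code[pc + 3]), pc + 4


# Style B grows the dispatch table instead of branching inside a handler.
DISPATCH_B = {
    OP_STRUCT_GET: handle_plain,
    OP_STRUCT_GET_EXT: handle_extended,
}
```

In style A every access pays for one extra byte and a check or skip; in style B the cost moves into additional table entries and handler versions.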

Jun 02 '21 22:06 titzer

I measured the code size impact of adding a zero byte to every type declaration, adding a zero byte to every {struct,array}.{get*,set}, and adding both:

| Configuration | uncompressed | uncompressed ratio | gzip | gzip ratio | brotli | brotli ratio |
|---|---|---|---|---|---|---|
| no reserved bytes | 7523115 | 1 | 1935289 | 1 | 1291117 | 1 |
| type declaration byte | 7546501 | 1.003108553 | 1936486 | 1.000618512 | 1291685 | 1.000439929 |
| accessor byte | 7661635 | 1.018412586 | 1944626 | 1.004824602 | 1294665 | 1.002748008 |
| declaration + accessor bytes | 7685021 | 1.021521139 | 1945847 | 1.005455516 | 1296033 | 1.003807556 |
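
For readers who prefer percentages, the ratios can be recomputed from the raw sizes in the table. The sizes below are copied from the measurements above; the helper name `overhead_percent` is just an illustrative choice.

```python
BASELINE = {"uncompressed": 7523115, "gzip": 1935289, "brotli": 1291117}

VARIANTS = {
    "type declaration byte": {
        "uncompressed": 7546501, "gzip": 1936486, "brotli": 1291685,
    },
    "accessor byte": {
        "uncompressed": 7661635, "gzip": 1944626, "brotli": 1294665,
    },
    "declaration + accessor bytes": {
        "uncompressed": 7685021, "gzip": 1945847, "brotli": 1296033,
    },
}


def overhead_percent(size: int, baseline: int) -> float:
    """Size increase relative to the baseline, in percent."""
    return (size / baseline - 1) * 100


for name, sizes in VARIANTS.items():
    row = ", ".join(
        f"{fmt} +{overhead_percent(size, BASELINE[fmt]):.2f}%"
        for fmt, size in sizes.items()
    )
    print(f"{name}: {row}")
```

With both kinds of reserved byte, this works out to about 2.15% uncompressed, 0.55% with gzip, and 0.38% with brotli.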

As expected, the code size impact is small. That being said, I would prefer to introduce new opcodes for future types and instructions rather than using reserved bytes. There's also the problem of maintaining backward compatibility for function type declarations, which do not have reserved bytes.

Mar 03 '23 19:03 tlively

Reflecting on this, a zero byte might be a bit clunky, and we are generally inconsistent on using them these days. I'm OK closing this issue.

Mar 04 '23 02:03 titzer

Thanks to you both, @tlively, @titzer, I'm closing.

Mar 04 '23 10:03 rossberg