stringref icon indicating copy to clipboard operation
stringref copied to clipboard

stringref should be non-nullable by default?

Open wingo opened this issue 3 years ago • 4 comments

The only thing in this proposal that can usefully handle a null stringref is string.eq. This makes me think that the instructions in this proposal should generally consume and produce (ref string), not (ref null string).

It even makes me think that stringref should be a shorthand for (ref string) instead of (ref null string). However all the other shorthands are nullable. My impression is that this is because the other shorthands are abstract; you can use a non-nullable type when you have a little more information on where a value comes from and what you can do with it, but for e.g. externref you can't do anything with it anyway, so it can be nullable, why not. Also, these top types are appropriate for storage locations (table slots etc) which have to be nullable. So I see the reason for e.g. externref to be nullable but I don't necessarily think it applies to stringref, which is a concrete type.

Options:

  1. Leave as is: instructions consume nullable types. (Though, we recently changed instructions that produce stringref values to make non-nullable types; #42.)
  2. Change everything except string.eq to take (ref string), (ref stringview_wtf8), and so on: non-nullable types.
  3. (2), but also change stringref to be shorthand for (ref string) instead of (ref null string).
  4. (2), and remove the stringref (and stringview_wtf8, etc) shorthands. Just use the two-byte ref forms.

Thoughts?

Related to #42, #18, and #14.

wingo avatar Sep 13 '22 06:09 wingo

PR #48 chooses option (3). But we could just as well do any of the others. Interestingly it would seem that the examples in the overview are still well-typed with this change; the "init"-tracking of locals works out for us.

wingo avatar Sep 13 '22 07:09 wingo

Thinking about this a bit more, with regards to the dimensions of performance and binary size:

  • It doesn't matter if a stringref instruction consumes a nullable operand. If the generator of the wasm wants to use non-nullable strings, it can, and no null check will be emitted in that case. Or the generator can decide not to and it will get null checks. Not a huge deal.
  • Producing non-nullable types makes sense, because otherwise generators that want non-nullable types will have to downcast. Granted, an optimizing compiler will remove these, but it's annoying and adds to binary size.

Of course, it still doesn't make sense to consume a null operand :P I still have the feeling that if we have a shorthand, that it should nudge people in the right direction, and so stringref should be non-nullable.

wingo avatar Sep 13 '22 13:09 wingo

In the GC proposal we standardized on shorthands always denoting nullable references, so it would be good to be consistent with that choice here.

The GC proposal's operations also generally accept nullable inputs and produce non-nullable outputs (where possible), so it would be good (for both consistency and binary size) to do the same here.

tlively avatar Mar 02 '23 17:03 tlively

Related to the concept of "generally accept[ing] nullable inputs", it would be good to clarify whether the array parameters consumed by string.{new,encode}_*_array are nullable (I assume they are, but the current text in Overview.md doesn't make that clear).

jakobkummerow avatar Jun 27 '23 10:06 jakobkummerow