interface-types
Should snowman-bindings check unions?
In the snowman-bindings presentation at the June meeting, one of the proposed bindings is for enum/union/sum types.
I'm wondering, should these types be statically guaranteed to be valid?
For instance, if a function takes a bool as a parameter, where bool is defined as an enum of 0 or 1, can that function safely assume that the parameter will always be either 0 or 1, even when being called by malicious code?
If so, how do you think this guarantee could be enforced?
I think a ☃-binding type should precisely define a (possibly infinite) set of valid values, and the semantics/implementation would guarantee that only one of those valid values was actually passed. Thus, if we have a bool ☃-binding type, it can only take a true or false, and it's up to the binding expressions to map to and from these two boolean values to (presumably) a wasm i32. On the producing end, an i32-to-bool binding operator would presumably map all non-zero values to true. On the consuming end, a bool-to-i32 binding operator would presumably map to either 0 or 1.
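The two bool operators described above can be sketched in C++ as follows. This is only an illustration: the function names i32_to_bool, bool_to_i32 and normalize_bool are hypothetical and are not operators defined by the proposal.

```cpp
#include <cstdint>

// Hypothetical model of the producing-side operator:
// every non-zero i32 is mapped to true, zero to false.
constexpr bool i32_to_bool(std::int32_t raw) {
    return raw != 0;
}

// Hypothetical model of the consuming-side operator:
// the callee only ever receives 0 or 1.
constexpr std::int32_t bool_to_i32(bool value) {
    return value ? 1 : 0;
}

// Composing the two operators normalizes whatever i32 the caller supplied.
constexpr std::int32_t normalize_bool(std::int32_t raw) {
    return bool_to_i32(i32_to_bool(raw));
}

static_assert(normalize_bool(0) == 0, "zero stays zero");
static_assert(normalize_bool(1) == 1, "one stays one");
static_assert(normalize_bool(-7) == 1, "any non-zero value becomes one");
```

Under this sketch, a callee behind the pair of operators cannot observe any value other than 0 or 1, whatever the caller passes.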
Yes. I think that this is right for booleans. There are other cases to consider: optional types, promises, days of the week, error codes...
For instance, if a function takes a bool as a parameter, where bool is defined as an enum of 0 or 1, can that function safely assume that the parameter will always be either 0 or 1, even when being called by malicious code?
Yes. For enum types, e.g. bool, the binding generated by the host needs to either assert that the input matches, or coerce the value. So the caller could have an i32-to-enum-assert coercion that would say assert(val >= 0); assert(val <= 1); whereas a different caller could have an i32-to-enum-clamp coercion that would say val = min(max(val, 0), 1);
In either case, the callee will only see 0s or 1s.
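As a rough sketch of those two strategies, generalized from bool to an enum with variant_count variants: the function names and the variant_count parameter are assumptions made for illustration, and a C++ exception stands in for what would presumably be a trap in a real host.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>

// "Assert" strategy (hypothetical): reject any value outside [0, variant_count).
// A real host would presumably trap; throwing is just a stand-in here.
std::int32_t i32_to_enum_assert(std::int32_t val, std::int32_t variant_count) {
    if (val < 0 || val >= variant_count) {
        throw std::out_of_range("enum value outside declared range");
    }
    return val;
}

// "Clamp" strategy (hypothetical): force the value into [0, variant_count - 1].
// Assumes variant_count >= 1.
std::int32_t i32_to_enum_clamp(std::int32_t val, std::int32_t variant_count) {
    const std::int32_t last = variant_count - 1;
    return std::min(std::max(val, std::int32_t{0}), last);
}
```

With variant_count set to 2 this reproduces the bool case: the clamp variant turns 5 into 1 and -3 into 0, while the assert variant rejects both.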
For more general ADTs, such as an Either a b = Left a | Right b, that could desugar to a C++-style:
enum EitherTag { Left, Right };
template <typename A, typename B>
struct Either {
    EitherTag tag;
    union {
        A leftValue;
        B rightValue;
    };
};
and thus use a similar i32-to-enum binding strategy on the tag field. The ADT binding operator would need to do some handwavium (specifically, also read the type tag) to determine whether to use a (user-defined) A-to-ref or B-to-obj binding, but in principle it's the same logic.
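A minimal sketch of that tag-then-payload logic, assuming for simplicity that both payloads arrive as an i32 and that the bindings produce strings. The functions a_to_ref, b_to_obj and bind_either are hypothetical stand-ins, not part of any proposal, and the exception again stands in for a trap.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

enum class EitherTag : std::int32_t { Left = 0, Right = 1 };

// Validate the raw tag first, using the "assert" strategy on the tag field.
EitherTag i32_to_either_tag(std::int32_t raw) {
    if (raw != 0 && raw != 1) {
        throw std::out_of_range("invalid Either tag");
    }
    return static_cast<EitherTag>(raw);
}

// Hypothetical stand-ins for the user-defined A-to-ref and B-to-obj bindings.
std::string a_to_ref(std::int32_t payload) { return "ref:" + std::to_string(payload); }
std::string b_to_obj(std::int32_t payload) { return "obj:" + std::to_string(payload); }

// The ADT binding reads the checked tag, then picks the payload binding.
std::string bind_either(std::int32_t raw_tag, std::int32_t payload) {
    switch (i32_to_either_tag(raw_tag)) {
        case EitherTag::Left:  return a_to_ref(payload);
        case EitherTag::Right: return b_to_obj(payload);
    }
    throw std::logic_error("unreachable: tag was already validated");
}
```

The point of the sketch is only the ordering: the tag is validated before any payload binding is chosen, so an invalid tag never reaches the A or B binding.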
Note: I'm not saying we need two i32-to-enum bindings, but there are (at least) two strategies that preserve the guarantee on the callee's side that the values are valid, enforceable by the host inserting dynamic checks.
For more general ADTs, such as an Either a b = Left a | Right b, that could desugar to a C++-style:
enum EitherTag { Left, Right };
template <typename A, typename B>
struct Either {
    EitherTag tag;
    union {
        A leftValue;
        B rightValue;
    };
};
I don't disagree with you, but I wonder, if a function passes a range of enums, does the host check all of them? That might be an undesirable hidden cost. Similarly, in the example above, does the host check the validity of A or B (depending on tag)?
I wonder if you couldn't directly declare the variables as enums on the stack, so most of the checks can be performed at compile-time instead of run-time.
Note: I'm not saying we need two i32-to-enum bindings, but there are (at least) two strategies that preserve the guarantee on the callee's side that the values are valid, enforceable by the host inserting dynamic checks.
I vote for trapping all the time. It's simpler that way, and if the compiler wants the other strategy, it can always perform the modulo itself; the host may or may not elide the runtime check.
On Thu, Jul 25, 2019 at 2:52 AM Olivier FAURE [email protected] wrote:
For more general ADTs, such as an Either a b = Left a | Right b, that could desugar to a C++-style:
enum EitherTag { Left, Right };
template <typename A, typename B>
struct Either {
    EitherTag tag;
    union {
        A leftValue;
        B rightValue;
    };
};
I don't disagree with you, but I wonder, if a function passes a range of enums, does the host check all of them? That might be an undesirable hidden cost. Similarly, in the example above, does the host check the validity of A or B (depending on tag)?
a. The 'host' may generate code to handle enums. However, this is not WebIDL :) Remember that the host only 'generates' code for pairs of coercions; so, if such a pair can be implemented in a pass-through way, that is what will happen.
b. When you have discriminated unions like the Either type, there has to be a way for the client (provider) to signal to the host which variant of the union is active. There will likely be a coercion operator that allows the client (provider) to do that. In general, this will not be done by the host inspecting the value in some quasi-magical way.
I wonder if you couldn't directly declare the variables as enums on the stack, so most of the checks can be performed at compile-time instead of run-time.
Note: I'm not saying we need two i32-to-enum bindings, but there are (at least) two strategies that preserve the guarantee on the callee's side that the values are valid, enforceable by the host inserting dynamic checks.
It is not necessary to declare the enums. In fact, I would not recommend a dynamic approach: the coercion operator has to be able to deterministically determine how to map the enum.
I vote for trapping all the time. It's simpler that way, and if the compiler wants the other strategy, it can always perform the modulo itself; the host may or may not elide the runtime check.
This may be an MVP/later issue. It depends on how quickly we proceed compared with the exceptions proposal.
-- Francis McCabe SWE