memory64 icon indicating copy to clipboard operation
memory64 copied to clipboard

BigInt parameters for the JS API?

Open bvisness opened this issue 1 year ago • 8 comments

As I finalize our implementation of table64, I'd like some clarity on the JS API. Some previous comments (e.g. https://github.com/WebAssembly/memory64/issues/46#issuecomment-2072933689, https://github.com/WebAssembly/memory64/issues/46#issuecomment-2072979502) have suggested that .get, .set should accept both number and BigInt for their indices.

However, I'd like to clarify a couple things:

  1. i64 params must be BigInt (per ToWebAssemblyValue), so should BigInt be required for Table methods as well?
  2. What should Table.grow take? Memory.grow currently takes a number, even for memory64, despite memory.grow taking an i64. Perhaps this is just because the param is in units of pages? (But, if custom page sizes are coming down the pipe, this reasoning won't hold up for long.)

My suggestion would be for all of Table.get, Table.set, Table.grow, and Memory.grow to require BigInt when the index (address :slightly_smiling_face:) type is i64. This is consistent with ToWebAssemblyValue(v, i64), and therefore consistent with the core wasm versions of these instructions.

Altering Memory.grow should also be backwards-compatible at this stage since memory32 is unaffected - although it may cause slight difficulty for toolchains in practice. Worth it imo (but I am not a toolchain author).

bvisness avatar Jun 28 '24 22:06 bvisness

The interaction of this proposal with custom page sizes is a good point! I think we have to make memory.grow/size use the index type to count pages in this proposal, otherwise it will not be forward-compatible with custom page sizes, and impossible to fix later. I opened #69 for that.

rossberg avatar Jun 29 '24 05:06 rossberg

In terms of the webidl I think we can switch from unsigned long to unsigned long long. See #70.

I'm not sure how/where we can make the argument type dependent on the type of the memory/table in the JS API, or if that is something that is needed here.

sbc100 avatar Jul 01 '24 22:07 sbc100

In terms of the webidl I think we can switch from unsigned long to unsigned long long. See #70.

I'm not sure how/where we can make the argument type dependent on the type of the memory/table in the JS API, or if that is something that is needed here.

Conversions between JS values and WebIDL unsigned long long are only guaranteed to preserve the same integer up to MAX_SAFE_INT (2^53). I'm not sure exactly why, but bigint is not allowed in that conversion.

I don't know if we expect 2^53 to be a problem or not, but if we expect it to we should probably use bigint instead of unsigned long long. But that may not be backwards compatible either for property getters, unless we create new ones.

eqrion avatar Jul 02 '24 21:07 eqrion

Do you know if there is some way to represent polymorphic methods such as get and set that can take either an unsigned long or bigint depending on the type of the receiver?

I think in practice just allowing 2^53 integer values is also fine too (and we could add the bigint forms later if we decide they really are needed).

sbc100 avatar Jul 02 '24 21:07 sbc100

Do you know if there is some way to represent polymorphic methods such as get and set that can take either an unsigned long or bigint depending on the type of the receiver?

I think in practice just allowing 2^53 integer values is also fine too (and we could add the bigint forms later if we decide they really are needed).

I believe it's valid to do unsigned unsigned long or bigint (using unions) in WebIDL, but that will allow either kind of value always (not depending on the memory type).

It also has a scary warning not to do this if you scroll down a bit. But I don't know why that is.

eqrion avatar Jul 02 '24 21:07 eqrion

That is a scary warning. It does seems reasonable to allow both numbers and BigInt. For example, .get() and .set() on a table are logically equivalent to accessing elements in a typed array which allows both number and bigint indexes.

sbc100 avatar Jul 03 '24 15:07 sbc100

For now I'm going to proceed with JS numbers instead of BigInts, but I think we will need to carefully consider this question before phase 4.

Tables are really less of a concern here, since the defined implementation limit for table size is a mere 10M. I doubt that we will feel much pressure to increase that limit in the foreseeable future, much less above 2^32.

Memories are the real issue, since there is a possible future where browsers support both large 64-bit memories and a custom page size of 1 byte. The size and grow methods might then be in i64 territory. However, since MAX_SAFE_INTEGER is 2^53-1, it still may be reasonable to assume that browsers will never allow memories to be large enough to cause problems. (Perhaps there are non-web users of the JS API who would prefer to have memories that large? But with most consumer address spaces being 48 bits in practice, this may still be too remote a concern.)

I admit I'm confused why it would be a problem to simply change the parameter type according to the index type of the memory or table - from what I can tell, it's purely a restriction of WebIDL? I still would personally prefer for all JS API parameters to be consistent with ToWebAssemblyValue, so if the WebIDL can somehow express this, that would get my vote.

(Part of the reason I would prefer to stick with ToWebAssemblyValue is that new proposals like custom page sizes can occasionally come up and cause problems due to the divergence between core wasm and the JS API. Imagine if that proposal had come along after Memory64 reached phase 4!)

bvisness avatar Jul 08 '24 20:07 bvisness

I guess the risk with going with JS numbers for now is that we would need to continue to support that going forward, even if we later decide to go with bigint in the case of 64-bit indexes tables/memories. To me, it does seem reasonable to be able to use either JS numbers of bigint values, but that does go against current pattern in JS of not allowing these two types interchangeably.

For example, it seems perfectly reasonable to be able to write memory.grow(1) or memory.grow(1n) to grow the memory by a single page, even for 32-bit memories. Just like today I can do do memory.buffer[0n] or memory.buffer[0] to access the first byte of a 32-bit memory.

sbc100 avatar Jul 08 '24 20:07 sbc100

We still have concerns about this issue, and it seems to me that there are three options:

  1. Accept only numbers
  2. Accept only BigInts
  3. Accept both numbers and BigInts

I think 3 is clearly a bad decision, at least for now. The question of accepting numbers for i64 parameters was discussed at length in https://github.com/WebAssembly/JS-BigInt-integration/issues/12, and the conclusion at the time was "Authors are expected to use bigint consistently when they're planning to use an i64 API." I am personally sympathetic to option 3 (I agree with Jakob Kummerow's analysis), but that is not how things work today. I would not want to accept option 3 without accepting it for all i64 parameters in the JS API.

I think 1 is also a bad decision, for the same consistency reason expressed above: "Authors are expected to use bigint consistently when they're planning to use an i64 API." Under option 1, using the JS API would require users to convert back and forth between BigInts and numbers seemingly randomly, depending on which APIs they happen to use. Furthermore, it seems crazy to reject BigInts for i64 parameters, knowing full well that JS code will likely already be working with BigInts if using memory64.

Therefore, I think option 2 is the right choice. We should only accept BigInts, and reject numbers, for 64-bit memories and tables.


The question, then, is simply how to express option 2 in the spec. Accepting BigInts requires us to somehow either overload the affected functions or use a union on each parameter. The WebIDL spec expressly forbids overloading where the only distinction is a BigInt and a number, so overloading is not feasible. A union of a BigInt and a numeric type is valid, but comes with a scary warning.

However, after re-examining the warning, it seems that using a union in this case would be totally fine. Option 2 would not require numbers to be converted to BigInts; it simply proposes that various APIs would accept either numbers (for memory32) or BigInts (for memory64) and trap if the incorrect type is received. Therefore there is no concern about precision.

So, I think the following pattern would work for all affected APIs:

any get(([EnforceRange] unsigned long long or bigint) index);

I would be happy to submit a PR that makes this change across the board, although I might need some guidance updating the rest of the spec text.

bvisness avatar Jul 29 '24 21:07 bvisness

@bvisness, thank you for the detailed analysis. I have been thinking along the same lines. i.e. That (2) is the way to go. Its unfortunate since it will break existing code such as that in emscripten, but I arguments are fairly strong to going in this direction.

It was pointed out to me that the current spec for table.get and table.set in the JS API might be lacking calls to ToWebAssemblyValue for the index argument. I think we should probably submit a change to main spec repo to add those then then downstream in this proposal we can make the target type conditional on the type of the table.

Any PRs, or html with these changes would be most welcome. I will try to update emscripten to include some kind behaviour detection so that we its output can function both before and after we make this change.

sbc100 avatar Jul 30 '24 18:07 sbc100

I'll take a crack at it and get back to you.

For what it's worth, I would really prefer option 3 in the long term. But, I think it would be best to ship memory64, and then pursue a new proposal to relax i64 parameters in the JS API.

bvisness avatar Jul 30 '24 20:07 bvisness

It was pointed out to me that the current spec for table.get and table.set in the JS API might be lacking calls to ToWebAssemblyValue for the index argument.

I don't think this is necessary in the current spec, because the abstract WebAssembly spec unconditionally requires a u32 (link), and the index parameter is specified as [EnforceRange] unsigned long. It's not like the ref parameter on table.set, which is any in WebIDL and therefore needs ToWebAssemblyValue to constrain it.

bvisness avatar Jul 31 '24 18:07 bvisness

the abstract WebAssembly spec unconditionally requires a u32 (link)

Ah, but this actually is an oversight. @sbc100, I think we need to update the embedder spec to use u64 for all indices.

rossberg avatar Jul 31 '24 18:07 rossberg

Assuming it doesn't need to be in the main spec today, I can do that as part of my other memory64 spec updates.

bvisness avatar Jul 31 '24 19:07 bvisness