
What to do about maxByteLength for resizable buffers of WebAssembly.Memorys?

Open syg opened this issue 7 months ago • 25 comments

Problem

WebAssembly.Memory instances are getting a toResizableBuffer() method per #1292.

Resizable ArrayBuffers have a maxByteLength property. For 32bit memories, the maximum spec byte size of a WebAssembly.Memory is 65536 pages, or 2^32 bytes. For 64bit memories, the maximum size can easily exceed 65536 pages.

The problem is that any max size > 65535 pages is unrepresentable in 32 bits as a byte count. V8 and SpiderMonkey use size_t to represent ArrayBuffer lengths, and on 32bit platforms, size_t is only 32 bits.

For 32bit memories, the only unrepresentable value is the spec maximum of 65536 pages. For 64bit memories the problem is worse, as the whole point is to have larger memories.
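To make the overflow concrete, a small arithmetic sketch (plain JS, no proposed APIs involved):

```javascript
const PAGE_SIZE = 65536; // 2^16 bytes per Wasm page
const specMaxPages = 65536; // spec maximum page count for 32bit memories

// The spec maximum in bytes is exactly 2^32, one past the largest value a
// 32-bit size_t can hold (2^32 - 1).
const maxBytes = specMaxPages * PAGE_SIZE;
console.log(maxBytes === 2 ** 32); // true
console.log(maxBytes > 2 ** 32 - 1); // true
```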

Other interesting properties you might care about:

  • WebAssembly.Memory does not require a maximum for unshared memories; resizable ArrayBuffers always require a maximum
  • WebAssembly.Memory's maximum is not inspectable by user code; ArrayBuffers' maximum is inspectable

Possible solutions

Rejected proposals

1a. Clamp maxByteLength to a single, arch-independent value

If mem.toResizableBuffer() is called, clamp the returned ArrayBuffer's maxByteLength to some single, arch-independent value, like 65535 * 65536.

Pros

  • Architecture independent

Cons

  • Significantly reduces utility for 64bit memories on 64bit architectures
  • Mismatch with JS API where the maximum passed to the WebAssembly.Memory constructor differs from the maxByteLength property.

1b. Clamp maxByteLength on 32bit architectures only

If mem.toResizableBuffer() is called, clamp the returned ArrayBuffer's maxByteLength to 65535 * 65536 only on 32bit architectures.

Pros

  • Minimal change

Cons

  • Architecture dependent: usage of 64bit memories needs to be aware of whether it's running on a 32bit or 64bit machine. FWIW this awareness is already required for pure JS usage of resizable buffers.
  • Mismatch with JS API where the maximum passed to the WebAssembly.Memory constructor differs from the maxByteLength property.
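A minimal sketch of the clamping in 1a/1b (illustrative only; `clamp32` is a made-up helper name, not engine code):

```javascript
// Largest page-multiple byte length representable in a 32-bit size_t:
// 65535 pages * 65536 bytes/page = 2^32 - 2^16.
const CLAMP_BYTES = 65535 * 65536;

// Hypothetical helper: what a 32bit build would report as maxByteLength.
function clamp32(requestedMaxBytes) {
  return Math.min(requestedMaxBytes, CLAMP_BYTES);
}

console.log(clamp32(2 ** 32)); // 4294901760 (the spec max of 65536 pages, clamped)
console.log(clamp32(1024)); // 1024 (small maxima pass through unchanged)
```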

2a. Throw in toResizableBuffer() if maxByteLength exceeds a single, arch-independent limit

If mem.toResizableBuffer() is called, and mem's maximum byte size exceeds some single, arch-independent value, like 65535 * 65536, then throw a RangeError.

Pros

  • Sidesteps the question of matching the JS API

Cons

  • Significantly reduces utility for 64bit memories on 64bit architectures

2b. Throw in toResizableBuffer() if maxByteLength exceeds 2^32 on 32bit architectures only

If mem.toResizableBuffer() is called on 32bit architectures, and mem's maximum byte size exceeds 65535 * 65536, then throw a RangeError.

Pros

  • Sidesteps the question of matching the JS API

Cons

  • Architecture dependent

3. Change spec to throw early when constructing WebAssembly.Memorys with unsatisfiable maximum sizes

If a maximum is passed to WebAssembly.Memory that is always unsatisfiable, e.g. > 65536 pages on 32bit architectures, throw a RangeError. For backwards compatibility with existing code, 65536 still needs to be accepted.

Cons

  • Architecture dependent
  • Against the spirit of the Wasm design of max size (?)

4. Change implementations to accommodate size values > 2^32 - 1

Pros

  • Transparent

Cons

  • Complexity in runtime implementations, as lengths flow into many arithmetic operations, with possible knock-on effects (e.g. on 32bit architectures, optimizing tiers can no longer assume byte lengths from TypedArrays and ArrayBuffers are always 32bits)

5. Have the resizable buffer report an engine-determined maxByteLength

(suggested by @lukewagner)

If mem.toResizableBuffer() is called, the returned ArrayBuffer can have an implementation-defined maxByteLength that is smaller than the requested max size passed to the WebAssembly.Memory constructor. This aligns more closely with Wasm memories' concept of max as a hint that the engine can decrease as it probes.

Pros

  • Trivial implementation

Cons

  • Nondeterminism
  • Mismatch with pure JS usage of RABs. But there is an argument to be made that the interpretation of maximum is just different between pure JS uses of RABs and Wasm memories. Pure JS uses of RABs are varied, while the "have the engine probe for an actual maximum" heuristic makes sense if you're allocating a Wasm program's heap memory and you want as much as you can get.
  • Doesn't compose as nicely with the decision made in #1871. I think if the CG chooses this option, we should reverse the decision in #1871.

6. Have maxByteLength be Infinity for Wasm memories

(suggested by @eqrion)

If mem.toResizableBuffer() is called, the returned ArrayBuffer will have Infinity as the value for maxByteLength. Also reverse the decision in #1871.

Open question on what mem.type() would return.

Pros

  • Sidesteps the representation issues
  • Deterministic
  • Captures the difference in interpretation of "max" between Wasm memories and JS uses of resizable ABs

Cons

  • In some engines (V8), resizable ABs will still have the implementation constraint of needing to always grow in place. Infinity may give the wrong intuition.

7. "Lie" in the maxByteLength getter

(after discussion with @eqrion)

See https://github.com/WebAssembly/spec/issues/1895#issuecomment-2895078022 for details.

Pros

  • No representation issue for the actual max
  • Deterministic

syg avatar Apr 22 '25 20:04 syg

  • Architecture dependent: usage of 64bit memories needs to be aware of whether it's running on a 32bit or 64bit machine. FWIW this awareness is already required for pure JS usage of resizable buffers.

Can you explain why this is?

Also, what happens when I try to construct a RAB with maxByteLength >= 2^32 directly from JS? Is this behavior platform-dependent?

tlively avatar Apr 22 '25 20:04 tlively

Can you explain why this is?

Can you be more specific? What is "this"?

Also, what happens when I try to construct a RAB with maxByteLength >= 2^32 directly from JS? Is this behavior platform-dependent?

Well, by spec, it's implementation-defined. The JS spec actually pretends it has infinite resources, but has a special carveout for allocating ArrayBuffers that says "a. If it is not possible to create a Data Block block consisting of maxByteLength bytes, throw a RangeError exception."

In V8, a RAB acts basically like mmap. If you can't reserve the maxByteLength amount of virtual address space, it'll throw. So on 32bit, it's guaranteed to throw. On 64bit, I don't know; that's still a lot to ask for, so it probably also throws, but I suppose it can succeed.

syg avatar Apr 22 '25 22:04 syg

Specifically, why does pure-JS usage of RABs already require knowledge of the host architecture?

tlively avatar Apr 22 '25 22:04 tlively

Specifically, why does pure-JS usage of RABs already require knowledge of the host architecture?

Ah, because of the recommendation that you use the smallest max size that works for you. That's a very different number if you want to run on both 32bit and 64bit architectures, vs just 64bit architectures.

The interpretation of max is different for RABs than WebAssembly.Memory. RABs treat it like "I want some assurances that I can grow to this size later". So if it's a request that is unsatisfiable, it throws. WebAssembly.Memory seems to treat it more like "it's just a hint, do your best".

syg avatar Apr 22 '25 22:04 syg

Added a 5th option above as suggested by @lukewagner.

syg avatar Apr 22 '25 22:04 syg

Given that whether the architecture is 32-bit or 64-bit is pretty easy to suss out using these memory allocation and growth APIs anyway, I wouldn't mind the non-determinism of having different behavior on 32-bit and 64-bit systems here, especially if that allows us to do something more useful and scalable on 64-bit systems.

(1b), (2b), or (5) therefore seem nicest to me if we can't have (4).

tlively avatar Apr 22 '25 23:04 tlively

WebAssembly.Memory's maximum is not inspectable by user code; ArrayBuffers' maximum is inspectable

With the js-types proposal a memoryObj.type() property would be added that exposes the original source maximum value for a memory. I believe the proposal is implemented in V8 and SM, it's just not phase 4 yet.

That does report the value in wasm pages, and with the memory64 proposal the page value is a bigint for consistency with other wasm i64 values. There was quite a bit of discussion on this, but I can't find it because the memory64 repo was archived.

@syg Backing up just a little bit here, what is the point of the .maxByteLength field here? Are users actually expected to check it and do things with it? Or was it just added because it was easy to do so?

It seems a little bit of a long shot, but if we could just throw or not have that accessor for wasm memories represented as a RAB/GSAB, that would be a nice solution. If users really want to know the maximum value, they could go to wasm for that value. In practice they probably know it already because they created the module.

eqrion avatar Apr 24 '25 17:04 eqrion

@syg Backing up just a little bit here, what is the point of the .maxByteLength field here? Are users actually expected to check it and do things with it? Or was it just added because it was easy to do so?

Good question. I don't think the possibility of leaving out that property was discussed in depth. It was easy to do, and nobody brought up any concerns.

It seems a little bit of a long shot, but if we could just throw or not have that accessor for wasm memories represented as a RAB/GSAB, that would be a nice solution. If users really want to know the maximum value, they could go to wasm for that value. In practice they probably know it already because they created the module.

Not having the property would be too much of an inconsistency IMO, and hard to do mechanically.

Having it always throw is possible and perhaps even defensible, given the other behavioral differences that Wasm memory ABs have (can't shrink, can't grow by non-page-multiples). It does seem like a weird wart that might bite people here and there, but I have no principled argument against it.

syg avatar Apr 24 '25 17:04 syg

There was quite a bit of discussion on this, but I can't find it because the memory64 repo was archived.

I would hope that archiving the repo didn't destroy any information. The bug search feature still seems to still work. Perhaps this is the discussion you are looking for : https://github.com/WebAssembly/memory64/issues/68 ?

sbc100 avatar Apr 24 '25 17:04 sbc100

There was quite a bit of discussion on this, but I can't find it because the memory64 repo was archived.

I would hope that archiving the repo didn't destroy any information. The bug search feature still seems to still work. Perhaps this is the discussion you are looking for : WebAssembly/memory64#68 ?

Thanks, that's it. Glad to see the discussions are not gone, I just wasn't seeing them when I clicked "issues".

One related issue here I think is memory64 max lengths. It's valid to construct 64-bit wasm memories with a max of up to 2^48 pages (i.e. 2^64 bytes).

js> m = new WebAssembly.Memory({address: 'i64', initial: 0n, maximum: 281474976710656n});     
({})
js> m.type()
({maximum:281474976710656n, minimum:0n, address:"i64", shared:false})

This runs into the same size_t issue even on 64-bit systems. It also runs into the issue of Number.MAX_SAFE_INTEGER, which would make the value you want to clamp to probably much lower than 2^48-1 pages, or return a bigint.
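The Number.MAX_SAFE_INTEGER problem in arithmetic form (BigInt keeps the byte count exact):

```javascript
const maxPages64 = 2n ** 48n; // memory64's largest valid max page count
const maxBytes64 = maxPages64 * 65536n; // 2^48 pages * 2^16 bytes/page = 2^64 bytes

// 2^64 is far beyond the largest exactly-representable Number integer.
console.log(maxBytes64 > BigInt(Number.MAX_SAFE_INTEGER)); // true

// The exact-integer boundary for doubles:
console.log(Number.MAX_SAFE_INTEGER === 2 ** 53 - 1); // true
```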

I don't think option 4 is viable then, because we'd need to change our maxByteLength internal fields to be u128, not u64, to accommodate memory64's largest maximum.

And if we're going down the route of clamping, the implementation defined behavior is going to be architecture specific on both 64-bit and 32-bit.

I think option 5 could work and has a reasonable interpretation, but I'm concerned with exposing that internal detail and if there are any cross-engine compatibility concerns with it. I'd need to review how we're actually computing this 'internal clamped maximum'.

If removing the maxByteLength property is not viable, I wonder if we could just have wasm memories return Infinity instead and then go back and allow wasm memories without maximums. Or throw on access, but that feels gross. It feels like shoehorning the wasm concept of memory maximum into the JS one does not work. If we need a rationalization for the behavior, we could say that some host resizable array buffers don't have a maximum.

eqrion avatar Apr 24 '25 22:04 eqrion

I like the Infinity idea!

tlively avatar Apr 25 '25 00:04 tlively

The Infinity proposal + going back to allowing wasm memories without maximums is very interesting and not something I'd considered.

I need to think about it more but I like how it threads the needle for both the representation issue and capturing the intuition that the max is treated like a true hint for Wasm memories.

An open question for me for that proposal is that upthread, I learned that there's a proposal to expose a memoryObj.type() property. If maxByteLength is Infinity, but memoryObj.type() reports a finite length, that discrepancy seems pretty bad to me. Will that proposal change as well?

Edit: Added this option to the OP.

syg avatar Apr 25 '25 16:04 syg

An open question for me for that proposal is that upthread, I learned that there's a proposal to expose a memoryObj.type() property. If maxByteLength is Infinity, but memoryObj.type() reports a finite length, that discrepancy seems pretty bad to me. Will that proposal change as well?

Yeah that would be sort-of odd, but I think we definitely need a maximum from memoryObj.type() because that is 'type reflection' on what the user literally wrote in their program and the whole point of the proposal. I guess that could also apply to the maxByteLength property here, but the value representation issues with memory64 and memory32 on 32-bit archs are hard to work around.

We can represent the wasm memory type accurately in our API because we can design it for it, but we can't fit it in maxByteLength without clamping it, returning a special value (infinity), throwing on access, or throwing on conversion to RAB. I'm not really convinced there's much value in having a maxByteLength accessor here and it seems to be giving us trouble, so that's why I lean towards working around it with a special value or throw.

But also, Infinity is a hack and I'm not convinced it's the right way to go.

eqrion avatar Apr 25 '25 19:04 eqrion

I've thought about all the options again and don't really see a better solution than the Infinity hack, even with the discrepancy with the upcoming js-types proposal.

Throwing in the getter for Wasm memories is worse than Infinity because it breaks code that checks maxByteLength unnecessarily.

syg avatar May 06 '25 23:05 syg

I think a simpler solution has been in plain sight all along: recompute maxByteLength from the max pages of the corresponding WebAssembly.Memory instance on access. In the web embedding, the maximum value never exceeds MAX_SAFE_INTEGER, so we can keep returning a JS Number value.

The representation problem is really about keeping the value in the internal field. For Wasm memories in V8, that internal field is the engine-determined max that's <= the max passed in to the constructor. This can be kept as-is, and the user-exposed maxByteLength can just lie and give a different number if it's an ArrayBuffer corresponding to a Wasm memory. In V8 this logic is confined to just the implementation of the getter, and is trivial to implement.

@eqrion WDYT?

syg avatar May 08 '25 22:05 syg

I think a simpler solution has been in plain sight all along: recompute maxByteLength from the max pages of the corresponding WebAssembly.Memory instance on access. In the web embedding, the maximum value never exceeds MAX_SAFE_INTEGER, so we can keep returning a JS Number value.

Is that true with wasm memory64? As I mentioned above, with a 64-bit wasm memory you are allowed to ask for a maximum of 2^48 pages, which is 2^64 bytes, which goes above MAX_SAFE_INTEGER.

The representation problem is really about keeping the value in the internal field. For Wasm memories in V8, that internal field is the engine-determined max that's <= the max passed in to the constructor. This can be kept as-is, and the user-exposed maxByteLength can just lie and give a different number if it's an ArrayBuffer corresponding to a Wasm memory. In V8 this logic is confined to just the implementation of the getter, and is trivial to implement.

@eqrion WDYT?

I agree with this, and think it's basically the same in SpiderMonkey. I would need to audit some of the usages of the internal field but I think generally they're fine. The MAX_SAFE_INTEGER issue seems to be the biggest problem.

eqrion avatar May 09 '25 19:05 eqrion

Is that true with wasm memory64? As I mentioned above, with a 64-bit wasm memory you are allowed to ask for a maximum of 2^48 pages, which is 2^64 bytes, which goes above MAX_SAFE_INTEGER.

You're right for the core spec, but the JS embedding has this arbitrary limit of 262,144 pages, so 16GiB, or a 2^34 byte length.

syg avatar May 09 '25 19:05 syg

Ah, I think we might have a spec ambiguity here. Ben linked me to this issue: #1863 he filed a while ago. We read that line as part of a section on runtime limits that can't be exceeded. So you could specify a maximum greater than that in the memory/table types, but if you ever tried to grow beyond the web impl limit you would deterministically trap.

So in Firefox, this line m = new WebAssembly.Memory({address: 'i64', initial: 0n, maximum: 281474976710656n}); succeeds, while Chrome throws an error.

This could work in our favor if we resolved the ambiguity in Chrome's favor and disallow any specified maximum above a certain limit. We would need to be comfortable never allowing a maximum that could exceed MAX_SAFE_INTEGER, or having a fallback plan for that.

This is also related to the issue of bigint vs number for i64 values. In the memory64 proposal we agreed to always represent 64-bit memory indices using bigints. This was motivated primarily for consistency with i64 values in ToWebAssemblyValue/ToJSValue which only work with bigint. So the memory constructor for i64 requires a bigint maximum, and the type reflection API returns using bigint. Having maxByteLength be a number is a slight inconsistency here, but maybe it's just the best we can do.

eqrion avatar May 09 '25 21:05 eqrion

#1863 is interesting. Indeed, the distinction between syntactic limits and runtime limits is a good summary of the divergence.

My preference is for all the JS embedding limits to be syntactic limits, not runtime limits, but I am open to what others think.

But even in the case of interpreting them as runtime limits, the maxByteLength of memory64 can be Infinity where it exceeds MAX_SAFE_INTEGER.

I think the alternative of having maxByteLength return BigInts for ABs of memory64s is riskier, as Numbers and BigInts in general do not mix well (by design), so having some ABs returning a different value type can lead to surprises.

syg avatar May 12 '25 21:05 syg

I don't think maxByteLength is important enough to resolve #1863 one way or another.

My revised proposal is:

  • "Lie" in the maxByteLength getter
  • For memory64, overflow to Infinity if the requested max exceeds MAX_SAFE_INTEGER

Regardless of which way we come down on the question of syntactic or runtime limits, ArrayBuffers on the JS side won't be retrofitted to sometimes return BigInts for their indices and length values and so forth. Given that, the simplest way forward seems to be to combine with @eqrion's Infinity idea. @eqrion WDYT?
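The revised proposal can be sketched as follows (illustrative names only, not engine internals; `wasmMaxPages` stands in for the max page count the buffer's WebAssembly.Memory was constructed with):

```javascript
// Recompute maxByteLength from the Wasm max page count on each access,
// overflowing to Infinity past MAX_SAFE_INTEGER for large memory64 maxima.
function reportedMaxByteLength(wasmMaxPages /* BigInt */) {
  const bytes = wasmMaxPages * 65536n;
  return bytes <= BigInt(Number.MAX_SAFE_INTEGER) ? Number(bytes) : Infinity;
}

console.log(reportedMaxByteLength(65536n)); // 4294967296 (2^32, exact as a double)
console.log(reportedMaxByteLength(2n ** 48n)); // Infinity (2^64 bytes overflows)
```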

syg avatar May 12 '25 21:05 syg

Step 1, of '"Lie" in the maxByteLength getter' makes sense to me.

As for the memory64 issue, what if as a short term stop gap to #1863 we agreed to a guaranteed syntactic limit for memory max of at most MAX_SAFE_INTEGER bytes? You could continue having the more restrictive syntactic limit of 16GiB while we debate that. But a conservative 2^53 limit in the meantime (which SM would adopt) would avoid any clamping issues and be realistically good enough for anyone.

eqrion avatar May 14 '25 22:05 eqrion

As for the memory64 issue, what if as a short term stop gap to https://github.com/WebAssembly/spec/issues/1863 we agreed to a guaranteed syntactic limit for memory max of at most MAX_SAFE_INTEGER bytes? You could continue having the more restrictive syntactic limit of 16GiB while we debate that. But a conservative 2^53 limit in the meantime (which SM would adopt) would avoid any clamping issues and be realistically good enough for anyone.

Sure, that sgtm.

To be exact, you mean that in the JS embedding, creating a WebAssembly.Memory with a max page count >= 2^37 (since 2^37 pages * 2^16 bytes/page = 2^53 bytes, which exceeds MAX_SAFE_INTEGER = 2^53 - 1) always throws.
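Checking that page-count bound, i.e. that 2^37 is the first page count whose byte size exceeds MAX_SAFE_INTEGER:

```javascript
const PAGE_SIZE = 2 ** 16; // bytes per Wasm page

// 2^37 pages is exactly 2^53 bytes, one past MAX_SAFE_INTEGER (2^53 - 1),
// so it is the smallest page count that must throw; 2^37 - 1 pages still fits.
console.log(2 ** 37 * PAGE_SIZE === 2 ** 53); // true
console.log((2 ** 37 - 1) * PAGE_SIZE <= Number.MAX_SAFE_INTEGER); // true
```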

syg avatar May 15 '25 18:05 syg

Yes, we'd disallow that in the WebAssembly.Memory constructor, and also when compiling a wasm module with a memory type (imported or defined) with too large of a max. Also probably in the WebAssembly.validate method as well for consistency.

eqrion avatar May 15 '25 21:05 eqrion

As for the memory64 issue, what if as a short term stop gap to https://github.com/WebAssembly/spec/issues/1863 we agreed to a guaranteed syntactic limit for memory max of at most MAX_SAFE_INTEGER bytes?

Cool, sounds like a plan to me.

To recap, what we've converged on is as follows.

  • Only Wasm memories with a max page count passed in during construction can be made into a resizable buffer
  • The .maxByteLength getter on ArrayBuffers that are actually Wasm memories will report the max pages passed in the Wasm memory constructor multiplied by the page size, as a double value, regardless of the internal engine limit. This slows and complicates the implementation of the .maxByteLength getter, but that extra cost is inconsequential.
  • In the JS embedding, make MAX_SAFE_INTEGER the validation-time max byte size limit for Wasm memories
  • Punt on the general question of syntactic-vs-runtime interpretation of limits

syg avatar May 20 '25 16:05 syg

@syg That all sounds good to me.

eqrion avatar May 28 '25 19:05 eqrion

It sounds like the next step here is to make a PR updating the JS API spec. Has anyone started working on this? If not, does anyone want to volunteer? :)

tlively avatar Jun 26 '25 22:06 tlively

I've not started working on this. I could take a look at it, but it'd probably be a week or two before I have time for it.

eqrion avatar Jun 27 '25 12:06 eqrion

Thanks, @eqrion, that would be great.

tlively avatar Jun 27 '25 15:06 tlively

The PR for this was merged. This can be closed now.

eqrion avatar Oct 02 '25 16:10 eqrion