RFC: ArrayBuffer support in TurboModules

Open paradowstack opened this issue 2 months ago • 12 comments

Proposal: Adding first-class ArrayBuffer support to Codegen and TurboModules to enable zero-copy binary data exchange between JavaScript and Native modules.

View the rendered RFC

paradowstack avatar Oct 10 '25 10:10 paradowstack

Firstly - congrats on a very thorough and well written RFC!

do we need to implement any additional synchronization mechanism to make this change thread-safe?

It's my understanding that the JS engine and JSI themselves aren't thread-safe either. In which case, I'd consider the thread-safety limitation of ArrayBuffer an extension of that.

Should we introduce two kinds of buffers, mutable and read-only

I think that could be a nice addition indeed. I'd actually imagine a read-only variant would be used in most cases 🤔

Should this RFC focus on introducing only the basic and most valuable synchronous support for ArrayBuffer

Personally - I'd apply the 80-20 rule here and go for the least amount of work bringing most of the value and stick with the basic sync support.


Did you consider "views"? (DataView and typed arrays) and how those would interact with this feature? Could these be passed between JS and Native, if not - what's the failure case like? And should typed arrays be supported in codegen?

kraenhansen avatar Oct 12 '25 19:10 kraenhansen

i really like this. for real-time media or GPU pipelines, say camera ---> ML ---> WebRTC, efficient ArrayBuffer bridging would make a world of difference. it'd enable moving small binary payloads (LUTs, masks, uniform buffers) without having to serialize or clone data just to cross the bridge.

a few things worth clarifying though:

  • thread safety: how do concurrent module calls handle shared buffers? media pipelines often run multiple workers in parallel, so locking semantics matter.

  • read-only vs writable: some data (e.g. frame masks or GPU uploads) should likely be immutable once passed to JS; being explicit here could save a lot of edge-case debugging imo.

gmemmy avatar Oct 13 '25 18:10 gmemmy

Firstly - congrats on a very thorough and well written RFC!

Thanks ❤️

do we need to implement any additional synchronization mechanism to make this change thread-safe?

It's my understanding that the JS engine and JSI themselves aren't thread-safe either. In which case, I'd consider the thread-safety limitation of ArrayBuffer an extension of that.

Agree with that.

Should we introduce two kinds of buffers, mutable and read-only

I think that could be a nice addition indeed. I'd actually imagine a read-only variant would be used in most cases 🤔

I agree with that, but after deeper investigation I couldn't find an easy and clean way to achieve that. One way would be to create a read-only view over the buffer when processing it, but that's on the developers. Also there is an active TC39 proposal for an Immutable ArrayBuffer which would provide a standardized, runtime-enforced way to prevent modifications to the buffer contents.
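To illustrate the "read-only view" idea, here is a minimal sketch of what a developer-side wrapper could look like. The names `ReadOnlyBytes` and `asReadOnly` are hypothetical and not part of JSI, Codegen or the RFC. Note that it only hides the write path from whoever receives the wrapper; anyone still holding the original ArrayBuffer can keep writing to it.

```typescript
// Hypothetical read-only facade over an ArrayBuffer (illustration only).
interface ReadOnlyBytes {
  readonly byteLength: number;
  at(index: number): number;
  copy(): Uint8Array;
}

function asReadOnly(buffer: ArrayBuffer): ReadOnlyBytes {
  // The mutable view lives only inside this closure and is never exposed.
  const view = new Uint8Array(buffer);
  return Object.freeze({
    get byteLength(): number {
      return view.byteLength;
    },
    at(index: number): number {
      if (!Number.isInteger(index) || index < 0 || index >= view.byteLength) {
        throw new RangeError("index " + index + " is out of bounds");
      }
      return view[index];
    },
    // Returns a fresh copy, so writes to the result never reach the source.
    copy(): Uint8Array {
      return view.slice();
    },
  });
}

const sourceBuffer = new ArrayBuffer(4);
new Uint8Array(sourceBuffer).set([1, 2, 3, 4]);
const readOnly = asReadOnly(sourceBuffer);
console.log(readOnly.at(2), readOnly.byteLength);
```

This is why it ends up "on the developers": the protection is a convention enforced by the wrapper, not by the runtime.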

Should this RFC focus on introducing only the basic and most valuable synchronous support for ArrayBuffer

Personally - I'd apply the 80-20 rule here and go for the least amount of work bringing most of the value and stick with the basic sync support.

👍

Did you consider "views"? (DataView and typed arrays) and how those would interact with this feature? Could these be passed between JS and Native, if not - what's the failure case like? And should typed arrays be supported in codegen?

My idea is to keep this solution type-agnostic, as the underlying native classes, such as NSMutableData, java.nio.ByteBuffer and jsi::ArrayBuffer, are opaque containers for raw, uninterpreted bytes. If a DataView or TypedArray is passed instead of an ArrayBuffer, the developer should receive an appropriate warning.
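A rough sketch of what such a check could look like on the JS side. `expectPlainArrayBuffer` is a hypothetical helper, not an existing Codegen or JSI API, and it throws where the real implementation might only warn. The reason not to silently unwrap `view.buffer` is that a view can cover just a slice of its backing buffer.

```typescript
// Hypothetical guard (illustration only): accept a plain ArrayBuffer,
// reject DataView / TypedArray with a message that names the view type.
function expectPlainArrayBuffer(value: unknown, argName: string): ArrayBuffer {
  if (value instanceof ArrayBuffer) {
    return value;
  }
  if (ArrayBuffer.isView(value)) {
    const kind = value.constructor.name;
    throw new TypeError(
      argName + ": expected ArrayBuffer but received " + kind +
        "; the view covers bytes " + value.byteOffset + ".." +
        (value.byteOffset + value.byteLength) + " of a " +
        value.buffer.byteLength + "-byte buffer"
    );
  }
  throw new TypeError(argName + ": expected ArrayBuffer");
}

// A view over the middle of a buffer: unwrapping `.buffer` would expose
// 16 bytes even though the caller only meant to pass 8.
const backingBuffer = new ArrayBuffer(16);
const middleBytes = new Uint8Array(backingBuffer, 4, 8);
console.log(middleBytes.byteLength, middleBytes.buffer.byteLength);
```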

paradowstack avatar Oct 14 '25 11:10 paradowstack

The semantics around memory ownership seem to deviate from the spec of ArrayBuffer in regards to transferring/detaching. I'm not sure of what the material consequences are, especially with existing code that handles ArrayBuffers, but it seems like this could break developer expectations in many ways.

I'd expect the buffer to be moved, not borrowed, when passing between JS and native (in both directions).

ArrayBuffer implementations are not thread-safe; if multiple threads simultaneously read from or write to an ArrayBuffer, race conditions can occur. To prevent this, developers must ensure that an ArrayBuffer is not accessed concurrently from different threads

This problem goes away if ownership is moved to the receiving thread. You shouldn't be able to even read an ArrayBuffer from multiple threads.

  1. Thread-safety of the ArrayBuffer - do we need to implement any additional synchronization mechanism to make this change thread-safe?

So in summary and to answer this unresolved question, I would say absolutely yes. And to do so by using moves and not borrows.

tom-sherman avatar Oct 14 '25 12:10 tom-sherman

Thanks @tom-sherman for your input!

Regarding this:

ArrayBuffer implementations are not thread-safe; if multiple threads simultaneously read from or write to an ArrayBuffer, race conditions can occur. To prevent this, developers must ensure that an ArrayBuffer is not accessed concurrently from different threads

This problem goes away if ownership is moved to the receiving thread. You shouldn't be able to even read an ArrayBuffer from multiple threads.

  1. Thread-safety of the ArrayBuffer - do we need to implement any additional synchronization mechanism to make this change thread-safe?

So in summary and to answer this unresolved question, I would say absolutely yes. And to do so by using moves and not borrows.

I agree that moving (transferring ownership of) an ArrayBuffer could be fundamentally safer and cleaner than borrowing it. However, the primary technical challenge remains: the current JSI and Hermes Runtime implementations do not expose a dedicated API for "detaching" an ArrayBuffer from the JavaScript side. Without true detachment, the only immediate way to transfer ownership is by "moving" the underlying buffer to the native thread and extending its lifetime accordingly. This addresses the memory management aspect but has a critical flaw:

  • JS Validity: The ArrayBuffer remains valid on the JS side. Its properties, such as byteLength, are not cleared.
  • Thread Safety Risk: The buffer can still be read or written to simultaneously from the JS thread while it is being used natively. This creates an easy opportunity for developers to violate thread-safety rules, even if documentation warns against post-transfer access.

I currently do not see a clean, safe path to fully implement buffer transfers that invalidate the JS reference. Achieving this requires dedicated changes to both the JSI specification and the underlying Hermes engine to introduce a proper detachment mechanism. Since I don't have deep expertise in this topic, input from more experienced developers is really welcome and highly appreciated.

paradowstack avatar Oct 15 '25 11:10 paradowstack

I don't have any expertise as to how to solve the invalidation of JS references in Hermes and JSI, but I wanted to add another voice highlighting the importance of ownership transfer. As far as I am aware, JS does not have other instances where a developer needs to think about thread-safety - it is always thread safe by default. When working with threads (e.g. a Worker in node or a WebWorker in the browser), references are either copied or transferred (e.g. zero-copy, but the reference is removed in the source thread). A JS developer needing to be aware that they cannot modify an ArrayBuffer while it is being written in the native thread is a big ask, especially as we are likely talking about developers who are consuming native modules which use this code, and may not be familiar with thread-safety as a concept at all. This might be a really hard problem to solve though as @paradowstack says!
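For anyone less familiar with transfer semantics, here is a small sketch of what they look like in a standard JS environment, assuming a runtime that provides `structuredClone` with the `transfer` option (recent browsers and Node). `transferBuffer` is a hypothetical helper name, and nothing here implies Hermes or JSI expose this today.

```typescript
// Typed locally so the sketch does not depend on which TS lib files are enabled.
type CloneFn = <T>(value: T, options?: { transfer?: ArrayBuffer[] }) => T;
const cloneWithTransfer = (
  globalThis as unknown as { structuredClone: CloneFn }
).structuredClone;

// Hypothetical helper: moves the contents into a new ArrayBuffer and
// detaches the source, which is the behaviour a JS developer would
// expect from "transfer".
function transferBuffer(source: ArrayBuffer): ArrayBuffer {
  return cloneWithTransfer(source, { transfer: [source] });
}

const originalBuffer = new ArrayBuffer(8);
new Uint8Array(originalBuffer)[0] = 42;
const movedBuffer = transferBuffer(originalBuffer);

// The source is now detached: its byteLength reads 0 and it can no longer
// be used to reach the bytes, so only one side can touch the data.
console.log(originalBuffer.byteLength, movedBuffer.byteLength);
```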

gmaclennan avatar Oct 16 '25 15:10 gmaclennan

The semantics around memory ownership seem to deviate from the spec of ArrayBuffer in regards to transferring/detaching. I'm not sure of what the material consequences are, especially with existing code that handles ArrayBuffers, but it seems like this could break developer expectations in many ways.

That said, it's worth noting that (unless I'm mistaken) this is already how JSI, Expo Modules, and Nitro Modules handle ArrayBuffers today.

On one hand, aligning with existing community behavior might make sense for practical and compatibility reasons. On the other hand, once this behavior becomes part of the core, its reach and visibility will likely expand far beyond those ecosystems, making the current de facto behavior less relevant over time.

Leaving this as an open question and summoning a few folks from the community for feedback!

grabbou avatar Oct 17 '25 14:10 grabbou

Thanks for putting this together.

Generally, I'm very aligned with supporting a type-safe abstraction over the existing ArrayBuffer support in JSI, and there seem to be plenty of use-cases where this would become a good way forward to unlock cheaper data sharing between JS and native.

I agree with the concerns around thread-safety expressed in this thread here. Is there any prior art we can reference? Is the operation model similar to SharedArrayBuffer? Should we consider Atomics as a complementary but necessary capability here?

Making this fully support async JS to native invocation calls will likely increase complexity, as it would require us to keep the JS object alive for the duration of the native memory reference, but not impossible. Alternatively, we'd need to make it really obvious that ArrayBuffer args can only be used in sync calls through codegen.
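To make the sync vs. async difference concrete, here is a sketch in plain TypeScript with hypothetical stand-ins for native entry points (`checksumSync`, `checksumAsync`). The async variant takes a snapshot copy, which is only one possible way to stay safe and gives up the zero-copy benefit; keeping the JS object alive instead is the alternative mentioned above.

```typescript
// Hypothetical stand-in for a synchronous native call: the buffer is only
// read while the JS caller is blocked, so borrowing it is safe.
function checksumSync(borrowed: ArrayBuffer): number {
  const bytes = new Uint8Array(borrowed);
  let sum = 0;
  for (let i = 0; i < bytes.length; i++) {
    sum = (sum + bytes[i]) & 0xff;
  }
  return sum;
}

// Hypothetical stand-in for an asynchronous native call: the work happens
// after control returns to JS, so it reads a private snapshot rather than
// the caller's buffer.
function checksumAsync(buffer: ArrayBuffer): Promise<number> {
  const snapshot = buffer.slice(0); // copy taken before returning to the caller
  return Promise.resolve().then(() => checksumSync(snapshot));
}

const payloadBuffer = new ArrayBuffer(2);
const payloadBytes = new Uint8Array(payloadBuffer);
payloadBytes[0] = 1;
payloadBytes[1] = 2;

const pendingChecksum = checksumAsync(payloadBuffer);
payloadBytes[0] = 100; // mutation after the call does not reach the snapshot
pendingChecksum.then((value) => console.log(value));
```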

javache avatar Oct 20 '25 11:10 javache

I agree with the concerns around thread-safety expressed in this thread here. Is there any prior art we can reference?

AFAIK other implementations (such as Nitro Modules or Expo Modules) have a similar approach - ownership is not transferred from JS to Native, but borrowed. From what I see, thread safety is either manual or ensured by copying the content (e.g. Expo Blob) - I am not aware of any other solution to this problem - happy to be corrected.

Is the operation model similar to SharedArrayBuffer? Should we consider Atomics as a complementary but necessary capability here?

The proposed operational model isn't similar to SharedArrayBuffer, I would say - there is no synchronisation. It's more of a single-ownership model where buffers are either borrowed (JS→Native, sync only) or transferred (Native→JS). Can we somehow use JSI to provide a mechanism for safe concurrent access similar to what these primitives offer?
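For comparison, this is roughly what the SharedArrayBuffer + Atomics model gives JS code, sketched on a single thread for brevity. `tryAcquire` and `release` are hypothetical helper names; the `Atomics` calls are the standard ones. Nothing equivalent exists in the proposed borrow/transfer model.

```typescript
// A shared buffer viewed as 32-bit slots; slot 0 acts as a lock word
// (0 = free, 1 = held), slot 1 as a counter.
const sharedMemory = new SharedArrayBuffer(8);
const sharedSlots = new Int32Array(sharedMemory);

// Hypothetical helper: atomically take the lock if it is free.
function tryAcquire(slots: Int32Array, index: number): boolean {
  // compareExchange returns the previous value; 0 means we won the lock.
  return Atomics.compareExchange(slots, index, 0, 1) === 0;
}

// Hypothetical helper: release the lock.
function release(slots: Int32Array, index: number): void {
  Atomics.store(slots, index, 0);
}

if (tryAcquire(sharedSlots, 0)) {
  Atomics.add(sharedSlots, 1, 1); // atomic read-modify-write on the counter
  release(sharedSlots, 0);
}
console.log(Atomics.load(sharedSlots, 1));
```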

Making this fully support async JS to native invocation calls will likely increase complexity, as it would require us to keep the JS object alive for the duration of the native memory reference, but not impossible. Alternatively, we'd need to make it really obvious that ArrayBuffer args can only be used in sync calls through codegen.

Yes, I can see we can do it either way. Would it also be an applicable solution to the "borrowed" vs "moved" discussion? Instead, it could be "shared", by keeping the JS object alive. It would not eliminate the thread-safety problems, but perhaps could be another solution to the problem.

paradowstack avatar Oct 22 '25 12:10 paradowstack

Thank you all for the feedback and discussion on this RFC! I've updated the proposal.

There are two open questions and fundamental design decisions that need input:

  1. Transfer vs. Borrowing Semantics: The current proposal uses borrowing semantics for JS→Native due to JSI/Hermes API limitations. However, as pointed out, this deviates from web standards and places an unusual thread-safety burden on developers. Transfer semantics would solve this but, from my understanding, require new JSI/Hermes APIs. Should we proceed with borrowing as a pragmatic interim solution? Do you see any other feasible solutions to the problem?
  2. Thread-Safety Model: Related to the above, we need to decide whether to require manual developer coordination or enforce thread-safety through ownership transfer or some synchronisation mechanism.

I'd appreciate guidance from the core team and community on how to proceed with these questions!

paradowstack avatar Oct 22 '25 12:10 paradowstack

@javache did you maybe have time to take another look at this RFC and the open questions? 😊

paradowstack avatar Nov 14 '25 12:11 paradowstack

Here are my general thoughts on the matter, and unfortunately I see a couple of fundamental problems (sorry, I should have joined here earlier):

  1. ArrayBuffers are simply not very suitable for passing to asynchronous native code because they can be detached or resized. AFAICT, NodeJS doesn't have any asynchronous APIs that operate on ArrayBuffer precisely for that reason.
  2. ArrayBuffers are also not suitable because technically there is nothing preventing the engine from allocating the buffer contents in the GC heap, thus making it movable.
  3. Using "transfer" semantics when calling native code would be unusual - after all we are not sending the array to another process or runtime. Again, NodeJS doesn't do that. Designing APIs around unusual techniques is questionable.
  4. The threading issue is not a concern. Usually it is not the API's job to protect a buffer from simultaneous access.

Problem 1 has partial workarounds - you can check whether the buffer is resizable and reject it, but you can't prevent it from becoming asynchronously detached. Problem 2 is basically unsolvable.
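A sketch of the partial workaround for problem 1, assuming an engine that exposes the ES2024 `resizable` and `detached` properties on ArrayBuffer. `checkBufferForNativeUse` is a hypothetical name. The check is point-in-time only: it says nothing about what happens to the buffer after it returns.

```typescript
// Structural type so the sketch compiles whether or not the ES2024
// typings for `resizable` / `detached` are enabled.
interface BufferLike {
  readonly byteLength: number;
  readonly resizable?: boolean;
  readonly detached?: boolean;
}

type BufferCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical validation step run before handing a buffer to native code.
function checkBufferForNativeUse(buffer: BufferLike): BufferCheck {
  if (buffer.detached === true) {
    return { ok: false, reason: "buffer is already detached" };
  }
  if (buffer.resizable === true) {
    return { ok: false, reason: "resizable buffers are not accepted" };
  }
  // Passing here does not prevent the buffer from being detached later.
  return { ok: true };
}

console.log(checkBufferForNativeUse(new ArrayBuffer(8)).ok);
```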

This is why NodeJS uses Buffer - it can fully control it.

My recommendation: give up on ArrayBuffer. It simply can never work well because of problem 2. It will always be somewhat of a hack. Use Buffer instead, which is well understood by everybody, compatible with NodeJS, etc, etc.

If you folks are dead-set on ArrayBuffer, I think @mrousavy's proposal for getMutableBuffer() is a solution. If the ArrayBuffer was created by native code as a MutableBuffer, then obtaining a shared_ptr to it gives sufficient guarantees, I think. If it wasn't, it should be rejected at runtime by the API. Personally I don't think that would be a great API, but we are going to add getMutableBuffer() anyway.

tmikov avatar Nov 15 '25 05:11 tmikov