hermes Expose a way to create a JSI HernesRuntime instance from an existing Hermes VM instance

Summary

I understand this change might be a bit contentious, but please hear me out.

Our Hermes integration does not use the JSI API, we integrate with the Hermes Runtime C++ API directly. There are a few reasons for this:

We already have an API that is similar to JSI for which we have 4 different JS engines integrated with it. If we were integrating Hermes through JSI we would be implementing an abstraction on top of an abstraction, which adds performance and memory overhead. We want our Hermes integration to be as fast as it can be, and for that we need our bridge layer to have minimal overhead so that it can compete on a performance level against our other JS engine integrations that are also based on lower level JS engine primitives.
There are APIs we need to integrate that can't be implemented with JSI.
We don't have C++ exceptions enabled, and the JSI Hermes integration makes heavy use of them and forces its consumers to handle errors through exceptions.

The Hermes Debugger implementation is however based on the JSI API, so in order for us to leverage the Hermes Debugger, this change adds a small shim that can create a JSI Runtime instance from an existing Hermes Runtime C++, so that we can pass it to the Debugger. This makes it possible for us to keep the rest of our integration intact, and just call hermes::adoptHermesRuntime() on debug builds to create a debugger instance. Through that system, the linker is able to strip out all the JSI code for us on release builds.

Test Plan

Jun 05 '24 23:06 rFlex

Hey @rFlex, thanks for putting this up and sharing that context. As you probably know, the Hermes internal APIs are unstable and unsafe, so we deliberately do not expose them. That said, I understand the issues you're describing, and that this solves a legitimate problem for you.

In terms of this PR, I think it would be preferable to expose something like vm::Runtime *getVMRuntimeUnsafe() on HermesRuntime, rather than allowing you to construct a HermesRuntime from an existing vm::Runtime. Consolidating the ownership of the runtime in HermesRuntime makes it easier to manage the lifetimes correctly here. In particular, we want to avoid a situation where HostFunctions and HostObjects retaining jsi::Values remain alive through the vm::Runtime after the corresponding HermesRuntime has been freed. Given that the lifetimes of these are managed by the GC, the safest thing to do is to ensure that the vm::Runtime is itself owned by the HermesRuntime, and therefore cannot outlive it.

I would like to hear more on your concerns though. Getting feedback from engaged Hermes integrators is extremely useful for us.

Have you benchmarked the relative performance of JSI against the internal APIs? It would be interesting to hear if there are scenarios where you have found the JSI overhead unacceptable, and if there are specific operations you found responsible. It would also be useful to know when you benchmarked (since Hermes' JSI performance has improved considerably in the last two years).
What things have you tried implementing that weren't possible in JSI? We're open to improving and extending JSI to support more use cases.
This is understandable. We are working on adding a stable C-based API to Hermes that is designed to serve as an intermediate layer between Hermes and JSI. The current version is in API/hermes_abi. This would address your concerns about exceptions, but is intended to be close to JSI in terms of performance and functionality.

Jun 07 '24 00:06 neildhar

Hey @neildhar ,

The reason we had it as adoptHermesRuntime() instead of makeHermesRuntime() + getVMRuntimeUnsafe() is that it allowed us to create the Hermes runtime the same way on prod + debug build, it's just that on debug build we also used adoptHermesRuntime() to provide an instance to the debugger. We have libcxx rtti disabled for app size reasons, which JSI requires, so we kept rtti enabled for debug builds just for being able to use JSI + Hermes Debugger, and we keep rtti disabled for prod build. That means we don't have access to the JSI classes on prod builds for us today. We could however use makeHermesRuntime() + getVMRuntimeUnsafe() on debug builds and create the Hermes C++ runtime ourselves on prod builds, not the biggest deal in the world. Happy to make the change.

Answers to your questions:

There were no benchmarks, it is possible that the performance would have been fine. It's just that the set of potential problems we would face would make it costly to attempt to integrate with JSI. If the other two points were not a problem, we would have likely gone that route as it'd have been cheaper to do it that way, and move to the lower level API only if we found performance problems. Since there was some risks involved, we decided to integrate with the lower level API directly, which eliminates the possibility of the performance overhead of the abstraction being a problem entirely. The JSI API is not a zero overhead abstraction, and our JS API is not as well. This was a preemptive optimization that only made sense because it was cheaper to go with the optimized route (use the lower level API) rather than the unoptimized route (use the JSI API).
Some of the things that I can remember, and let me know if I missed that there are ways to do this with JSI:

Compiling JS into JS bytecode and get the output as bytes, at runtime as well as ahead of time without using the hermesc binary. We use this for ahead of time bytecode compilation, and we also have a bytecode cache at runtime where we can compile js into bytecode in the background and store it on disk. This reduces the app size cost of modules as Hermes bytecode is quite large. The proposed way by Hermes on how to compile and deal with bytecode limits our options here unless I'm missing something.
Enabling Hermes's rejection tracker system, which we have to resolve the HermesInternal object and call enablePromiseRejectionTracker on it.
Manipulating strings efficiently (creating a string from UTF16 without converting it first to UTF8, avoid C++ std::string conversions) I am likely missing some things, the Hermes integration was done in Q4 last year.

Yes I did notice that the C API was in progress. It's likely we would have used it if was available at the time of the integration.

I think we are unlikely to be representative of a typical Hermes user.

Jun 07 '24 14:06 rFlex

Thanks for the detailed response! Regarding the the gaps you identified in JSI:

Compiling JS to bytecode is indeed not available in JSI. But you can compile with our internal APIs, or using the public (but more restricted) compileJS API, and feed the result into JSI through evaluateJavaScript. That is, using the internal compiler APIs is orthogonal to using JSI.
Accessing the properties on HermesInternal can be done in JavaScript, or through JSI by getting the HermesInternal property off the global object. What issues did you run into doing this?
String conversion is definitely a gap. We intend to support 16 bit characters in JSI, which will incur less overhead, but we don't intend to support direct access to the bytes in the Hermes heap (which would be very unsafe).

Jun 07 '24 23:06 neildhar

@neildhar great callouts, we could have used the compiler APIs for bytecode compilation independently of JSI. I thought that HermesInternal was a private symbol and not available to user code, but reading the code that seems completely wrong and I don't remember how I got that assumption.

Jun 10 '24 14:06 rFlex

@neildhar changed to expose underlying instance of from the JSI runtime instance

Jun 10 '24 14:06 rFlex

hey @neildhar , anything else I should update? could we get this merged?

Jul 26 '24 14:07 rFlex

@rFlex Just the title and the summary need to be updated, everything else looks good

Jul 26 '24 15:07 neildhar

@neildhar updated!

Aug 08 '24 13:08 rFlex

@neildhar can we get this merged? thanks!

Aug 13 '24 15:08 rFlex

@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Aug 13 '24 16:08 facebook-github-bot

@neildhar merged this pull request in facebook/hermes@880b1645b5dca974f4329dc4108692d301abee0d.

Aug 13 '24 22:08 facebook-github-bot

We already have an API that is similar to JSI for which we have 4 different JS engines integrated with it

Off-topic, but I'm curious what 4 engines you have abstracted, and why? 😄

Aug 15 '24 17:08 mrousavy