[discussion] integrating nan or similar with io.js

Open trevnorris opened this issue 9 years ago • 101 comments

nan has done wonders for the community, but there are still missing bits. If these can be addressed, there is strong interest in bringing it into io.js itself. This thread should serve as a discussion of what the missing bits are, what we can do about them, and how to eventually get this into io.js.

The most prominent issue is lack of ABI compatibility. Needing to recompile with every new release, especially now with more frequent V8 updates, is difficult. @bnoordhuis mentioned that using templates is not a good form of ABI compatibility because they don't play well with the linker. (Ben, if you have any additional thoughts on this please feel free to include anything I've missed).

trevnorris avatar May 13 '15 21:05 trevnorris

I'll start by noting that not using templates leads to very clumsy code, and since V8 makes heavy use of templates, this would require not directly exposing anything from V8. Otherwise the issue of ABI compatibility will reemerge.

kkoopa avatar May 13 '15 23:05 kkoopa

Some initial randomish thoughts:

  1. This is a separate effort to NAN and will be until it can completely replace NAN and we can deprecate it - even then we'll probably have overlap while people adjust
  2. Let's start simple and unambitious
  3. We're going to have to accept a world similar to NAN where the API/ABI will have to change over time to adapt to a changing upstream API but the benefit will be that you're dealing with a single API to code against
  4. People are going to want to reach around to V8 regardless of how sophisticated this is, we need to accept that and possibly make allowances for it, the problem with https://github.com/tjfontaine/node-addon-layer as I see it is that it goes too far in one leap and makes no allowances for anything but standard uses (the typical "framework problem"); this and item (2) above are the main reasons I've advocated for the NAN approach over the addon-layer approach in the past
  5. The continued instability from V8 is getting me down, they seem to have an endless appetite to screw with their API and therefore screw users of the C++ API, maybe it's the frustration with having to deal with permanent backward compatibility in JS to ES1 that forces them to take it out on the C++ API. @domenic says he's going to take this up with the V8 team but at this stage I'm much more willing to support efforts to detach Node from V8 as a direct dependency.
  6. Chakra introduces an interesting dynamic here and we may want to engage the Microsoft team on this effort given what they've had to go through to make their shim

rvagg avatar May 14 '15 04:05 rvagg

Three things I believe could be done without templates and be ABI compliant:

  1. Local<Value> with a few methods. Such as Is{Undefined,Null,True,etc.}(), BooleanValue(), NumberValue(), etc. The hurdle to overcome would be casting to other types, and supporting all those APIs.
  2. typedef void (*FunctionCallback)(const FunctionCallbackInfo<Value>& info) could be a normal function call. info[i] could be an array of values, and since they're all initially Local<Value>s the naive implementation would be supported. The problematic part would be all calls that return Local<Object>.
  3. Buffers are pretty straightforward. This should even be possible using our Local<Value> abstraction, and in fact would be most compatible that way because currently it requires a Local<Object> and in the future will require a Local<Uint8Array>. Possibly something like the following (warning, this is very C-ish):
struct node_buffer_s {
  char* data;
  size_t length;
};
typedef struct node_buffer_s node_buffer_t;
void FnCallback(const CallbackInfo* info) {
  node_buffer_t buf;
  Node::Buffer::GetData(info[0], &buf);
}
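
As a rough illustration of the three points above, here is a minimal sketch of a template-free value type with a few Is*/NumberValue-style accessors, a callback that receives a plain array of values, and a buffer lookup. Every name in it (`node_value_t`, `node_value_is_number`, `node_buffer_get_data`, `fn_callback`) is hypothetical, and the value is a stand-in struct rather than a real V8 handle.

```cpp
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for an engine value; a real shim would wrap a V8 handle.
enum node_value_kind { NODE_UNDEFINED, NODE_NULL, NODE_BOOLEAN, NODE_NUMBER, NODE_BUFFER };

struct node_value_s {
  node_value_kind kind;
  double number;   // valid when kind == NODE_NUMBER
  bool boolean;    // valid when kind == NODE_BOOLEAN
  char* data;      // valid when kind == NODE_BUFFER
  size_t length;   // valid when kind == NODE_BUFFER
};
typedef struct node_value_s node_value_t;

struct node_buffer_s {
  char* data;
  size_t length;
};
typedef struct node_buffer_s node_buffer_t;

extern "C" {

// Type predicates and conversions, with no templates involved.
int node_value_is_undefined(const node_value_t* v) { return v->kind == NODE_UNDEFINED; }
int node_value_is_number(const node_value_t* v) { return v->kind == NODE_NUMBER; }
double node_value_number(const node_value_t* v) {
  return v->kind == NODE_NUMBER ? v->number : 0.0;
}

// Returns 1 and fills *out if v is a buffer, 0 otherwise.
int node_buffer_get_data(const node_value_t* v, node_buffer_t* out) {
  assert(v != NULL && out != NULL);
  if (v->kind != NODE_BUFFER) return 0;
  out->data = v->data;
  out->length = v->length;
  return 1;
}

// Callback shape from point 2: the arguments arrive as a plain array of values.
// Returns the length of the buffer in args[0], or 0 if it is not a buffer.
size_t fn_callback(const node_value_t* args, int argc) {
  node_buffer_t buf;
  if (argc < 1 || !node_buffer_get_data(&args[0], &buf)) return 0;
  return buf.length;
}

}  // extern "C"
```

Note that this sketch exposes the struct layout directly, which keeps it short but would itself be an ABI commitment.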

trevnorris avatar May 14 '15 16:05 trevnorris

Would definitely want a code generator to generate the code for all the ensuing virtually identical wrapper functions with different types. It would be nice if one could get at the resulting C++ code after template processing, but before compilation.

Have a look at how Python 3 is doing it. Python is also dynamically typed, and everything is an object, but I have to say that writing addons for it is rather painful.

Also, I think the bar for a useful addon API should be set higher than trivial. If you can only do trivial things, you might as well just set up a socket or pipe of sorts and communicate between JavaScript and the real world through that.

kkoopa avatar May 14 '15 17:05 kkoopa

@kkoopa

Also, I think the bar for a useful addon API should be set higher than trivial.

I was just echoing @rvagg's sentiment of:

Let's start simple and unambitious

So to clarify I was simply saying we start small, experiment and see how well it works. If it works at all.

One important thing is that if we want to truly achieve ABI compatibility then the exposed API needs to be written in C. There have been more than a few people smarter than myself that have expressed that C++ doesn't do well at maintaining proper memory structures when compiled with different compilers, or even different versions of the same compiler.

trevnorris avatar May 14 '15 19:05 trevnorris

I agree that we have to start simple, implementation-wise. However, that initial implementation needs to scale up to offering a richer feature set.

It does not need to be written in C as such, it's just the external functions (and data) that need to be "C" exported. Without C++, how would the wrappers interact with v8?

kkoopa avatar May 14 '15 19:05 kkoopa

It does not need to be written in C as such, it's just the external functions (and data) that need to be "C" exported.

Agreed. I should have made my following statement more clear:

One important thing is that if we want to truly achieve ABI compatibility then the exposed API needs to be written in C.

So yes. I completely agree. It's just the user facing API that should be C.

trevnorris avatar May 14 '15 19:05 trevnorris

Something like this should do. https://github.com/martine/v8c/blob/v8c/src/v8c.cc

Reply to this email directly or view it on GitHub: https://github.com/iojs/nan/issues/349#issuecomment-102147935

kkoopa avatar May 14 '15 19:05 kkoopa

@kkoopa good stuff. we'd just have to extend it to support current APIs like Buffer.

trevnorris avatar May 14 '15 20:05 trevnorris

And new V8. Notice that this project has been abandoned for 6 years.

kkoopa avatar May 14 '15 20:05 kkoopa

One important thing is that if we want to truly achieve ABI compatibility then the exposed API needs to be written in C. There have been more than a few people smarter than myself that have expressed that C++ doesn't do well at maintaining proper memory structures when compiled with different compilers, or even different versions of the same compiler.

Structures/classes don't have to be a problem when you don't use features such as virtual methods, multiple inheritance, member function pointers, etc.

I suppose name mangling could be an issue for people that mix C++11 code with pre-C++11 code. You can work around it by building everything with -fabi-version=6 but that requires that you use a recent compiler.

bnoordhuis avatar May 14 '15 20:05 bnoordhuis

So, what we'd essentially want is to wrap v8's C++ exports in C exports to get a stable ABI and then have a (NAN-derived) header library on top of that which would restore the C++ nature of V8's API by mapping to the C exports.

Something like this:

extern "C" {
  V8StringUtf8 v8_string_utf8_new(const char* data, int length);
  V8Number v8_number_int32_new(int32_t val);
  V8Number v8_number_uint32_new(uint32_t val);
}
template<v8::String>
v8::Local<v8::String> NanNew(const char *data, int length = -1) {
  return v8_string_utf8_new(data, length);
}

template<v8::Int32>
v8::Local<v8::Int32> NanNew(int32_t val) {
  return v8_number_int32_new(val);
}

template<v8::Uint32>
v8::Local<v8::Uint32> NanNew(uint32_t val) {
  return v8_number_uint32_new(val);
}
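
To make the layering concrete without depending on V8, here is a self-contained sketch of the same idea: a small set of C exports over an opaque value type, plus a header-only C++ layer that restores a `New<T>(...)` style call on top of them. All names (`shim_value`, `shim_string_utf8_new`, `shimpp::New`, the `String` and `Int32` tag types) are hypothetical, the "engine" is a stand-in struct, and the wrappers return a raw opaque pointer instead of a `v8::Local<T>`.

```cpp
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <string>

// ---- C-exported layer (hypothetical; a real one would call into V8) ----
extern "C" {
typedef struct shim_value shim_value;  // opaque to callers
shim_value* shim_string_utf8_new(const char* data, int length);
shim_value* shim_number_int32_new(int32_t val);
const char* shim_string_utf8_data(const shim_value* v);
int32_t shim_number_int32_value(const shim_value* v);
void shim_value_free(shim_value* v);
}

// ---- Stand-in implementation, hidden behind the C exports ----
struct shim_value {
  std::string str;
  int32_t num;
};

extern "C" shim_value* shim_string_utf8_new(const char* data, int length) {
  assert(data != NULL);
  // A negative length means "NUL-terminated", like the -1 default above.
  size_t n = length < 0 ? strlen(data) : static_cast<size_t>(length);
  shim_value* v = new shim_value();
  v->str.assign(data, n);
  return v;
}
extern "C" shim_value* shim_number_int32_new(int32_t val) {
  shim_value* v = new shim_value();
  v->num = val;
  return v;
}
extern "C" const char* shim_string_utf8_data(const shim_value* v) { return v->str.c_str(); }
extern "C" int32_t shim_number_int32_value(const shim_value* v) { return v->num; }
extern "C" void shim_value_free(shim_value* v) { delete v; }

// ---- Header-only C++ layer: templates live here, not in the ABI ----
namespace shimpp {
struct String {};  // tag types standing in for v8::String / v8::Int32
struct Int32 {};

template <typename T> struct Factory;
template <> struct Factory<String> {
  static shim_value* New(const char* data, int length = -1) {
    return shim_string_utf8_new(data, length);
  }
};
template <> struct Factory<Int32> {
  static shim_value* New(int32_t val) { return shim_number_int32_new(val); }
};

// Dispatches to the factory for T; only the C functions cross the ABI boundary.
template <typename T, typename... Args>
shim_value* New(Args... args) {
  return Factory<T>::New(args...);
}
}  // namespace shimpp
```

With this split, an addon would write `shimpp::New<shimpp::String>("hello")`, and the only symbols it links against are the `shim_*` C functions.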

kkoopa avatar May 14 '15 22:05 kkoopa

Microsoft appears to have created a V8 compatible C++ API facade for Chakra. While it is a nice piece of engineering, it also has the disadvantage of trying to keep up with a constantly changing V8 API that is not managed by the node project.

I like the javascript engine wrapper C API idea proposed by @trevnorris. Node itself and node native modules could use only this C API. That way the javascript engine (V8, SpiderMonkey, Duktape, JavaScriptCore or other) could be made pluggable at runtime depending on the needs of the user. Native node modules compiled against this hypothetical C API could work against any such runtime-pluggable engine without recompilation due to its more stable ABI.

However, one downside with such a javascript engine C API would be that it would impose a little bit of overhead on every javascript <=> c++ operation as compared to header-only inline C++ classes or C macros. Even so, I still think it is worth the cost for the maintainability and the flexibility it would offer.
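
One way such a runtime-pluggable C API could be shaped is a table of function pointers that each engine fills in. The sketch below is purely illustrative: `js_engine_api`, `js_engine_select`, and the two fake engines are hypothetical and do not correspond to any real engine binding. The indirect call through the table is also where the overhead mentioned above would come from.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

extern "C" {

typedef struct js_value js_value;  // opaque; each engine has its own representation

// Hypothetical engine interface: a table of C function pointers.
typedef struct js_engine_api {
  const char* name;
  js_value* (*number_new)(double v);
  double (*number_value)(const js_value* v);
  void (*value_free)(js_value* v);
} js_engine_api;

// Fake engine A: a number is a heap-allocated double.
static js_value* a_number_new(double v) {
  double* p = static_cast<double*>(malloc(sizeof(double)));
  assert(p != NULL);
  *p = v;
  return reinterpret_cast<js_value*>(p);
}
static double a_number_value(const js_value* v) {
  return *reinterpret_cast<const double*>(v);
}
static void a_value_free(js_value* v) { free(v); }

// Fake engine B: a number is a tagged box.
struct b_box { int tag; double payload; };
static js_value* b_number_new(double v) {
  b_box* p = static_cast<b_box*>(malloc(sizeof(b_box)));
  assert(p != NULL);
  p->tag = 1;
  p->payload = v;
  return reinterpret_cast<js_value*>(p);
}
static double b_number_value(const js_value* v) {
  return reinterpret_cast<const b_box*>(v)->payload;
}
static void b_value_free(js_value* v) { free(v); }

static const js_engine_api kEngines[] = {
  {"engine-a", a_number_new, a_number_value, a_value_free},
  {"engine-b", b_number_new, b_number_value, b_value_free},
};

// Picks an engine by name at runtime; returns NULL if it is unknown.
const js_engine_api* js_engine_select(const char* name) {
  for (size_t i = 0; i < sizeof(kEngines) / sizeof(kEngines[0]); ++i) {
    if (strcmp(kEngines[i].name, name) == 0) return &kEngines[i];
  }
  return NULL;
}

// "Addon" code: written once against the table, unaware of the engine behind it.
double addon_double_it(const js_engine_api* api, double x) {
  js_value* v = api->number_new(x);
  double result = api->number_value(v) * 2.0;
  api->value_free(v);
  return result;
}

}  // extern "C"
```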

kzc avatar May 15 '15 05:05 kzc

@kzc After doing testing in this area I'm fairly confident that we'd be able to do this with very little overhead. At least minimal enough where users wouldn't feel it in their JS code.

trevnorris avatar May 15 '15 10:05 trevnorris

Clearly yes. The overhead of context switching from JavaScript to C++ drowns out everything else.

kkoopa avatar May 15 '15 10:05 kkoopa

I wouldn't recommend having the proposed C API functions and types mirror v8's C++ API and its concepts, because you would still have the original problem of tightly coupling node against an ever changing v8. Ideally the proposed C API should work equally well with SpiderMonkey, which cares about registering GC roots, and Duktape, which uses a Lua-style sandbox approach where you only manipulate copies of the actual data structures. But I recognize that such portability would impose a runtime cost.

kzc avatar May 15 '15 14:05 kzc

On the contrary, I would recommend having it mirror V8's C++ API. Abstracting away the engine in a pluggable way is too much unnecessary work. It is not like io.js core would use these bindings internally, so it still would not become Spidernode. Personally, I have zero interest in supporting other JavaScript engines than V8.

kkoopa avatar May 15 '15 14:05 kkoopa

Perhaps I mistook @trevnorris' proposal. I thought both node itself and native node modules would be changed to only use this new C API to isolate them both from the javascript engine's API.

If the goal is to only support v8, then I don't think the proposed C API buys you much. By closely mirroring the C++ v8 API any significant upstream change would still require an equivalent change in the C API.

kzc avatar May 15 '15 15:05 kzc

@kzc The gain for maintaining a public C API is that native modules will not have to be recompiled every time they upgrade their version of node.

@kkoopa Unfortunately using this API internally is exactly what large businesses want. Microsoft has already done a port to Chakra, Oracle and IBM are independently working to get node running on Java. Anyway, just giving you a heads up that this is exactly what these large companies are after.

trevnorris avatar May 18 '15 16:05 trevnorris

Oh, really? Why would they want that? What, exactly, are we discussing here? Is it making an ABI-stable API for modules or is it rewriting all of Node to abstract away V8? The former is a lot less work than the latter.

kkoopa avatar May 18 '15 16:05 kkoopa

They want the ability to use whatever VM under the hood they deem appropriate. But then also have the user facing API ABI stable so they can ship pre-compiled modules. I'm only relaying this information so you're aware, and I dread to think what would happen to core in an attempt to do this.

The reason I opened this was to discuss the user-facing part. Creating ABI stability for native modules so they don't have to be recompiled with every release.

trevnorris avatar May 18 '15 17:05 trevnorris

Good, then we're on the same page. In that case, I think the best to do is the following:

Take (almost) all the functionality already exposed by NAN (disregarding sugar, 96% of it is used in native modules in the wild). This is the least a usable API should offer to support all the existing native modules.

Create C-exported wrappers for all the necessary V8 functions. Naming should reflect the original API, so let's use some arbitrary, but consistent, sort of mangling system through underscores. The v8 namespace may be considered implicit. We only expose the longest function signature when there are convenience overloads or default argument values.

Local__String__ String_NewFromUtf8(Isolate *, const char *str, int length, ...) {...}
Local__String__ String_NewFromTwoByte(Isolate *, const uint16_t *str, int length, ...) {...}
...

Then, as I mentioned in an earlier post, we could stick a header-only C++ API back on top, e.g. NAN or derivative. This library would circumvent the clumsiness and loss from having C-bindings, giving the best of two worlds. A C++ API with consistent C ABI. The naming scheme for C-exported functions can be arbitrarily clumsy, yet still hidden away.

All of this even seems like it would be possible to automate, i.e. write a program that parses v8.h, gets all public functions, and spits out a bunch of horribly named C function declarations and a C++ header that reconstructs the C++ API via the C functions. This would be cool, but probably not an easy task.
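
The naming half of such a generator is easy to sketch. The code below assumes the scheme suggested by the `Local__String__` example: `<` and `>` become `__`, `::` becomes `_`, and the `v8::` namespace is dropped as implicit. The function names (`MangleType`, `MangleFunction`, `EmitDeclaration`) are hypothetical, and the hard part, parsing v8.h, is not attempted here.

```cpp
#include <cassert>
#include <string>

// Maps a C++ type spelling to a C identifier under the assumed scheme.
std::string MangleType(const std::string& cpp_type) {
  const std::string ns = "v8::";
  std::string out;
  for (std::size_t i = 0; i < cpp_type.size();) {
    if (cpp_type.compare(i, ns.size(), ns) == 0) {
      i += ns.size();  // the v8 namespace is implicit
    } else if (cpp_type.compare(i, 2, "::") == 0) {
      out += "_";
      i += 2;
    } else if (cpp_type[i] == '<' || cpp_type[i] == '>') {
      out += "__";
      ++i;
    } else if (cpp_type[i] == ' ') {
      ++i;  // whitespace carries no meaning in the mangled name
    } else {
      out += cpp_type[i];
      ++i;
    }
  }
  return out;
}

// Class and method are joined with a single underscore.
std::string MangleFunction(const std::string& cls, const std::string& method) {
  return MangleType(cls) + "_" + method;
}

// Emits one C declaration; the parameter list is passed through verbatim.
std::string EmitDeclaration(const std::string& ret, const std::string& cls,
                            const std::string& method, const std::string& params) {
  return MangleType(ret) + " " + MangleFunction(cls, method) + "(" + params + ");";
}
```

For example, `EmitDeclaration("v8::Local<v8::String>", "v8::String", "NewFromUtf8", "Isolate*, const char* str, int length")` yields `Local__String__ String_NewFromUtf8(Isolate*, const char* str, int length);`.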

kkoopa avatar May 18 '15 17:05 kkoopa

Sounds like a good enough approach to me. @bnoordhuis thoughts?

trevnorris avatar May 18 '15 17:05 trevnorris

Would still need to export some functions from Node too: buffer, objectwrap, strings, etc. Then have some representation of exported classes as structs (they are rare, but a couple exist).

kkoopa avatar May 18 '15 18:05 kkoopa

Something to consider: V8 uses RAII in a few places to good effect, cf. HandleScope. A C shim would be quite error-prone to use.

Exposing structs in the API might make ABI stability pretty complicated. (If I ever get around to writing a blog post "Libuv, lessons learned", that's going to feature prominently.) I would suggest using opaque pointers, coupled with accessors.
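
A minimal sketch of the opaque-pointer-plus-accessors pattern follows. The `addon_buffer` type and its functions are hypothetical. The point is that callers only ever hold a pointer to an incomplete type, so the private struct can gain fields (such as `flags` here) without changing anything callers were compiled against.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// ---- Public header part: the struct layout is never exposed ----
extern "C" {
typedef struct addon_buffer addon_buffer;
addon_buffer* addon_buffer_new(size_t length);
char* addon_buffer_data(addon_buffer* b);
size_t addon_buffer_length(const addon_buffer* b);
void addon_buffer_free(addon_buffer* b);
}

// ---- Private implementation part ----
struct addon_buffer {
  size_t length;
  char* data;
  int flags;  // imagine this was added in a later release; callers are unaffected
};

extern "C" addon_buffer* addon_buffer_new(size_t length) {
  addon_buffer* b = static_cast<addon_buffer*>(malloc(sizeof(addon_buffer)));
  if (b == NULL) return NULL;
  b->length = length;
  // Allocate at least one zeroed byte so data is never NULL on success.
  b->data = static_cast<char*>(calloc(length ? length : 1, 1));
  b->flags = 0;
  if (b->data == NULL) {
    free(b);
    return NULL;
  }
  return b;
}

extern "C" char* addon_buffer_data(addon_buffer* b) { return b->data; }
extern "C" size_t addon_buffer_length(const addon_buffer* b) { return b->length; }

extern "C" void addon_buffer_free(addon_buffer* b) {
  if (b == NULL) return;
  free(b->data);
  free(b);
}
```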

bnoordhuis avatar May 18 '15 18:05 bnoordhuis

Can you give some concrete examples of what you had in mind, e.g. HandleScope?

kkoopa avatar May 18 '15 18:05 kkoopa

@trevnorris, I agree that a C API wrapper makes sense if you intend to support javascript engines other than just v8. Just be aware that there are features and concepts in the v8 API that don't easily map to other javascript engines, and vice versa. Mirroring the v8 C++ API in C may create more work for the other engine mappings.

kzc avatar May 18 '15 18:05 kzc

@kzc At this point I'm neither for nor against mirroring the V8 API. Right now I'm just trying to facilitate conversation until we can achieve some actionable items.

trevnorris avatar May 18 '15 18:05 trevnorris

Can you give some concrete examples of what you had in mind, e.g. HandleScope?

I think that's a question for me? Well, I imagine that C++ code like this:

void f(v8::Isolate* isolate) {
  v8::HandleScope handle_scope(isolate);
  // ...
  if (g()) return;
  // ...
}

Would end up looking something like this in C:

void f(v8_isolate *isolate) {
  v8_handlescope_enter(isolate);
  // ...
  if (g()) {
    v8_handlescope_exit(isolate);
    return;
  }
  // ...
  v8_handlescope_exit(isolate);
}

Manually having to balance enter/exit calls everywhere is a bit of a pain and easy to get wrong.
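
If a header-only C++ layer sits on top of the C exports, as proposed earlier in the thread, the balancing could be handed back to the compiler with a small RAII guard. This is a sketch under that assumption: `shim_isolate`, `shim_handlescope_enter`, `shim_handlescope_exit`, and `ScopeGuard` are hypothetical, and the isolate is a stand-in that merely counts open scopes.

```cpp
#include <assert.h>

// ---- Stand-in C layer: just counts open scopes ----
extern "C" {
typedef struct shim_isolate { int open_scopes; } shim_isolate;

void shim_handlescope_enter(shim_isolate* isolate) { ++isolate->open_scopes; }
void shim_handlescope_exit(shim_isolate* isolate) {
  assert(isolate->open_scopes > 0);
  --isolate->open_scopes;
}
}

// ---- Header-only C++ guard: enter in the constructor, exit in the destructor ----
class ScopeGuard {
 public:
  explicit ScopeGuard(shim_isolate* isolate) : isolate_(isolate) {
    shim_handlescope_enter(isolate_);
  }
  ~ScopeGuard() { shim_handlescope_exit(isolate_); }
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;

 private:
  shim_isolate* isolate_;
};

// Mirrors f() above: the early return needs no explicit exit call.
// Returns the number of scopes open while the body runs.
int f(shim_isolate* isolate, bool early) {
  ScopeGuard scope(isolate);
  if (early) return isolate->open_scopes;  // the guard still exits on this path
  // ...
  return isolate->open_scopes;
}
```

Plain C callers would still have to balance the calls by hand; the guard only helps code compiled as C++.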

bnoordhuis avatar May 18 '15 19:05 bnoordhuis

@bnoordhuis so the best solution would be to still use a C++ interface but stay away from (as you mentioned) virtual methods, multiple inheritance, member function pointers, etc?

trevnorris avatar May 18 '15 19:05 trevnorris