Batching calls within a lock / scope
While I was fiddling with bindings in a different language, I figured out that what was expensive was all the locking and scoping done for every call. If you can instead call back into Go while holding a lock and/or a scope, you can really speed things up.
My thinking is that with this we can batch all value creation (for structs, objects, etc.), or any logically grouped set of operations, within a single isolate lock and/or context scope.
Example implementation:
type Context struct {
    // ...
    scopeCallbacks map[int]chan struct{}
}
// ...
var scopeEnds = make(map[int]chan struct{})
//export __go_ctx_scoped
func __go_ctx_scoped(cbIDStr C.String) {
    // Identical to go_callback_handler:
    cbID := C.GoStringN(cbIDStr.ptr, cbIDStr.len)
    parts := strings.SplitN(cbID, ":", 2)
    ctxID, _ := strconv.Atoi(parts[0])
    callbackID, _ := strconv.Atoi(parts[1])

    contextsMutex.RLock()
    ref := contexts[ctxID]
    if ref == nil {
        panic(fmt.Errorf(
            "Missing context pointer during callback for context #%d", ctxID))
    }
    ctx := ref.ptr
    contextsMutex.RUnlock()

    ch := ctx.scopeCallbacks[callbackID]
    ch <- struct{}{} // we're ready
    doneCh := scopeEnds[callbackID]
    <-doneCh // hold onto this scope until done!
}
func (ctx *Context) WithScope() chan struct{} {
    ctx.nextCallbackId++
    id := ctx.nextCallbackId
    ch := make(chan struct{}, 1)
    ctx.scopeCallbacks[id] = ch

    cbIDStr := C.CString(fmt.Sprintf("%d:%d", ctx.id, id))
    defer C.free(unsafe.Pointer(cbIDStr))

    doneCh := make(chan struct{}, 1)
    scopeEnds[id] = doneCh

    go func() {
        // Run this async, because it blocks until out of scope.
        addRef(ctx)
        C.v8_with_ctx_scope(ctx.ptr, cbIDStr)
        decRef(ctx)
    }()

    <-ch // this means we're all scoped and ready
    return doneCh
}
func (ctx *Context) NewObjectWScope() *Value {
    return ctx.newValue(C.v8_object_new(ctx.iso.ptr), C.kObject)
}

func (ctx *Context) NewObjectWoutScope() *Value {
    return ctx.newValue(C.v8_Object_New(ctx.ptr), C.kObject)
}

func (v *Value) SetWithScope(name string, value *Value) error {
    name_cstr := C.CString(name)
    errmsg := C.v8_object_set(v.ctx.ptr, v.ptr, name_cstr, value.ptr)
    C.free(unsafe.Pointer(name_cstr))
    return v.ctx.iso.convertErrorMsg(errmsg)
}
The C++ used:
void __go_ctx_scoped(String id);

void v8_with_ctx_scope(ContextPtr ctxptr, char* id) {
  VALUE_SCOPE(ctxptr);
  __go_ctx_scoped(DupString(id));
}
PersistentValuePtr v8_object_new(IsolatePtr iso) {
  auto isolate = static_cast<v8::Isolate*>(iso);
  v8::Local<v8::Object> obj = v8::Object::New(isolate);
  return new Value(isolate, obj);
}

PersistentValuePtr v8_Object_New(ContextPtr ctxptr) {
  VALUE_SCOPE(ctxptr);
  v8::Local<v8::Object> obj = v8::Object::New(isolate);
  return new Value(isolate, obj);
}

Error v8_object_set(ContextPtr ctxptr, PersistentValuePtr valueptr,
                    const char* field, PersistentValuePtr new_valueptr) {
  auto isolate = static_cast<Context*>(ctxptr)->isolate;
  auto ctx = static_cast<Context*>(ctxptr)->ptr.Get(isolate);

  Value* value = static_cast<Value*>(valueptr);
  v8::Local<v8::Value> maybeObject = value->Get(isolate);
  if (!maybeObject->IsObject()) {
    return DupString("Not an object");
  }
  // We can safely call `ToLocalChecked`, because we've just
  // checked that the value is an object above.
  v8::Local<v8::Object> object =
      maybeObject->ToObject(ctx).ToLocalChecked();

  Value* new_value = static_cast<Value*>(new_valueptr);
  v8::Local<v8::Value> new_value_local = new_value->Get(isolate);

  v8::Maybe<bool> res =
      object->Set(ctx, v8::String::NewFromUtf8(isolate, field), new_value_local);
  if (res.IsNothing()) {
    return DupString("Something went wrong -- set returned nothing.");
  } else if (!res.FromJust()) {
    return DupString("Something went wrong -- set failed.");
  }
  return (Error){nullptr, 0};
}
Benchmarks:
func BenchmarkNewObjectWithoutScope(b *testing.B) {
    ctx := NewIsolate().NewContext()
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        ctx.NewObjectWoutScope()
    }
}

func BenchmarkNewObjectWithScope(b *testing.B) {
    ctx := NewIsolate().NewContext()
    ctx.WithScope() // don't need the done chan here
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        ctx.NewObjectWScope()
    }
}
Results:
$ go test -benchmem -run="^$" -bench "^BenchmarkNewObject"
goos: darwin
goarch: amd64
pkg: github.com/augustoroman/v8
BenchmarkNewObjectWithoutScope-4 1000000 7915 ns/op 32 B/op 1 allocs/op
BenchmarkNewObjectWithScope-4 2000000 972 ns/op 32 B/op 1 allocs/op
PASS
ok github.com/augustoroman/v8 13.814s
Example usage:
package main

import (
    "fmt"

    "github.com/augustoroman/v8"
)

func main() {
    ctx := v8.NewIsolate().NewContext()
    done := ctx.WithScope()
    obj := ctx.NewObjectWScope()
    obj.SetWithScope("hello", ctx.NewObjectWScope())
    done <- struct{}{}

    b, _ := obj.MarshalJSON()
    fmt.Println(string(b)) // {"hello": {}}
}
edit: posted the comment too quickly by mistake; added code and examples.
For some reason, this doesn't work with all kinds of values. I'm trying to make the whole ctx.Create use one handle scope, but started getting fun errors:
#
# Fatal error in HandleScope::HandleScope
# Entering the V8 API without proper locking in place
#
SIGILL: illegal instruction
PC=0x4a37942 m=0 sigcode=1
goroutine 0 [idle]:
runtime: unknown pc 0x4a37942
stack: frame={sp:0x7fff5fbfecf8, fp:0x0} stack=[0x7fff5fb7fa10,0x7fff5fbfee90)
00007fff5fbfebf8: 0000000006000000 00007fff5fbfed30
00007fff5fbfec08: 0000000000000000 00007fff5fbfecf0
00007fff5fbfec18: 0000000004a38010 000001da54b03389
// ...
^ This happened while trying to create a Function (using a Bind function I made that doesn't acquire the scope), but I've seen it with a Date object too.
I was wrong: apparently this doesn't keep the Locker or any other scope alive. It appears you don't need the Locker or anything of the sort to create any kind of value except Date and Function.
I don't think I can prevent the Locker and scope destructors from firing when calling a different function. My C++ is pretty awful!
Update time: I was getting the scope errors because of thread switching.
I made a handleScope type, which essentially queues functions to run in scope. The Go ctx-scope callback locks the OS thread and waits for "jobs" to run.
type handleScope chan func()

func (hs handleScope) Do(f func()) {
    done := make(chan struct{}, 1)
    hs <- func() {
        f()
        done <- struct{}{}
    }
    <-done
}

func (hs handleScope) End() {
    close(hs)
}
Example usage:
func (ctx *Context) Create(val interface{}) (*Value, error) {
    scope := ctx.WithScope()
    var v *Value
    var err error
    scope.Do(func() {
        v, _, err = ctx.create(reflect.ValueOf(val))
    })
    scope.End()
    return v, err
}
This is not super fast for a single value; in fact, it's slower and allocates a bit more. For several values, however, it makes things faster.
func BenchmarkNewObjectWithoutScope(b *testing.B) {
    ctx := NewIsolate().NewContext()
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        ctx.NewObjectWoutScope()
        ctx.NewObjectWoutScope()
        ctx.NewObjectWoutScope()
        ctx.NewObjectWoutScope()
        ctx.NewObjectWoutScope()
        ctx.NewObjectWoutScope()
    }
}

func BenchmarkNewObjectWithScope(b *testing.B) {
    ctx := NewIsolate().NewContext()
    scope := ctx.WithScope()
    todo := func() {
        ctx.NewObjectWScope()
        ctx.NewObjectWScope()
        ctx.NewObjectWScope()
        ctx.NewObjectWScope()
        ctx.NewObjectWScope()
        ctx.NewObjectWScope()
    }
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        scope.Do(todo)
    }
}
$ go test -benchtime=10s -benchmem -run="^$" -bench "^BenchmarkNewObject"
goos: darwin
goarch: amd64
pkg: github.com/augustoroman/v8
BenchmarkNewObjectWithoutScope-4 1000000 65705 ns/op 192 B/op 6 allocs/op
BenchmarkNewObjectWithScope-4 1000000 20600 ns/op 320 B/op 8 allocs/op
PASS
ok github.com/augustoroman/v8 89.272s
There is a lot more lock contention with the scoped version: a single goroutine does a for range over the channel, waiting for jobs.
For my use cases, this is pretty interesting. I have complex objects to pass, and as soon as there are 2-3 sub-values needing creation, it can get pretty slow.
I can push to a branch if you're interested in seeing what's happening.
Ok, I woke up realizing that benchmark was unfair: I was not acquiring and releasing the scope during the benchmark, which the other one effectively does on every call.
With that in place, ns/op is very similar, but total ops is almost 10x lower with batch mode, which surprises me a lot! It is technically a few more cgo calls, but all in all, 2x less time is spent in cgo calls with batch mode. That time is spent elsewhere, though, usually in runtime-namespaced ops.
It's a bit hard to know which one is better from this very specific benchmark. I would think it'd be faster to prevent the lock and scope from changing threads all the time.