
V8 values returned by v8go are not garbage collected by V8

Open arctica opened this issue 3 years ago • 52 comments

Using the print function from the examples (even with the actual fmt.Printf() commented out) seems to result in a memory leak.

Slightly modified example from the Readme to call the print function in an infinite loop:

iso, _ := v8go.NewIsolate() // create a new VM
// a template that represents a JS function
printfn, _ := v8go.NewFunctionTemplate(iso, func(info *v8go.FunctionCallbackInfo) *v8go.Value {
    //fmt.Printf("%v", info.Args()) // when the JS function is called this Go callback will execute
    return nil // you can return a value back to the JS caller if required
})
global, _ := v8go.NewObjectTemplate(iso) // a template that represents a JS Object
global.Set("print", printfn) // sets the "print" property of the Object to our function
ctx, _ := v8go.NewContext(iso, global) // new Context with the global Object set to our object template
ctx.RunScript("while (true) { print('foo') }", "print.js") // will execute the Go callback with a single argunent 'foo'

Warning: Running this will quickly consume all memory of the machine.

Removing the 'foo' parameter of the print() call seems to stop the leak. So I guess the parameters of the callback are somehow leaking.

arctica avatar Apr 13 '21 20:04 arctica

I think v8go.NewValue() also leaks memory, even if you don't pass it back as a result from the Go callback.

iso, _ := v8go.NewIsolate()
fn, _ := v8go.NewFunctionTemplate(iso, func(info *v8go.FunctionCallbackInfo) *v8go.Value {
    v8go.NewValue(iso, "foo")
    return nil
})

global, _ := v8go.NewObjectTemplate(iso)
global.Set("fn", fn)
ctx, _ := v8go.NewContext(iso, global)
ctx.RunScript("while (true) { fn(); }", "fn.js")

arctica avatar Apr 13 '21 20:04 arctica

New Values are attached to the Isolate that created them; you need to Dispose of the isolate to free any value's memory. For the Function, the Values should be attached to the Context that created them, so you need to call ctx.Close() to free the memory.
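For reference, a minimal sketch of that lifecycle with the current API (error handling omitted): the context is closed and the isolate disposed of once their values are no longer needed.

package main

import (
	"fmt"

	"rogchap.com/v8go"
)

func main() {
	iso, _ := v8go.NewIsolate()
	ctx, _ := v8go.NewContext(iso)

	val, _ := ctx.RunScript("'hello, world'", "main.js")
	fmt.Println(val.String())

	ctx.Close()   // frees values tracked by this context
	iso.Dispose() // frees the isolate and anything still tracked by it
}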

Maybe we need an explicit call to free the values, if you want to free the memory without closing or disposing of the isolate or context? 🤔

rogchap avatar Apr 14 '21 20:04 rogchap

It's impractical to free their memory only when the isolate or context is closed. Not every VM is short-lived. In my case I have a bunch of very long-running isolates/contexts that are fed new values from Go as events arrive.

I see two ways to do this:

  1. Copy the values somehow into v8 and let it manage those with its GC.
  2. Manually free the memory of the Value in Go, as you suggested, but this can be tricky, as one has to be careful about the lifetime of Values, and it might even lead to use-after-free security vulnerabilities from inside the JS VM if not done correctly. Plus, Go and JS both being garbage collected makes manual memory management feel out of place.

Maybe it's worth checking how nodejs handles this?

arctica avatar Apr 15 '21 01:04 arctica

We previously had finalizers set for values so they could be GC'd by Go and call C.free(), but there were issues: the GC would not collect the values because it was unaware of the memory allocated to the Value's value on the C side (the value struct on the Go side only holds pointers, so it is relatively small), and it might also try to GC after the context was closed, therefore trying to free values that had already been freed. There is no way (that I know of) to sync the Go GC with V8's GC.

The core problem is actually how CGO works, and the only way to hold a reference to a value "across the bridge" is to have a persistent Value and use a struct to hold that pointer; a persistent value needs to be manually freed. Yes, Go has GC and V8 has its own GC for JS objects, but C/C++ does not.

Nodejs does not have this issue because Nodejs is built in C++, so there is no bridge to cross (i.e. no CGO); it does not need to create persistent values and can deal with just locally scoped values.

Rust has better semantics for interfacing with C libraries and sharing memory space, as it also is not a GC language and is closer to C++; this is why Deno was built with Rust and dropped the initial implementation that was written in Go.

TLDR; when dealing with CGO, it can be unavoidable to have to manually free objects in some way.

rogchap avatar Apr 15 '21 02:04 rogchap

I've started reading the v8 API a bit and about Values, Local vs Persistent handles etc.

What I don't understand yet is why calling Reset() in the finalizer of a Value from the Go side shouldn't do the trick. With Reset() we just tell V8 we are not interested in it anymore from Go. As long as a Persistent handle is held in Go, V8's GC won't free it, and as long as a reference to the Value exists in Go, Go's GC won't call the finalizer which would Reset() the value. It sounds a bit strange that the Go GC wouldn't collect Values "because it was unaware of the memory allocated to the Value's value on the C side". The Go GC ignores C pointers, so the only issue can be that it collects a Go Value while things survive on the C side, but that's why there are finalizers.

Your point about the Go GC freeing a Value after the Context was already closed does make sense, since the order of garbage collection can't be predicted. But the simple solution for that would be to just check whether the context was closed. Or you could keep a reference to the Context inside the Value to ensure the Context is not garbage collected if any Values are still alive. I think the issue arises due to a mismatch of manual memory management for Contexts (.Close()) and automatic memory management for Values. The two are closely connected in usage but require different handling. Actually, it seems that handles in V8 are tied to Isolates and not Contexts.
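A minimal sketch of that last idea, with hypothetical struct shapes (not the actual v8go types): because the Value keeps a Go reference to its Context, Go's finalizers run in dependency order, so the Value is always finalized before the Context can be.

package ctxref

import (
	"runtime"
	"unsafe"
)

// Hypothetical shapes, for illustration only.
type Context struct {
	ptr unsafe.Pointer // C-side context / persistent handle
}

type Value struct {
	ptr unsafe.Pointer // C-side v8::Persistent<v8::Value> handle
	ctx *Context       // keeps the Context reachable while the Value lives
}

func newValue(ctx *Context, cptr unsafe.Pointer) *Value {
	v := &Value{ptr: cptr, ctx: ctx}
	runtime.SetFinalizer(v, func(v *Value) {
		// Reset the persistent handle on the C side here, e.g. C.ValueReset(v.ptr).
		// Because v.ctx kept the Context reachable, this finalizer runs (and v is
		// freed) before the Context itself can be finalized.
	})
	return v
}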

I don't think it has much to do with how cgo works or C vs C++ vs Rust. Node written in C++ also needs to use Persistent handles if the references are to survive the local scope. And v8go could make use of Local handles for example when passing arguments from Go to a JS function.

I also see a bit of what looks like unnecessary copying of data. Take for example Value.String() which calls C.ValueToString() which will malloc a copy of the string (just to free it shortly after) then calls C.GoString() which in turn again copies the string. Here one could get the char* from the v8::String::Utf8Value and call C.GoString() on that, skipping that intermediary copy.

I could be wrong with any of the above because I'm just getting into this codebase as well as v8's so please correct me in case I'm misunderstanding things.

arctica avatar Apr 15 '21 04:04 arctica

Hey @arctica. Since you have a reproduction, have you been able to test the changes you've proposed to v8go to see if any resolve the issue?

genevieve avatar Jun 16 '21 20:06 genevieve

Hey @arctica. Since you have a reproduction, have you been able to test the changes you've proposed to v8go to see if any resolve the issue?

Hi @genevieve. Yes I have indeed. First I made a few changes, but then realized I wanted to make quite a lot of fundamental changes and pretty much did a rewrite. My version is fully garbage collected (no need to manually .close() a context) via the Go GC. I deviated from v8go's goal to replicate the V8 API and went for something with a simpler surface area that is easy to understand and use, with fewer pitfalls. I don't think it's in a state to be published though. Maybe if there's some interest...

arctica avatar Jun 16 '21 20:06 arctica

@arctica Yes! We'd be interested in seeing your implementation if you're able to publish it.

genevieve avatar Jun 16 '21 20:06 genevieve

I don't think it's in a state to be published though.

You can open it as a draft PR to indicate that it isn't quite ready, as well as letting us know what you still think is needed. Let us know if you want some help with getting it ready for merging.

dylanahsmith avatar Jun 16 '21 21:06 dylanahsmith

@genevieve @dylanahsmith Alright, I'll see what I can do over the coming days. Will post again once something is up. Probably not before next week though as I have a few other things to do.

arctica avatar Jun 16 '21 21:06 arctica

Hey Everyone; just a quick update; I've been thinking long and hard about this problem, and I've got a few ideas on how to solve this. I'll be testing out my theory and post any updates here. 🤞

rogchap avatar Jun 18 '21 22:06 rogchap

Update:

I've changed the tracked values from a vector to an unordered_set; this way we can check whether a value has been previously freed or not (by erasing from the set on freeing).

I've added back a finalizer for v8go.Value (just for the string value at the moment) and created a very simple http server to monitor memory allocations:

package main

import (
	"fmt"
	"log"
	"net/http"
	"sync"

	"rogchap.com/v8go"
)

var mx sync.Mutex
var iso *v8go.Isolate

func handler(w http.ResponseWriter, r *http.Request) {
	mx.Lock()
	for i := 0; i < 10000; i++ {
		val, _ := v8go.NewValue(iso, "foobar")
		if i%1000 == 0 {
			fmt.Fprintf(w, "Value: %s-%d\n", val, i)
		}
	}
	mx.Unlock()
}

func main() {
	iso, _ = v8go.NewIsolate()

	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Please don't judge the code above; it's not the best!

With the server running, I fired some requests:

ab -n 500 -c 5 http://localhost:8080/

And monitored the allocations using Xcode Instruments:

[Instruments allocation screenshot]

From the above test, we can see the memory being allocated (very quickly) but also the Go GC kicking in and the finalizer being run, thus freeing the memory.

This is what we want; however the story does not end there: when I run the test suite, I get a lot of errors (segmentation violation) from the finalizer.

So although the above "solution" looks promising, I need to dig deeper to what's causing the panic. (need to re-build v8 in debug).

Will keep going and see where I get.

rogchap avatar Jun 19 '21 02:06 rogchap

So this is the memory profile if we remove the value finalizer and create a new Isolate per request and then Dispose:

func handler(w http.ResponseWriter, r *http.Request) {
	iso, _ := v8go.NewIsolate()
	for i := 0; i < 10000; i++ {
		val, _ := v8go.NewValue(iso, "foobar")
		if i%1000 == 0 {
			fmt.Fprintf(w, "Value: %s-%d\n", val, i)
		}
	}
	iso.Dispose()
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
[Instruments memory profile screenshot]

As you can see the memory profile is WAY better (at the cost of having to create a new Isolate), only peaking at 12MB rather than the previous 200MB (and climbing).

So we really want to support both models; but consider the following (assuming we have the value finalizer back in):

iso, _ := v8go.NewIsolate()
val, _ := v8go.NewValue(iso, "some string")
fmt.Println(val.String())
iso.Dispose()

runtime.GC()
time.Sleep(time.Second)

When we call iso.Dispose() all of the isolate's resources get de-allocated by V8, so when the Go GC runs, it calls the value finalizer, which then tries to free objects that have already been freed; i.e. it panics.

So you may think the "easy" solution is to maintain a flag for when the Isolate is disposed; well, we already have this, as the internal Isolate C pointer is set to nil when we dispose, so if we attach the isolate to every value we could do a check for this, something like:

func (i *Isolate) Dispose() {
	if i.ptr == nil {
		return
	}
	C.IsolateDispose(i.ptr)
	i.ptr = nil
}

func (v *Value) finalizer() {
	if v.iso.ptr != nil {
		C.ValueFree(v.ptr)
	}
	v.ptr = nil
	runtime.SetFinalizer(v, nil)
}

The tricky problem here is that we are not in control of when the GC and the value finalizers run. It's possible (and you see this a lot when running the above server with the value finalizer) that C.ValueFree() can still be called before the C.IsolateDispose() function returns.

iptr := i.ptr
i.ptr = nil
C.IsolateDispose(iptr)

☝️ This helps to reduce the panics, but they still happen; I guess that values are being GC'd before the Dispose(), and when we Dispose() the timing aligns such that a C.ValueFree() still gets called after the Dispose().

rogchap avatar Jun 19 '21 05:06 rogchap

This is a "fully" GCd setup, where there is no iso.Dispose() and we do this on a Isolate finalizer; the server is the same as before, creating a new Isolate per request.

[Instruments memory profile screenshot]

The memory keeps climbing (goes up to 750MB here) until all the requests are finished, then it does GC some of the Isolates (not many though). The benefit is that Dispose only gets called after all the values are freed first; but the Go GC doesn't call the finalizer on everything, as it doesn't think it needs to. If you had constant requests, you would soon get an "out of memory" exception. I currently can't see how a "fully GC'd" solution is viable at this point.

rogchap avatar Jun 19 '21 05:06 rogchap

Fundamentally, the GC can't be designed to manage memory growth for a heap that it doesn't know about. Otherwise, it won't know when that unmanaged memory is at a threshold where garbage collection should be triggered to avoid that memory growing.

For Go to know about the memory used by C/C++ code, I think it would have to use the C.malloc and C.free functions provided by cgo. Although v8 has support for overriding the array buffer allocator, I don't see support for overriding allocations in general so that Go would be aware of the memory allocations. It is the same problem with other languages, for example ruby also has an external malloc problem when interacting with libraries that use malloc instead of the allocation functions that ruby provides.

So even if finalizers were set to automatically release v8 values, we would avoid using them in favour of a manual method for releasing the V8 values in order to avoid this memory growth problem.

There is still a danger in forgetting to release memory by relying on manual memory management. Ideally, that could be caught from tests, however, I think something needs to be added to support this. For instance, if we could get the count of v8go.Values that weren't released, then we could check that before and after some code to make sure it isn't leaking values, possibly with an initial run of that code if it is expected to lazily initialize something once. Alternatively, maybe this could be done using Isolate.GetHeapStatistics() if a method were exposed to force full garbage collection.

dylanahsmith avatar Jun 21 '21 21:06 dylanahsmith

Fundamentally, the GC can't be designed to manage memory growth for a heap that it doesn't know about. Otherwise, it won't know when that unmanaged memory is at a threshold where garbage collection should be triggered to avoid that memory growing.

The first sentence is key. And the conclusion should be to not attempt to make the Go GC manage the V8 memory. Let the V8 GC manage its stuff and the Go GC the objects on its side of the fence. But that requires careful consideration of what points to where, and to "orchestrate" the GCs a bit.

For Go to know about the memory used by C/C++ code, I think it would have to use the C.malloc and C.free functions provided by cgo. Although v8 has support for overriding the array buffer allocator, I don't see support for overriding allocations in general so that Go would be aware of the memory allocations.

Go would then know about the allocations, but Go's GC still wouldn't be able to make any use of it because it ignores the cgo heap. So if all one would take from this is getting stats about how much memory has been allocated in V8 in order to manually trigger the Go GC, then there are easier functions available.

So even if finalizers were set to automatically release v8 values, we would avoid using them in favour of a manual method for releasing the V8 values in order to avoid this memory growth problem.

Manual memory management can be a reasonable choice for performance reasons. It will lower the amount of allocated memory and reduce the amount of CPU consumed to run a GC. Both strategies can be used at the same time, but one has to be very careful about race conditions, especially because in Go, as soon as finalizers are introduced, one is in multithreaded territory since the GC runs on other threads.

There is still a danger in forgetting to release memory by relying on manual memory management. Ideally, that could be caught from tests, however, I think something needs to be added to support this. For instance, if we could get the count of v8go.Values that weren't released, then we could check that before and after some code to make sure it isn't leaking values, possibly with an initial run of that code if it is expected to lazily initialize something once. Alternatively, maybe this could be done using Isolate.GetHeapStatistics() if a method were exposed to force full garbage collection.

If all one wanted to do is catch leaks with manually managed memory then the solution is quite simple: put finalizers on the Values which just panic so you get a stacktrace and remove the finalizer when the Value is Free()'d. That way you are using the Go GC as a leak detector for pretty much free since the finalizer would only run if there's no reference to the object in Go anymore but it was not properly freed.
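A minimal sketch of that leak-detector idea, with a hypothetical Value type and Free() method (illustrative names, not the v8go API):

package main

import (
	"fmt"
	"runtime"
	"time"
)

// Value stands in for a v8go-style value that owns C-side memory.
type Value struct {
	ptr uintptr // placeholder for the C pointer
}

// NewValue sets a finalizer that panics if the value is collected
// without Free() ever having been called.
func NewValue() *Value {
	v := &Value{ptr: 1}
	runtime.SetFinalizer(v, func(v *Value) {
		panic(fmt.Sprintf("Value %p leaked: Free() was never called", v))
	})
	return v
}

// Free releases the underlying memory (C.ValueFree in the real binding)
// and removes the finalizer, so a properly freed value never panics.
func (v *Value) Free() {
	v.ptr = 0
	runtime.SetFinalizer(v, nil)
}

func main() {
	v := NewValue()
	v.Free() // comment this out (keeping the GC call below) to see the leak panic

	runtime.GC()
	time.Sleep(100 * time.Millisecond) // give the finalizer goroutine a chance to run
}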

arctica avatar Jun 22 '21 07:06 arctica

This is what we want; however the story does not end there: when I run the test suite, I get a lot of errors (segmentation violation) from the finalizer.

So although the above "solution" looks promising, I need to dig deeper to what's causing the panic. (need to re-build v8 in debug).

That is because as soon as finalizers are brought into the equation, one is in multithreaded territory and has to employ locking to prevent race conditions.

This helps to reduce the panics, but they still happen; I guess that values are being GC'd before the Dispose() and when we Dispose() the timing aligns where a C.ValueFree still gets called after the Dispose()

Yes, and I don't believe it is possible to make the code correct without proper locking. You can't even assume that checking or setting a pointer value like i.ptr is thread-safe. You'd need to use the functions from sync/atomic, but even then one has to be very careful, and not everything is possible with just atomics.
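For illustration, a sketch of one way to close that race with a mutex shared between the Isolate and its Values (hypothetical fields, not the actual v8go implementation):

package safefree

import (
	"runtime"
	"sync"
	"unsafe"
)

// Hypothetical fields, for illustration only.
type Isolate struct {
	mu  sync.Mutex
	ptr unsafe.Pointer // C-side isolate pointer; nil once disposed
}

type Value struct {
	iso *Isolate
	ptr unsafe.Pointer // C-side persistent handle; nil once freed
}

func (i *Isolate) Dispose() {
	i.mu.Lock()
	defer i.mu.Unlock()
	if i.ptr == nil {
		return
	}
	// C.IsolateDispose(i.ptr) would be called here.
	i.ptr = nil
}

// finalizer runs on the GC's finalizer goroutine, so it must take the
// same lock to avoid racing with Dispose.
func (v *Value) finalizer() {
	v.iso.mu.Lock()
	defer v.iso.mu.Unlock()
	if v.iso.ptr != nil && v.ptr != nil {
		// C.ValueFree(v.iso.ptr, v.ptr) would be called here.
	}
	v.ptr = nil
	runtime.SetFinalizer(v, nil)
}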

The memory keeps climbing (goes up to 750MB here) until all the requests are finished, then does GC some of the Isolates (not many though). The benefit is that the Dispose only gets called after all the values are freed first; but the Go GC doesn't call finalizer on everything as it does think it needs to. If you had constant request,s you would soon get a "out of memory" exception. I currently can't see how a "fully GC'd" solution is viable at this point.

If the isolate objects are unreachable then the Go GC will run the finalizers. All of them, 100% guaranteed. The problem though is that the GC might not run until quite some time later, especially if there are no new allocations happening, as in your example with a finite number of HTTP requests. Then it says "What's the point anyway? Nobody is asking for more memory right now, so why free it?". It will periodically (iirc every 2 minutes) run anyway, so if you wait at the end you should eventually see all Isolates being disposed. Or alternatively you can just add a runtime.GC() call. But if one already knows that the Isolates are not needed anymore at this point in time, then yes, manually disposing of them can be quite an advantage.

Running with garbage collection is a tradeoff. One trades CPU usage and especially memory usage for convenience. In the example of a web server where, at the end of the request, one can just throw all the JS resources away because they were used only for some temporary computations, manually disposing of the Isolate (or a Context) can have big benefits. Some usage scenarios will benefit more, some less.

arctica avatar Jun 22 '21 07:06 arctica

Go would then know about the allocations, but Go's GC still wouldn't be able to make any use of it because it ignores the cgo heap. So if all one would take from this is getting stats about how much memory has been allocated in V8 in order to manually trigger the Go GC, then there are easier functions available.

As you say, Go would make use of it for stats, but the point is that Go would trigger the Go GC manually (i.e. on-demand), rather than requiring another hook to do this on-demand before the v8 heap is grown. It isn't that Go would free things in the cgo heap directly, but they would be freed indirectly through the finalizers.

I'm not sure what you are referring to by there being "easier functions available". If there are appropriate hooks available in v8 to trigger the Go GC before it increases the v8 heap size, then that would be interesting, since I didn't notice anything. Something similar could be done in the Go application code by using runtime.GC() but it wouldn't really be on-demand so wouldn't be as effective at keeping the heap from growing.

If all one wanted to do is catch leaks with manually managed memory then the solution is quite simple: put finalizers on the Values which just panic so you get a stacktrace and remove the finalizer when the Value is Free()'d.

Yes, this would be a way v8go could add support for checking this manual memory management, if it weren't using the finalizers to automatically free memory. Although, the test code would still need to use runtime.GC() after each of these tests to trigger the garbage collection to both ensure that it happens and to have the failure occur in the test that caused the failure.

dylanahsmith avatar Jun 22 '21 14:06 dylanahsmith

I'm not sure what you are referring to by there being "easier functions available". If there are appropriate hooks available in v8 to trigger the Go GC before it increases the v8 heap size, then that would be interesting, since I didn't notice anything. Something similar could be done in the Go application code by using runtime.GC() but it wouldn't really be on-demand so wouldn't be as effective at keeping the heap from growing.

I'm not sure if there is a hook available for when the heap is grown (one could always create one via a wrapper around the default allocator, as you mentioned before), but what I meant is that one can use Isolate::GetHeapStatistics() or similar to get the heap numbers in order to decide if a manual GC is in order, instead of tracking every single allocation. You can call this method periodically or in one of the GC callbacks. What I am doing instead though is that I've added a pre-GC hook in V8 which calls into Go to run runtime.GC() there. The idea is that the V8 GC will (hopefully) be triggered if the V8 heap grows sufficiently, and by pre-empting it and running the Go GC, we can trigger the finalizers and free as many V8 objects as possible.

Yes, this would be a way v8go could add support for checking this manual memory management, if it weren't using the finalizers to automatically free memory. Although, the test code would still need to use runtime.GC() after each of these tests to trigger the garbage collection to both ensure that it happens and to have the failure occur in the test that caused the failure.

Maybe one could allow the library to take a callback that is called whenever something is freed via finalizer. A test could then fail. In production one could log it or track metrics etc. Or one could just not specify the callback and rely on automatic memory management.

arctica avatar Jun 22 '21 17:06 arctica

What I am doing instead though is I've added a pre-GC hook in V8 which calls into Go to run there runtime.GC(). The idea being that the V8 GC will (hopefully) be triggered if the V8 heap grows sufficiently and by pre-empting it and running the Go GC, we can trigger the finalizers and free as many V8 objects as possible.

Looks like you are referring to v8::Isolate::AddGCPrologueCallback. That does sound like it could work well to manage the heap size, although it does bring the risk of performance problems from triggering the Go GC too often if you are using multiple V8 isolates. For example, if you are using multiple V8 Isolates and they all get close to reaching a garbage collection threshold, then the Go GC could get triggered once for each V8 isolate. It seems like this could lead to scalability issues. Some trade-offs could be made to run the Go GC less frequently for performance at the cost of higher memory usage from less effective V8 garbage collections, but that seems more like it would be shifting overhead rather than getting rid of it.

I still think it makes sense to support manually calling a method on v8go.Value objects to release/untrack the underlying v8::Value.

dylanahsmith avatar Jun 23 '21 18:06 dylanahsmith

Looks like you are referring to v8::Isolate::AddGCPrologueCallback. That does sound like it could work well to manage the heap size, although it does bring the risk of performance problems from triggering the Go GC too often if you are using multiple V8 isolates. For example, if you are using multiple V8 Isolates and they all get close to reaching a garbage collection threshold, then the Go GC could get triggered once for each V8 isolate. It seems like this could lead to scalability issues. Some trade-offs could be made to run the Go GC less frequently for performance at the cost of higher memory usage from less effective V8 garbage collections, but that seems more like it would be shifting overhead rather than getting rid of it.

Yes, that's the method I am using. You are correct that using several Isolates that can each trigger their GC would then result in the host Go app running its own GC every time, which can be an issue if the host app has a big heap. But I'm not simply calling runtime.GC() each time the V8 GC callback is invoked, because in my testing those callbacks get called very frequently and are actually not that useful, since there are many phases and not just one big garbage collection pass. In the callback I am grabbing the current heap statistics and tracking them for all active Isolates. That allows for the implementation of an algorithm similar to what Go itself is using, where it simply does a big collection once the heap has increased by a certain percentage. I implemented three trigger conditions: 1. a certain amount of time has passed since the last GC (e.g. 10s); 2. the total heap size changed by a certain percentage (e.g. grew to 2x); 3. the JS heap limit is about to be reached for a given Isolate.
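A rough polling sketch of those three triggers (the version described above hooks V8's GC prologue instead of polling; the watchHeap function and its use of TotalHeapSize and HeapSizeLimit assume v8go's HeapStatistics mirrors v8::HeapStatistics, so verify the field names against the version you use):

package gcwatch

import (
	"runtime"
	"time"

	"rogchap.com/v8go"
)

// watchHeap polls the isolate's heap statistics and forces a Go GC (which
// runs value finalizers) when too much time has passed since the last GC,
// when the V8 heap has roughly doubled, or when it nears the JS heap limit.
func watchHeap(iso *v8go.Isolate, stop <-chan struct{}) {
	lastGC := time.Now()
	var lastTotal uint64

	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			hs := iso.GetHeapStatistics()
			tooLong := time.Since(lastGC) > 10*time.Second
			doubled := lastTotal > 0 && hs.TotalHeapSize > 2*lastTotal
			nearLimit := hs.HeapSizeLimit > 0 && hs.TotalHeapSize > hs.HeapSizeLimit/10*9
			if tooLong || doubled || nearLimit {
				runtime.GC() // runs finalizers that release tracked V8 values
				lastGC = time.Now()
				lastTotal = iso.GetHeapStatistics().TotalHeapSize
			} else if lastTotal == 0 {
				lastTotal = hs.TotalHeapSize
			}
		}
	}
}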

I still think it makes sense to support manually calling a method on v8go.Value objects to release/untrack the underlying v8::Value.

I fully agree with that. It's best to have the choice between manual and automatic memory management. Heck, I wish it was possible to manually free Go objects :) But Go being a garbage collected language, I think that having automatic memory management is a must. Manual memory management should be a bonus that can be used for performance optimization. In my tests, using manual memory management can result in vastly reduced memory usage when creating tons of Isolates or Contexts, because these are quite heavy. On the other hand, when just allocating Values inside a few long-lived Contexts, the garbage collector, while using more memory, was about 20% faster. It always depends on the real workload.

arctica avatar Jun 23 '21 21:06 arctica

Thanks @arctica and @dylanahsmith for your detailed thoughts.

It seems to me that we are in agreement to have some sort of "free" method.

What I thought might be good would be an iso.Prune() that would free all the values that are being tracked against that isolate.

It would basically do the same thing as Dispose() does, but without actually disposing of the isolate so that it can be re-used.

I see this method being called just before you place an Isolate back into a sync.Pool for example.

We could do the same for ctx.Prune() if you need to re-use Contexts, although I think calling ctx.Close() and creating a new Context is better, so that you're not leaking state.

If we had Prune() (let me know if there is a better name), do you think we still need val.Free() on individual values? Is that too fine-grained?

I'll probably mark these methods as "Experimental" and subject to change for now.

rogchap avatar Jun 23 '21 22:06 rogchap

This is my first test case; creates a single Isolate that is reused and calls iso.Prune() at the end of each request (obviously a bit crap as this reduces concurrency to 1, but still a good test):

var mx sync.Mutex
var gblIsolate *v8go.Isolate

func init() {
	gblIsolate, _ = v8go.NewIsolate()
}

func handler(w http.ResponseWriter, r *http.Request) {
	mx.Lock()
	iso := gblIsolate
	defer func() {
		iso.Prune()
		mx.Unlock()
	}()

	for i := 0; i < 10000; i++ {
		val, _ := v8go.NewValue(iso, "foobar")
		if i%1000 == 0 {
			fmt.Fprintf(w, "Value: %s-%d\n", val, i)
		}
	}
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
[Instruments memory profile screenshot]

Memory profile looks good.

The second test case uses sync.Pool, which still relies on the Go GC to free unused Isolates, so we create our own finalizer to call Dispose():

func finalizer(iso *v8go.Isolate) {
	iso.Dispose()
	runtime.SetFinalizer(iso, nil)
}

var ipFree = sync.Pool{
	New: func() interface{} {
		iso, _ := v8go.NewIsolate()
		runtime.SetFinalizer(iso, finalizer)
		return iso
	},
}

func isoFree(iso *v8go.Isolate) {
	iso.Prune()
	ipFree.Put(iso)
}

func handler(w http.ResponseWriter, r *http.Request) {
	iso := ipFree.Get().(*v8go.Isolate)
	defer isoFree(iso)

	for i := 0; i < 10000; i++ {
		val, _ := v8go.NewValue(iso, "foobar")
		if i%1000 == 0 {
			fmt.Fprintf(w, "Value: %s-%d\n", val, i)
		}
	}
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

[Instruments memory profile screenshot]

Memory profile for this looks good too.

There are tradeoffs here, as sync.Pool still creates a lot of new Isolates, but it does also clean the pool during GC, so it depends on what you want to benchmark against. What I like about this setup is that it's left to the caller to implement the finalizer rather than the v8go package.

I think having the Prune() methods would allow you to create your own Pool that used X number of Isolates, for example.
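For example, a fixed-size pool could be a buffered channel of isolates that are pruned before being returned; a sketch, assuming the proposed iso.Prune() exists:

package main

import (
	"fmt"
	"log"
	"net/http"

	"rogchap.com/v8go"
)

// newIsolatePool creates n isolates up front; receiving from the channel
// blocks when all of them are in use.
func newIsolatePool(n int) chan *v8go.Isolate {
	pool := make(chan *v8go.Isolate, n)
	for i := 0; i < n; i++ {
		iso, _ := v8go.NewIsolate()
		pool <- iso
	}
	return pool
}

func handler(pool chan *v8go.Isolate) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		iso := <-pool
		defer func() {
			iso.Prune() // proposed API: free all values tracked by this isolate
			pool <- iso
		}()
		val, _ := v8go.NewValue(iso, "foobar")
		fmt.Fprintf(w, "Value: %s\n", val)
	}
}

func main() {
	pool := newIsolatePool(4)
	http.HandleFunc("/", handler(pool))
	log.Fatal(http.ListenAndServe(":8080", nil))
}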

rogchap avatar Jun 24 '21 04:06 rogchap

What I thought might be good would be a iso.Prune() that would free all the values that are being tracked against that isolate.

Doesn't that prevent the ability to hold references to v8 values?

One use case we have for holding onto references is as part of initializing the isolate's global context by setting values on it, but holding onto those values so they can later be used without having to get them back out of the context. This is beneficial for performance, since it is faster in Go to just access a struct field than to get a value from the v8 context. It also avoids the possibility of errors being introduced, such as from the JS code setting those initial values to something else.

Another use case we have for holding onto values is to be able to use an isolate for multiple tasks concurrently. Your approach of using an isolate pool would end up requiring a lot more isolates, and thus a lot more memory usage, since isolates blocking on IO would be tied up. Instead, we have a goroutine that owns the isolate and pulls tasks from a channel, so it can work on multiple tasks concurrently. Multiple isolates can then be added for parallelism, where their goroutines can pull new work from the same shared channel but have separate channels to handle IO completion. With this approach, the isolate could be kept busy with new tasks and not have a natural time to use iso.Prune(), which would require a similar workaround to what we currently have, where the goroutine essentially needs a "graceful restart" (forcing it to wait for existing tasks to fully complete) to free up memory.

As a result, an Isolate.Prune() method wouldn't be a replacement for a method like v8go.Value.Release().
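For context, a much-simplified sketch of the goroutine-owned-isolate model described above (one task at a time, hypothetical Task type; the real setup additionally interleaves tasks while they wait on IO):

package worker

import (
	"rogchap.com/v8go"
)

// Task is a hypothetical unit of work; the real setup would carry more state.
type Task struct {
	Script string
	Result chan<- string
}

// runIsolateWorker owns one isolate for its whole lifetime and pulls work
// from a shared channel; start several workers for parallelism.
func runIsolateWorker(tasks <-chan Task) {
	iso, _ := v8go.NewIsolate()
	defer iso.Dispose()
	ctx, _ := v8go.NewContext(iso)
	defer ctx.Close()

	for t := range tasks {
		val, err := ctx.RunScript(t.Script, "task.js")
		if err != nil {
			t.Result <- "error: " + err.Error()
			continue
		}
		t.Result <- val.String()
	}
}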

dylanahsmith avatar Jun 24 '21 14:06 dylanahsmith

Doesn't that prevent the ability to hold references to v8 values?

If implemented correctly, then it would only prevent one from using (not holding) references to V8 values beyond the call to prune. You would still be able to use them before that, and then free them all at once. Sometimes this might be favorable.

As a result, an Isolate.Prune() method wouldn't be a replacement for a method like v8go.Value.Release()

Definitely.

arctica avatar Jun 25 '21 11:06 arctica

Your approach of using an isolate pool would end up requiring a lot more isolates, and thus a lot more memory usage, since isolates blocking on IO would be tied up.

@rogchap I think v8go.Value.Release() is eventually a needed addition. I have a simple server running in the isolate that answers simple events. This is a long-running process which leaks memory because the returned strings are bridged to the Go context. Or is there another solution to this execution model?

katallaxie avatar Sep 10 '21 16:09 katallaxie

Apologies, I've left this one hanging... a lot of effort has gone into finding an optimal solution, but I've been unsatisfied with all my attempts.

I'll re-visit... and yes, I think it will have to be a value.Release() type function, but I may add some other functions that are helpful depending on what you are trying to achieve.

I'll likely mark them as "Experimental" and subject to change.

rogchap avatar Sep 12 '21 02:09 rogchap

The approach I'm looking into is to more closely match c++ API by:

  • turning Value into an alias for LocalValue
  • using an Isolate.HandleScope(func()) to explicitly enter a top-level handle scope, so local values can be batch-freed at the end of it (this can also use the v8::Locker, reducing locking overhead for nested v8go calls)
  • implicitly introducing a v8::EscapableHandleScope around calls to the FunctionTemplate callback
  • adding Value.Persist() to create a PersistentValue that escapes the handle scope stack and can be freed with PersistentValue.Release()
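A hypothetical usage sketch of that proposed surface (none of these methods exist yet; the names follow the bullets above):

// Hypothetical: Isolate.HandleScope, Value.Persist and PersistentValue.Release
// are proposed here, not part of the current v8go API.
var answer *v8go.PersistentValue

iso.HandleScope(func() {
	// Local values created inside the scope are batch-freed when it ends.
	local, _ := ctx.RunScript("({answer: 42})", "scope.js")

	// Escape the handle scope by promoting the local value to a persistent one.
	answer = local.Persist()
})

// The persistent value outlives the scope and is freed manually.
defer answer.Release()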

Internally, I want to

  • track values on the isolate instead of the context (aside from its Context.Global)
  • store an integer index in Go for values instead of a pointer, so they can be checked to preserve memory safety as well as to reduce heap allocation
  • keep a separate vector for Persistent values
  • use a c++ bitset to mark free items in the persistent value vector, along with using a union for the persistent value vector to have it double as a free list for freed elements
  • use v8::Locker in Isolate.HandleScope to reduce nested locking overhead
  • reduce overhead for calling Go from C++ (e.g. in Isolate.HandleScope) by changing ctxRegistry into a slice (i.e. cheaper lookup) of isolate pointers that are registered and given an index on creation (i.e. less exclusive locking). The callback data can then be retrieved from fields on the Isolate (e.g. currentContext and handleScopeCallback)

I think the handle scope should make this convenient for the common case of working with local values. However, PersistentValue's manual memory management seems inconvenient; we could provide an alternative method to persist the local value, which wraps the PersistentValue in a heap-allocated struct that gets auto-released through a GC finalizer. E.g. this could be obtained using a Value.GCed() method. Since the Go finalizers run in a separate goroutine, they would be queued to a slice on the Isolate to release later (e.g. at the start of Isolate.HandleScope), using a RWMutex to synchronize the use of this slice. The freeing could even be done immediately if the Isolate isn't in a handle scope, if the RWMutex is used to synchronize setting the handleScopeCallback.

We could also add GC finalizers for Context and Isolate. For Context we can use the above approach of queuing the Close on the isolate. For Isolate, we should make sure any values that depend on it contain a reference to it; since finalizers are run in dependency order, we can just use iso.Dispose() in the finalizer.

This seems like the best place to give an overview of the changes I would like to make, since I would break the changes into multiple PRs and want someplace to refer back to. Of course, I would also love feedback on my proposed approach to solving this issue.

dylanahsmith avatar Sep 13 '21 17:09 dylanahsmith

The approach I'm looking into is to more closely match c++ API by:

* turning `Value` into an alias to `LocalValue`

* using a `Isolate.HandleScope(func())` to explicitly enter a top-level handle scope, so local values can be batch freed at the end of it (this can also use the v8::Locker, reducing locking overhead for nested v8go calls)

* implicitly introducing an `v8::EscapableHandleScope` around calls to the FunctionTemplate callback

* adding `Value.Persist()` to create a PersistentValue that escapes the handle scope stack and can be freed with `PersistentValue.Release()`

I think this makes a lot of sense given the goal of v8go to stay close to the V8 API. Performance-wise, I suspect the HandleScope() function could provide significant gains, because one can call runtime.LockOSThread() and, as you mentioned, take the v8::Locker right at the start, only once for a whole bunch of code.

I think the handle scope should make this convenient for the common case of working with local values. However, PersistentValue's manual memory management seems inconvenient; we could provide an alternative method to persist the local value, which wraps the PersistentValue in a heap-allocated struct that gets auto-released through a GC finalizer. E.g. this could be obtained using a Value.GCed() method. Since the Go finalizers run in a separate goroutine, they would be queued to a slice on the Isolate to release later (e.g. at the start of Isolate.HandleScope), using a RWMutex to synchronize the use of this slice. The freeing could even be done immediately if the Isolate isn't in a handle scope, if the RWMutex is used to synchronize setting the handleScopeCallback.

I think the key points are: 1. finalizers can't free Values if a HandleScope is active, since they run on a different goroutine, and 2. they don't have to be freed by the main isolate goroutine. I would therefore implement a separate goroutine for each Isolate that is tasked with freeing the Values. Imho it isn't necessary to try to free Values at the start or end of HandleScope; after all, who knows when the GC will finalize them. If that garbage-slurping goroutine used a HandleScope itself to batch-free multiple Values, then things become quite easy because all the thread safety is handled correctly.

I'd question though the choice of an RWMutex, which is optimized for a single writer and multiple readers, but that is not the case here. Since the finalizer goroutine just needs to append values, and the other goroutine just reads the values and then signals that it has finished, maybe a simple slice with a few atomics can do the trick. Pseudo code:

// Pseudocode, not valid Go as-is: finalizerQueue would really be an
// unsafe.Pointer (for the atomic ops) holding a *[]*Value, and 0x1 is a
// sentinel meaning "a finalizer is currently appending".
type Isolate struct {
    // ....
    finalizerQueue *[]*Value
}

// finalizer goroutine
// temporarily grab the slice pointer and signal we are busy
q := atomic.SwapPointer(&iso.finalizerQueue, 0x1)
*q = append(*q, v)
atomic.StorePointer(&iso.finalizerQueue, q)

// garbage slurping goroutine (per isolate)
for {
    q := atomic.SwapPointer(&iso.finalizerQueue, nil)

    if q == 0x1 {
        // new garbage coming in, yield and try again
        time.Sleep(1 * time.Millisecond)
        continue
    } else if q == nil {
        // maybe check if we should exit the goroutine here
        time.Sleep(1 * time.Second)
        continue
    }

    iso.HandleScope(func() {
        for _, v := range *q {
            // free v
        }
    })
}

arctica avatar Sep 26 '21 20:09 arctica

So what is the best way to release Value memory for now?

pigLoveRabbit520 avatar Oct 19 '21 03:10 pigLoveRabbit520