
proposal: runtime: add AddCleanup and deprecate SetFinalizer

Open mknyszek opened this issue 2 months ago • 11 comments

Background

Go provides one function for object finalization: runtime.SetFinalizer. Finalizers are notoriously hard to use, and the documentation of runtime.SetFinalizer describes the caveats in great detail. For instance:

  • The pointer passed to SetFinalizer must always refer to the first word of an allocation. Programmers must therefore know what an 'allocation' is, even though the language does not generally expose that distinction.
  • There cannot be more than one finalizer on any object.
  • Objects with finalizers that are involved in any reference cycle will silently fail to be freed and the finalizer will never run.
  • Objects with finalizers require at least two GC cycles to be freed.

The last two of these caveats boil down to the fact that runtime.SetFinalizer allows object resurrection.
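
To make the resurrection point concrete, here is a minimal sketch using only the existing runtime.SetFinalizer API. The resource type, the resurrected global, and the helper names are hypothetical, and because the runtime does not promise when (or whether) a finalizer runs, the program polls the GC rather than relying on a single cycle.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// resource is a hypothetical type. The string field contains a pointer,
// which keeps the object out of the tiny allocator.
type resource struct {
	name string
}

// resurrected is a global the finalizer stores into, which makes the
// object reachable again after the GC had decided it was garbage.
var resurrected *resource

// newResurrecting allocates a resource whose finalizer resurrects it.
//
//go:noinline
func newResurrecting(done chan<- struct{}) {
	r := &resource{name: "demo"}
	runtime.SetFinalizer(r, func(r *resource) {
		resurrected = r // the object is reachable again
		close(done)
	})
}

// waitFor runs GC cycles until done is closed or the attempts run out.
func waitFor(done <-chan struct{}) bool {
	for i := 0; i < 100; i++ {
		runtime.GC()
		select {
		case <-done:
			return true
		case <-time.After(10 * time.Millisecond):
		}
	}
	return false
}

// resurrectionDemo reports whether the finalizer ran and whether the
// object is still alive afterwards.
func resurrectionDemo() (ran bool, alive bool) {
	done := make(chan struct{})
	newResurrecting(done)
	ran = waitFor(done)
	return ran, ran && resurrected != nil
}

func main() {
	ran, alive := resurrectionDemo()
	fmt.Println("finalizer ran:", ran, "object resurrected:", alive)
}
```

Because the finalizer receives the object itself, the GC cannot free the memory in the cycle that discovers it unreachable; it has to wait and see whether the finalizer kept it alive.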

Proposal

I propose adding the following API to the runtime package as a replacement for SetFinalizer. I also propose officially deprecating runtime.SetFinalizer.

// AddCleanup attaches a cleanup function to ptr, which is executed some time
// after ptr is no longer reachable. The cleanup function is executed with the
// argument cleanupValue.
//
// cleanupValue must not be equal to ptr and this function will panic if it is.
// If ptr is reachable from cleanup or cleanupValue, ptr will never be collected
// and the cleanup will never run.
//
// The cleanup function is not guaranteed to run in general, and also not guaranteed
// to run before program exit.
func AddCleanup[T, S any](ptr *T, cleanup func(S), cleanupValue S) Cleanup

// Cleanup is a handle to a cleanup function for a specific object.
type Cleanup struct { ... }

// Stop cancels the cleanup function. Stop will have no effect if the cleanup function
// has already been queued for execution once the object becomes unreachable. To
// guarantee that Stop removes the cleanup function, the caller must ensure that the
// pointer that was passed to AddCleanup is reachable across the call to Stop.
func (c Cleanup) Stop() { ... }
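
As a usage illustration, the following sketch assumes a toolchain that actually ships this API as runtime.AddCleanup (Go 1.24 and later do, with the third parameter named arg). The conn type, the closeFD stand-in, and the helper names are hypothetical. Since cleanups are not guaranteed to run, the program polls the GC instead of expecting a result after one cycle.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// conn is a hypothetical wrapper around an OS-level handle.
type conn struct {
	fd   int
	name string
}

// closed counts the handles released by the cleanup function.
var closed atomic.Int64

// closeFD stands in for a real close call. It receives the handle by
// value and never sees the *conn.
func closeFD(fd int) { closed.Add(1) }

// openAndDrop attaches a cleanup and lets the conn become unreachable.
//
//go:noinline
func openAndDrop(fd int) {
	c := &conn{fd: fd, name: "dropped"}
	runtime.AddCleanup(c, closeFD, c.fd)
}

// openAndStop cancels the cleanup while the conn is still reachable.
//
//go:noinline
func openAndStop(fd int) {
	c := &conn{fd: fd, name: "stopped"}
	cleanup := runtime.AddCleanup(c, closeFD, c.fd)
	cleanup.Stop()
	runtime.KeepAlive(c) // keep c reachable across the call to Stop
}

// waitClosed runs GC cycles until the counter reaches want or the
// attempts run out.
func waitClosed(want int64) bool {
	for i := 0; i < 50; i++ {
		runtime.GC()
		time.Sleep(10 * time.Millisecond)
		if closed.Load() >= want {
			return true
		}
	}
	return false
}

func main() {
	openAndStop(3)
	fmt.Println("cancelled cleanup ran:", waitClosed(1))
	openAndDrop(4)
	fmt.Println("attached cleanup ran:", waitClosed(1))
}
```

The runtime.KeepAlive call mirrors the documented requirement on Stop: the pointer passed to AddCleanup has to stay reachable across the call for the cancellation to be guaranteed.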

AddCleanup resolves many of the problems with SetFinalizer.

It forbids objects from being resurrected, resulting in prompt cleanup, as well as allowing cycles of objects to be cleaned up. Its definition also allows attaching cleanup functions to objects the caller does not own, and possibly attaching multiple cleanup functions to a single object.

However, it is still fundamentally a finalization mechanism, so to avoid restricting the GC implementation, it does not guarantee that the cleanup function will ever run.

Similar to finalizers' restriction on the object not being reachable from the finalizer function, ptr must not be reachable from the value passed to the cleanup function, or from the cleanup function itself. Violating this usually results in a memory leak, but the common case of accidentally passing ptr as cleanupValue out of convenience can be easily caught.

In terms of interactions with finalizers, the cleanup function will always run the first time the value pointed to by ptr becomes unreachable. That is, if an object has both a cleanup function and a finalizer, the cleanup function is guaranteed to run before the finalizer. In other words, the cleanup function does not track object resurrection and will not run again if the finalizer does resurrect the object.

Design discussion

Avoiding allocations in the implementation of AddCleanup

AddCleanup needs somewhere to store cleanupValue until cleanup is ready to be called. Naively, it could just put that value in an any variable somewhere, but this would result in an unnecessary additional allocation.

In the actual implementation, a cleanup will be represented as a runtime "special," an off-heap manually-managed linked-list node data structure whose individual fields are sometimes explicitly inspected by the GC as roots, depending on the "special" type (for example, a finalizer special treats the finalizer function as a root).

Since each "special" is already specially (ha) treated by the GC, we can play some interesting tricks. For example, we could type-specialize specials and store cleanupValue directly in the "special." As long as we retain the type information for cleanupValue, we can get the GC to scan it directly in the special. But this is quite complex.

To begin with, I propose just specializing for word-sized cleanup values. If the cleanup value fits in a pointer-word, we store that directly in the special, and otherwise fall back on the equivalent of an any. This would cover many use-cases. For example, cleanup values that are already heap-allocated pointers wouldn't require an additional allocation. Also, simple cases like passing off a file descriptor to a cleanup function would not create an allocation.
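
The word-sized rule can be illustrated from user code with a small size check. This is only a sketch of the decision, not the runtime's data structure: the placement type and the place function are hypothetical, and the real special would also need type information so the GC can scan the stored word.

```go
package main

import (
	"fmt"
	"unsafe"
)

// placement names the two hypothetical storage strategies.
type placement string

const (
	inline placement = "inline in the special"
	boxed  placement = "boxed like an any"
)

// place reports where a cleanup value of type S would go under the
// word-sized rule: a value of at most one pointer-sized word is stored
// inline, anything larger falls back to a separate allocation.
func place[S any](v S) placement {
	if unsafe.Sizeof(v) <= unsafe.Sizeof(uintptr(0)) {
		return inline
	}
	return boxed
}

func main() {
	fd := 3
	fmt.Println("int file descriptor:", place(fd))
	fmt.Println("heap pointer:", place(&fd))
	fmt.Println("string:", place("path"))
	fmt.Println("two-word struct:", place(struct{ a, b uintptr }{}))
}
```

A string occupies two words (pointer and length), so it lands in the fallback path, while a file descriptor or an existing heap pointer fits in one word.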

Why func(S) and not func()?

The choice to require an explicit parameter to the cleanup function is to reduce the risk of the cleanup function accidentally closing over ptr. It also makes it easier for a caller to avoid allocating a closure for each cleanup.
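
The capture hazard looks like this in practice. The sketch assumes a toolchain that provides runtime.AddCleanup (Go 1.24 or later); the file type, the release stand-in, and the helper names are hypothetical. The closure variant deliberately works around the explicit parameter to show what the signature is steering callers away from.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// file is a hypothetical wrapper around a descriptor.
type file struct {
	fd   int
	name string
}

// released counts the descriptors handed to release.
var released atomic.Int64

// release is a plain top-level function, so attaching it does not
// require the caller to allocate a closure per cleanup.
func release(fd int) { released.Add(1) }

// attachWithArg passes the descriptor as the explicit cleanup value.
//
//go:noinline
func attachWithArg() {
	f := &file{fd: 3, name: "arg"}
	runtime.AddCleanup(f, release, f.fd)
}

// attachWithClosure captures f inside the cleanup function. f is then
// reachable from its own cleanup, so it is never collected.
//
//go:noinline
func attachWithClosure() {
	f := &file{fd: 4, name: "closure"}
	runtime.AddCleanup(f, func(struct{}) { release(f.fd) }, struct{}{})
}

// settle runs GC cycles until the counter reaches want or the attempts
// run out, then returns the counter.
func settle(want int64) int64 {
	for i := 0; i < 50 && released.Load() < want; i++ {
		runtime.GC()
		time.Sleep(10 * time.Millisecond)
	}
	return released.Load()
}

func main() {
	attachWithClosure()
	fmt.Println("released after closure variant:", settle(1))
	attachWithArg()
	fmt.Println("released after explicit-argument variant:", settle(1))
}
```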

Why func(S) and not chan S?

Channels are an attractive alternative because they allow users to build their own finalization queues. The downside, however, is that each channel owner needs its own goroutine for this to be composable, or some third-party package needs to exist to accumulate all these channels and select over them (likely with reflection). It's much simpler if that package is just the runtime: there's already a system goroutine to handle the finalization queue. While this does mean that the handling of finalization is confined to an implementation detail, that's rarely an issue in practice, and having the runtime handle it is more resource-efficient overall.
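
One reading of this trade-off is that a caller who still wants a queue can layer a channel on top of func(S). The sketch below assumes a toolchain that provides runtime.AddCleanup (Go 1.24 or later); the session and notice types and the helper names are hypothetical. The send is non-blocking so a full queue never stalls the goroutine the runtime uses to run cleanups.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// session is a hypothetical object whose death the caller wants to observe.
type session struct {
	id   int
	name string
}

// notice is the cleanup value: the id to report and the queue to report
// it on. It holds no reference to the session itself.
type notice struct {
	id    int
	queue chan<- int
}

// forward is the cleanup function. It drops the notice if the queue is full.
func forward(n notice) {
	select {
	case n.queue <- n.id:
	default:
	}
}

// openSession creates a session, attaches the forwarding cleanup, and
// lets the session become unreachable.
//
//go:noinline
func openSession(id int, queue chan<- int) {
	s := &session{id: id, name: "demo"}
	runtime.AddCleanup(s, forward, notice{id: s.id, queue: queue})
}

// next runs GC cycles until an id arrives or the attempts run out.
func next(queue <-chan int) (int, bool) {
	for i := 0; i < 50; i++ {
		runtime.GC()
		select {
		case id := <-queue:
			return id, true
		case <-time.After(10 * time.Millisecond):
		}
	}
	return 0, false
}

func main() {
	queue := make(chan int, 8)
	openSession(42, queue)
	id, ok := next(queue)
	fmt.Println("received:", id, ok)
}
```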

Why return Cleanup instead of *Cleanup?

While Cleanup is a handle and it's nice to represent that handle's unique identity with an explicit pointer, it also forces an allocation of Cleanup's contents in many cases. By returning a value, we can avoid an additional allocation.

Why not have the type arguments (T and/or S) on Cleanup too?

The implementation of Cleanup does not need the type arguments, since the internal representation will not even contain a reference to ptr, cleanup, or cleanupValue directly. Leaving them out does close the door to obtaining these values from Cleanup in a type-safe way, but that's OK: the caller of AddCleanup can already package those up together if it wants to.

mknyszek, May 20 '24 20:05