go icon indicating copy to clipboard operation
go copied to clipboard

proposal: x/debug: make the `core` and `gocore` packages public

Open mknyszek opened this issue 2 years ago • 1 comments

Authors: @mknyszek @prattmic @aclements

Purpose

The purpose of this proposal is to help fill in an important gap in Go debugging by providing a foundation for Go heap analysis. The primary use-cases for heap analysis (vs. heap profiles) are to pinpoint memory leak issues and debug out-of-memory errors by providing an in-depth look at the memory graph. Other use-cases exist as well, mostly related to the large amount of detail available in typical heap analysis software.

Go currently has two heap analysis options: analyzing core files for Go processes (via golang.org/x/debug/cmd/viewcore) or analyzing heap dumps (produced by runtime/debug.WriteHeapDump). As of today, both of these options are not easily available to users. The former has bit-rotted, while the latter lacks working tooling (and so may not actually work either).

Some companies using Go have expressed interest in building their own heap analysis tooling, while members of the Go community have contributed time and effort to updating viewcore. The utility of such functionality is lauded in the Java world, especially when it's available for servers over a connection (e.g. net/http/pprof).

Plan

We propose the following plan for improving the state of Go heap analysis, with the core API change here being to make the golang.org/x/debug/internal/core and golang.org/x/debug/internal/gocore packages public.

Before that can happen, however, we also need to fix them and prevent them from breaking again by:

  1. Updating gocore and core to work with Go at tip.
  2. Ensure that gocore and core are well-tested, and tested at tip.
    1. Running the full test suite for a core dump created from a Go binary built at tip.
    2. Setting up the golang.org/x/debug builder to run against the main Go repository.

After that, I propose making gocore and core externally-visible APIs, allowing Go developers to build on top of it with some fairly minor changes to the API that I would like to hash out. To begin with, I would like to also propose to make these APIs not Go 1 compatible (more on this below).

Some proposed changes to the API include:

  1. Consider exporting gocore.region (very useful API for type-casting memory regions).
  2. Consider moving gocore.region into core.
  3. Iterator-style interfaces for some functions/methods.
  4. Write go.dev documentation on how to do heap analysis with viewcore.

These small steps would bring back heap analysis to the Go ecosystem and will ensure it continues to work.

We feel that the existing gocore library is sufficiently abstract such that it likely will not break between Go releases, but marking it as not Go 1 compatible allows us to be more flexible with the API as the runtime itself changes (whose details it intimately relies upon). We may also choose to make it Go 1 compatible if we can ensure users have enough escape hatches to discover additional details (though not necessarily in a backwards-compatible way) while minimizing the API to what we're comfortable with assuming about a Go process. The difference between these options is admittedly small; the existing API is already unlikely to change much over time.

viewcore, though minimalistic, provides a proof-of-concept for the gocore library that we hope will motivate Go developers to build fancier tooling for heap analysis. Documentation will be critical to making this tooling more widely known ("how to debug a memory leak with viewcore").

Potential future work

I want to also mention some ideas for future work to gauge interest, even though they're not relevant to the proposal directly.

  1. Stub out runtime/debug.WriteHeapDump to reduce eliminate some technical debt.
  2. Provide a way to construct a core file at runtime and expose it through net/http (perhaps via WriteHeapDump).
  3. Reduce dependence on DWARF information for type-casting regions.

The goal of (1) and (2) is to replace WriteHeapDump with a way to produce a process snapshot that works with existing tooling (i.e. gocore and viewcore). Constructing our own core file isn't terribly difficult, provides us with a stable binary format specification so we don't have to go through the trouble of defining our own, and is a format already automatically generated by Linux, so we have fewer formats to support.

The goal of (3) would be to improve the debugging experience with core dumps from stripped binaries. This might not need any additional API, but should be considered up-front.

mknyszek avatar Dec 22 '22 17:12 mknyszek

This would be an incredible help to the ecosystem! viewcore is an absolutely key tool for developers of CockroachDB, and it's been challenging to be able to contribute to viewcore because of its position as an internal package.

I created some basic viewcore documentation for CockroachDB developers, in case this prior art is useful at all. It also contains a pointer to the branch I've been maintaining with some extra tools that allow dumping the viewcore output into something that can be consumed by pprof to create a very primitive object graph viewer.

https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/1931018299/Using+viewcore+to+analyze+core+dumps

jordanlewis avatar Dec 22 '22 20:12 jordanlewis

Change https://go.dev/cl/468916 mentions this issue: cmd/coordinator: add linux-amd64 x/debug trybots to the main repo

gopherbot avatar Feb 17 '23 17:02 gopherbot

I think the main thing missing here is actually what the final API itself will look like. After an offline discussion today with @aclements and @prattmic, I think we've decided the right next step is to fork these x/debug-internal libraries into x/exp where we can iterate on them freely. The API is far too large to try to summarize for a proposal and have everyone comment on it. Meanwhile, x/exp will allow us to put it out there in a way that will allow for experimentation and trial.

The reason we chose x/exp over x/debug is because x/exp has clear expectations with respect to backwards compatibility (i.e. there is none). Meanwhile, x/debug contains only one exported package with zero imports in the Go ecosystem and a handful in the main Go repository (for testing). Although x/debug has a disclaimer on it with respect to compatibility, it is less clearly experimental in its import path and long-term we do want it to maintain backwards compatibility, at least at compile time. By experimenting directly in x/debug, we risk tarnishing its reputation, so-to-speak.

To keep viewcore working as a tool, I think our plan is to temporarily have x/debug import x/exp, so it can take advantage of improvements we make.

Frankly, I don't expect the API to change all that much, but simultaneously it's hard to get a sense of the API in totality without first using it and iterating on it.

mknyszek avatar Feb 22 '23 03:02 mknyszek

I updated the proposal to reflect these changes, and also updated the proposal's compatibility story based on our current thinking.

mknyszek avatar Feb 22 '23 03:02 mknyszek

Change https://go.dev/cl/474541 mentions this issue: debug: add module

gopherbot avatar Mar 08 '23 18:03 gopherbot

Change https://go.dev/cl/474543 mentions this issue: debug/gobinary: add initial package sketch

gopherbot avatar Mar 08 '23 18:03 gopherbot

Change https://go.dev/cl/474542 mentions this issue: debug/gobinary: add package

gopherbot avatar Mar 08 '23 18:03 gopherbot

Change https://go.dev/cl/506558 mentions this issue: internal/core: refactor loading for maintainability

gopherbot avatar Jun 27 '23 17:06 gopherbot

Change https://go.dev/cl/506757 mentions this issue: internal/gocore: refactor loading for maintainability

gopherbot avatar Jun 28 '23 15:06 gopherbot

@mknyszek @jordanlewis is there any update now about viewcore in golang.org/x/debug or golang.org/x/exp, please? do we plan to add one tool to parse the dump file from debug.WriteHeapDump()?

superajun-wsj avatar Oct 25 '23 11:10 superajun-wsj

Hey @superajun-wsj, unfortunately the focus on diagnostics lately has been on execution tracing. We still plan to come back to this in the near future and there's been a little bit of progress in June by @prattmic, but this is still a little ways off.

Thanks for your interest, and stay tuned.

mknyszek avatar Oct 25 '23 14:10 mknyszek

Change https://go.dev/cl/547536 mentions this issue: internal/gocore: handle unsafe.pointer case for hchan.buf in chan

gopherbot avatar Dec 06 '23 05:12 gopherbot

Hi, thought you might like to know I got viewcore to work with binaries compiled by Go 1.20 and 1.21.

It's somewhat empirical, definitely not complete, and I haven't tried to make it compatible with multiple versions, but my branch it's at https://github.com/golang/debug/compare/master...bboreham:go-debug:go-1-20

bboreham avatar Jan 03 '24 11:01 bboreham

I'm running into this issue with the latest version of go-debug and a dump generated by go 1.21: https://github.com/golang/go/issues/33661

csdenboer avatar Apr 19 '24 15:04 csdenboer

Change https://go.dev/cl/589035 mentions this issue: internal/gocore: fix funcdata offset

gopherbot avatar May 29 '24 15:05 gopherbot

Hi, @mknyszek , I've been trying to use viewcore recently, and I've found some strange errors. I'm also trying to fix some of them. Friendly ping, is there any latest update or specific milestone for viewcore? For example, go1.23 go1.24
By the way, Are there any plans or reference implementations for supporting variable types on the goroutine stack? Thank you !

WangLeonard avatar May 30 '24 08:05 WangLeonard

@WangLeonard The current problem with viewcore is that the packages that back the implementation (mainly gocore) have not been updated for the Go 1.22 heap layout, which will probably require some refactoring to make work. Unfortunately, there is no specific milestone for viewcore and the core analysis packages at the moment.

mknyszek avatar May 30 '24 16:05 mknyszek

Change https://go.dev/cl/593680 mentions this issue: internal/gocore: fix handling of slice type alias

gopherbot avatar Jun 21 '24 14:06 gopherbot

Change https://go.dev/cl/593681 mentions this issue: internal/gocore: use AttrGoRuntimeType dwarf information to match runtime type

gopherbot avatar Jun 21 '24 14:06 gopherbot

Hi all, I recently implemented a heap reference analysis tool for dwarf type retrieval based on delve. It might be relevant to the problem mentioned in this issue. Welcome to discuss and exchange ideas with us. repo: https://github.com/cloudwego/goref

jayantxie avatar Jul 11 '24 12:07 jayantxie

Change https://go.dev/cl/600796 mentions this issue: internal/gocore: support ClassLocListPtr location class

gopherbot avatar Jul 25 '24 15:07 gopherbot

There are feature requests on debug/gosym, such as #58474, which would be folded into the new x/debug packages. Though the title is about the gocore package, we envision having an API that operates only on a binary (without a core file). These APIs would be similar, with the core version obviously able to provide more complete information using data from the heap, etc. In some ways, these x/debug packages are a v2 of debug/gosym.

vulncheck (cc @zpavlinovic) is particularly interested in this work, as they have custom forks of debug/gosym and rsc.io/goversion/version to provide the extra functionality they need. These are a maintenance burden outside, so they'd like to move to supported packages ASAP, even while the API may not yet be finalized. If we start work in x/exp/debug (or x/debug/exp?) I think it would not be too much trouble to update vulncheck as needed as the API evolves since it is a first-party use.

prattmic avatar Jul 25 '24 19:07 prattmic