go icon indicating copy to clipboard operation
go copied to clipboard

crypto/ed25519: Sign panics with private key allocated in memory-mapped region

Open rkerno opened this issue 3 months ago • 15 comments

Go version

go version go1.25.1 linux/amd64

Output of go env in your module/workspace:

go env
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE='on'
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/rkerno/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/rkerno/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4067861915=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/rkerno/src/m/golib/go.mod'
GOMODCACHE='/home/rkerno/go/pkg/mod'
GONOPROXY='gitlab.projectcatalysts.prv'
GONOSUMDB='gitlab.projectcatalysts.prv'
GOOS='linux'
GOPATH='/home/rkerno/go'
GOPRIVATE='gitlab.projectcatalysts.prv'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/rkerno/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.25.1'
GOWORK='/home/rkerno/src/m/go.work'
PKG_CONFIG='pkg-config'
uname -sr: Linux 6.6.87.2-microsoft-standard-WSL2
/lib/x86_64-linux-gnu/libc.so.6: GNU C Library (Debian GLIBC 2.31-13+deb11u7) stable release version 2.31.

What did you do?

After creating an ed25519.PrivateKey ([]byte) from memory backed by a guarded memory page under my control (not part of Go's memory allocations), ed25519.Sign() panics because the pointer in the slice is not allocated by go.

This is a issue that was introduced after version 1.23.1 (I've just upgraded my project).

This is the function that panics:

//go:linkname internal_weak_runtime_registerWeakPointer weak.runtime_registerWeakPointer
func internal_weak_runtime_registerWeakPointer(p unsafe.Pointer) unsafe.Pointer {
	return unsafe.Pointer(getOrAddWeakHandle(unsafe.Pointer(p)))
}

If I hold my cursor over the variable p, this message is reported:

(unreadable could not find loclist entry at 0x2d2d1 for address 0x483bb3)

The issue is caused by the use of the fips140 private key cache:

func sign(signature []byte, privateKey PrivateKey, message []byte) { k, err := privateKeyCache.Get(&privateKey[0], func() (*ed25519.PrivateKey, error) {

The reason for using a guarded page is so that I have explicit control over the lifetime of the private key, can prevent it from being included in crash dumps, and can ensure the backing memory is scrubbed when the key is no longer required. Introducing an (undocumented) requirement that the private key must be backed by go controlled memory is a regression,

What did you see happen?

fatal error: getWeakHandle on invalid pointer

goroutine 6 gp=0xc0001a2380 m=4 mp=0xc00004f808 [running]: runtime.throw({0x88d1e1?, 0x498497?}) /usr/local/go/src/runtime/panic.go:1094 +0x48 fp=0xc0001bea80 sp=0xc0001bea50 pc=0x484b48 runtime.getWeakHandle(0x750d2c4f0040) /usr/local/go/src/runtime/mheap.go:2639 +0xf2 fp=0xc0001bead0 sp=0xc0001bea80 pc=0x43a9d2 runtime.getOrAddWeakHandle(0x750d2c4f0040) /usr/local/go/src/runtime/mheap.go:2568 +0x2b fp=0xc0001beb20 sp=0xc0001bead0 pc=0x43a76b weak.runtime_registerWeakPointer(0x8b7430?) /usr/local/go/src/runtime/mheap.go:2451 +0x13 fp=0xc0001beb38 sp=0xc0001beb20 pc=0x483bb3 weak.Make... /usr/local/go/src/weak/pointer.go:74 +0x65 fp=0xc0001beb88 sp=0xc0001beb38 pc=0x63e9a5 crypto/internal/fips140cache.(*Cache[...]).Get(0x8bce80, 0x750d2c4f0040, 0xc0001bed88, 0xc0001bed68) /usr/local/go/src/crypto/internal/fips140cache/cache.go:33 +0x76 fp=0xc0001bed00 sp=0xc0001beb88 pc=0x63de36 crypto/ed25519.sign({0xc0000ba040, 0x40, 0x40}, {0x750d2c4f0040, 0x40, 0x40}, {0xc0001bf847, 0x19, 0x19}) /usr/local/go/src/crypto/ed25519/ed25519.go:189 +0x173 fp=0xc0001bee58 sp=0xc0001bed00 pc=0x63d393 crypto/ed25519.Sign({0x750d2c4f0040, 0x40, 0x40}, {0xc0001bf847, 0x19, 0x19}) /usr/local/go/src/crypto/ed25519/ed25519.go:184 +0xde fp=0xc0001beef8 sp=0xc0001bee58 pc=0x63d19ec

What did you expect to see?

This call worked in 1.23.1.

rkerno avatar Sep 17 '25 17:09 rkerno

Didn't know we have such good commit SHA1: ed25519 😄.

This is surely: CL 654096

CC @FiloSottile

mateusz834 avatar Sep 17 '25 17:09 mateusz834

We are just using the weak package, so I think this is a decision for the @golang/runtime team. If storing values in non-Go allocated memory is a supported use case, then weak should support it. If not, our keys are not special.

FiloSottile avatar Sep 17 '25 17:09 FiloSottile

We are just using the weak package, so I think this is a decision for the @golang/runtime team. If storing values in non-Go allocated memory is a supported use case, then weak should support it. If not, our keys are not special.

FYI there is also a runtime.AddCleanup call there, which surely is not going to work with a non-GC managed memory.

mateusz834 avatar Sep 17 '25 17:09 mateusz834

Oof, yeah, this is not a use-case supported by the weak package. The crypto libraries could ask the runtime if the slice is Go-managed, and if not, skip caching, maybe?

Disclaimer: I have very little context on what the crypto packages are caching.

mknyszek avatar Sep 17 '25 17:09 mknyszek

The crypto libraries could ask the runtime if the slice is Go-managed, and if not, skip caching, maybe?

Is there a public API to do so? Just curious.

mateusz834 avatar Sep 17 '25 17:09 mateusz834

FWIW, having something like #70224 (which is not going to be trivial) would make this possible, but there is still a hazard here of unmapping the memory prematurely (before the runtime stops tracking it).

A more narrow version of #70224, like an os package for making memory maps managed by the Go runtime, would be easier (but still not trivial). (If we cheat and make the minimum mmap size 8 KiB and force it to be aligned, that would be MUCH easier.) We could even execute cleanups and clear weak pointers on deterministic destruction, for example.

Is there a public API to do so? Just curious.

No, it would have to be an internal linkname.

mknyszek avatar Sep 17 '25 17:09 mknyszek

I am not sure if this is worth fixing, since a workaround is a 32B alloc+copy, but maybe just turning off the cache in non-fips mode would be fine? If i understand it correctly it is not needed then.

mateusz834 avatar Sep 17 '25 18:09 mateusz834

I prefer to avoid as much as possible divergence between FIPS and non-FIPS mode. If we start accruing small little codepath differences, eventually it will get confusing to reason about behavior when debugging etc.

FiloSottile avatar Sep 17 '25 18:09 FiloSottile

If a pairwise consistency test (PCT), as required by FIPS 140, is meant to verify that a private key and its corresponding public key form a valid keypair, how is that expected to work in this code path where only the private key is provided? I am just trying to use Go's standard library to sign some data using ed25519, and this use case shouldn't mandate FIPS 140 requirements.

rkerno avatar Sep 17 '25 20:09 rkerno

Some other thoughts...allocating a heap object for each signing call is inefficient, plus has the side effect that there would be a new allocation for each call, it would run through the slow path for the check for every signing operation, and each key copy would be added to the cache which would be guaranteed to grow for every signing call until the GC kicks in.

In any case, it doesn't actually make any sense in this context to derive a public key from the private key and then verify it, because all you're proving is your math is working. I understand you probably "have to" do this in order to tick the FIPS 140 box, but in my use case I am actually looking to place extra safeguards around the key itself.

I also don't know how this is supposed to work if you have keys managed externally via CGO - I imagine this use case would also fail.

rkerno avatar Sep 17 '25 21:09 rkerno

Looking back at your original post, it sounds like your main use-case for using a guarded page is to make sure the memory gets scrubbed and the private key doesn't leak in various diagnostics. It sounds like the proposal in #21865 would be helpful to you.

Some other thoughts...allocating a heap object for each signing call is inefficient, plus has the side effect that there would be a new allocation for each call, it would run through the slow path for the check for every signing operation, and each key copy would be added to the cache which would be guaranteed to grow for every signing call until the GC kicks in.

That's true, but my impression is (correct me if I'm wrong) that you're already reusing a pointer to a private key value that's just allocated specially. And again, this isn't my area so correct me if I'm wrong, but given that the API's cache is based on pointers, it seems to expect you to do the same. This doesn't seem totally unreasonable to me, but I don't know the provenance of these private keys are (you're probably reading them from somewhere -- are you re-reading them every time?).

I also don't know how this is supposed to work if you have keys managed externally via CGO - I imagine this use case would also fail.

That's also true. Personally, supporting this in the crypto APIs specifically this doesn't sound unreasonable to me, because there seem to be additional motivations for using non-Go memory in crypto APIs specifically. So for that reason, I am going to punt this back to @FiloSottile for a final decision on what the crypto APIs should support.

From the perspective of the runtime, it's unfortunate but also OK to have these APIs just ask if it's in non-Go memory before checking the cache.

@FiloSottile: to your question about what the weak package supports, it just can't support non-Go memory at all. There's a lot of missing infrastructure to make that work.

I think @rkerno's use-case is another good reason to have some kind Go runtime managed mmap-style API. You'd get some chunk of mapped memory, you'd be allowed to do stuff to it like mlock or whatever it is you need, and the runtime will track pointers into it and free it if it's no longer in use. With this tracking would also come all the infrastructure needed to make things like weak pointers and cleanups work out of the box.

mknyszek avatar Sep 17 '25 22:09 mknyszek

Oh and, 'doh, a possible workaround here is if you're managing the memory yourself anyway, you can probably get away with just making a relatively large pointer-free allocation (>32 KiB) and calling the kernel APIs you need on that. That (1) has all the tracking you need to interact with weak pointers and cleanups and (2) today, will be page-aligned. To make it a little less fragile you can probably just add a check to make sure that the allocation's pointer is indeed page-aligned, and crash if it isn't. That's at least safe (and, funny enough, doesn't even require unsafe!). You just have to make sure that if it's freed, you also undo any weird memory states (which you can do by setting a cleanup on it).

The problem here of course is if it isn't page-aligned in the future, so it's technically fragile and I'll have to preface this with "use at your own risk." But even if we have an mmap-style API that guarantees the allocation of whole pages that you can manipulate as you desire, it may very well just end up making regular allocations under the hood.

mknyszek avatar Sep 17 '25 23:09 mknyszek

Guys, I think you are over complicating things here. Use of the cache should require the FIPS feature flag to be set - I am not using it!

The documentation states:

FIPS 140-3 mode

The run-time fips140 GODEBUG option controls whether the Go Cryptographic Module operates in FIPS 140-3 mode. It defaults to off. It can’t be changed after the program has started.

When operating in FIPS 140-3 mode (the fips140 GODEBUG setting is on):

The Go Cryptographic Module automatically performs an integrity self-check at init time, comparing the checksum of the module’s object file computed at build time with the symbols loaded in memory.

All algorithms perform known-answer self-tests according to the relevant FIPS 140-3 Implementation Guidance, either at init time, or on first use.

Pairwise consistency tests are performed on generated cryptographic keys. Note that this can cause a slowdown of up to 2x for certain key types, which is especially relevant for ephemeral keys.

rkerno avatar Sep 18 '25 06:09 rkerno

An interesting intersection I just came across based on the suggestion by @rkerno .

in my go.mod I set

godebug (
	fips140=only
)

This in turn caused my Solana-Go client to fail to connect since X25519 is explicity disallowed in FIPS-140 mode.

Now, changing the setting to:

godebug (
	fips140=on
)

Allows connection to be established again, however I am right back to my signature call causing the dreaded fatal error: getWeakHandle on invalid pointer

Here is my call stack: Image

Here is the code from Solana-go's implementation in keys.go

func (k PrivateKey) Sign(payload []byte) (Signature, error) {
	if err := k.Validate(); err != nil {
		return Signature{}, err
	}
	p := ed25519.PrivateKey(k)
	signData, err := p.Sign(crypto_rand.Reader, payload, crypto.Hash(0))
	if err != nil {
		return Signature{}, err
	}

	var signature Signature
	copy(signature[:], signData)

	return signature, err
}

In my debugger, p is valid. I realize the discussion around the PublicKey not being passed in is 100% relevant. I'm just trying to add as much detail here as humanly possible.

This is in go 1.25.1.

I even just now did a go get -u -t ./... Same results, here is my go.mod

module GOLANGISMYFRIEND

go 1.25.1

//fips140=only will cause the Solana Go SDK to fail to connect to the endpoint due to X25519 being disallowed in FIPS-140 mode
//See

godebug fips140=on

require (
	github.com/gagliardetto/solana-go v1.13.0
	github.com/golang-jwt/jwt/v5 v5.3.0
	github.com/google/uuid v1.6.0
	github.com/joho/godotenv v1.5.1
	github.com/natefinch/lumberjack v2.0.0+incompatible
	go.etcd.io/bbolt v1.4.3
	google.golang.org/grpc v1.75.1
	google.golang.org/protobuf v1.36.9
)

require (
	contrib.go.opencensus.io/exporter/stackdriver v0.13.4 // indirect
	filippo.io/edwards25519 v1.1.0 // indirect
	github.com/andres-erbsen/clock v0.0.0-20160526145045-9e14626cd129 // indirect
	github.com/aybabtme/rgbterm v0.0.0-20170906152045-cc83f3b3ce59 // indirect
	github.com/benbjohnson/clock v1.3.5 // indirect
	github.com/blendle/zapdriver v1.3.1 // indirect
	github.com/buger/jsonparser v1.1.1 // indirect
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/dfuse-io/logging v0.0.0-20201110202154-26697de88c79 // indirect
	github.com/fatih/color v1.18.0 // indirect
	github.com/gagliardetto/binary v0.8.0 // indirect
	github.com/gagliardetto/treeout v0.1.4 // indirect
	github.com/go-delve/delve v1.25.0 // indirect
	github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e // indirect
	github.com/golang/protobuf v1.5.4 // indirect
	github.com/gorilla/rpc v1.2.1 // indirect
	github.com/gorilla/websocket v1.5.3 // indirect
	github.com/json-iterator/go v1.1.12 // indirect
	github.com/klauspost/compress v1.18.0 // indirect
	github.com/logrusorgru/aurora v2.0.3+incompatible // indirect
	github.com/mattn/go-colorable v0.1.14 // indirect
	github.com/mattn/go-isatty v0.0.20 // indirect
	github.com/mitchellh/go-testing-interface v1.14.1 // indirect
	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
	github.com/modern-go/reflect2 v1.0.2 // indirect
	github.com/mostynb/zstdpool-freelist v0.0.0-20201229113212-927304c0c3b1 // indirect
	github.com/mr-tron/base58 v1.2.0 // indirect
	github.com/streamingfast/logging v0.0.0-20250918142248-ac5a1e292845 // indirect
	github.com/teris-io/shortid v0.0.0-20201117134242-e59966efd125 // indirect
	github.com/tidwall/gjson v1.9.3 // indirect
	github.com/tidwall/match v1.1.1 // indirect
	github.com/tidwall/pretty v1.2.0 // indirect
	go.mongodb.org/mongo-driver v1.17.4 // indirect
	go.opencensus.io v0.22.5 // indirect
	go.uber.org/atomic v1.11.0 // indirect
	go.uber.org/multierr v1.11.0 // indirect
	go.uber.org/ratelimit v0.3.1 // indirect
	go.uber.org/zap v1.27.0 // indirect
	golang.org/x/crypto v0.42.0 // indirect
	golang.org/x/net v0.44.0 // indirect
	golang.org/x/sys v0.36.0 // indirect
	golang.org/x/term v0.35.0 // indirect
	golang.org/x/text v0.29.0 // indirect
	golang.org/x/time v0.13.0 // indirect
	golift.io/rotatorr v0.0.0-20240723172740-cb73b9c4894c // indirect
	google.golang.org/genproto/googleapis/rpc v0.0.0-20250922171735-9219d122eba9 // indirect
)

stunney avatar Sep 25 '25 02:09 stunney