Invalid memory address or nil pointer dereference
I'm seeing this error in production in combination with skipper. Unfortunately I wasn't able to get the exact request yet. Do you maybe have some pointers (pun intended) how to debug this?
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xa9dcd9]
goroutine 57768117 [running]:
github.com/yookoala/gofast.(*client).readResponse.func1(0xc00ac4e000, 0xc00a84aea0, 0xc00a81d6c0, 0xc004484540)
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:211 +0x29
created by github.com/yookoala/gofast.(*client).readResponse
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:208 +0xad
Relevant code segment: https://github.com/yookoala/gofast/blob/v0.4.0/client.go#L199-L239
gofast is being used here https://github.com/zalando/skipper/blob/c50a31135e44b928cf47f572c677e827d7a942b3/proxy/fastcgi/fastcgi.go#L62
This seems strange. Does it happen occasionally or does it happen to every requests?
The file did not show how you used your RoundTripper created by NewRoundTripper. Is it use-and-throw or do you reuse it multiple times?
As you can see, you'd close the client after each RoundTripper.RoundTrip() call. If you reuse that RoundTripper, you'd probably be using a handler on an already closed client connection, which would become nil after Client.Close().
The design of gofast is to have the ClientFactory reused, not the Client.
I'll have a more friendly error for this situation in the future.
@yookoala Thanks for your thorough investigation on this issue. I'm the author of the gofast implementation in Skipper. I will try to create a PR to include your suggestion.
I checked, and the RoundTripper is actually created for every single request. It is never reused.
Hmm... I'll do some further check.
I'm not sure if it's related, but before the segmentation violation there are sometimes timeouts. The skipper instance runs in Kubernetes where the memory usage keeps rising over time. I also see sometimes out of memory kills (OOM). There might be a memory leak somewhere. I did not profile it yet, hope to get to it next week.
[APP]time="2020-09-10T08:48:14Z" level=error msg="Failed to roundtrip to fastcgi: gofast: error stream from application process gofast: timeout or canceled"
[APP]time="2020-09-10T08:48:14Z" level=error msg="error while proxying after 289.780249ms, route kube_some_route with backend lb fastcgi://10.0.84.101:9000, status code 500: dialing failed false: gofast: error stream from application process gofast: timeout or canceled, remote host: some.ip.address, request: \"GET /search/autocomplete HTTP/1.1\", user agent: \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/some.ip.address Safari/537.36\""
some.ip.address - - [10/Sep/2020:08:48:14 +0000] "GET /search/autocomplete?query=mill%C3%A9si HTTP/1.1" 500 22 "https://some-domain" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/some.ip.address Safari/537.36" 289 some-domain - -
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xa9dcd9]
goroutine 14961651 [running]:
github.com/yookoala/gofast.(*client).readResponse.func1(0xc007976000, 0xc006f16e80, 0xc006c49400, 0xc006357bc0)
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:211 +0x29
created by github.com/yookoala/gofast.(*client).readResponse
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:208 +0xad
I'm not working often in Golang, please forgive me if what I say makes no sense. I've been looking for potential memory that is not cleaned up.
-
Maybe
c.ids.Release(reqID)is not called in certain error cases The ID is allocated and later released. What if there is an error like this, will the ID be released back in the pool? -
Looking at one Skipper pod I see the typical saw-tooth graph of a memory leak. I'm not sure if this has anything to do with the gofast usage or something in Skipper. If you have a tip how I can find out more please let me know.
I can see goroutines increasing:
And live objects:

The design of gofast is to have the ClientFactory reused, not the Client.
@yookoala mentioned that the ClientFactory should be reused in the comment.
However, if I read it right, the ClientFactory is created on every request.
@yookoala @ruudk Should this not be changed in Skipper to only create the ClientFactory once? See this line in skipper.
I think the problem is with creating new request IDs. When the ClientFactory is not reused, I guess that there are new ID pools created on every request?
Explanation
When getting a diff between two goroutine dumps I see many goroutines in newIDs.func1:
goroutine 75451 [chan send, 10 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc001323020, 0xffff)
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:100 +0x78
goroutine 68511 [chan send, 11 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc002043500, 0xffff)
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:100 +0x78
goroutine 53003 [chan send, 12 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc001f52f60, 0xffff)
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:100 +0x78
goroutine 120002 [chan send, 6 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc001a0cba0, 0xffff)
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:100 +0x78
goroutine 169080 [chan send, 2 minutes]:
github.com/yookoala/gofast.newIDs.func1(0xc000039bc0, 0xffff)
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:102 +0x43
created by github.com/yookoala/gofast.newIDs
/go/pkg/mod/github.com/yookoala/[email protected]/client.go:100 +0x78
How to dump
- Configure
-enable-profilefor Skipper - Proxy to a Skipper pod
kubectl port-forward skipper-ingress-pzx2b 9911:9911 - Go to http://127.0.0.1:9911/debug/pprof/ and download a dump
full goroutine stack dump, wait a few minutes, and download another dump - Install
goroutine-inspectand run it$GOPATH/bin/goroutine-inspect - Make a diff, (l = only in original, c = in both, r = in original 2)
original = load("goroutine.dump") original2 = load("goroutine2.dump") l, c, r = original.diff(original2) c.show()
Skipper creates a new client for every request. This is because Skipper handles a lot of different traffic to different servers. So only when the requests comes in, it knows where to send it to. It might be something that we can optimize in the future, but it is what it is right now.
We found out that when gofast creates a new client, it also spawns a goroutine that handles the ID pool. These are the leaking goroutines.
When Skipper sends the request to gofast, it never calls client.Close(). We need to fix that in Skipper for sure, but even if it would properly close the client, gofast wouldn't close the idpool. Therefore, we created a PR to address that issue.
When that is merged and tagged, we can create a PR for Skipper to properly call client.Close() when the request is done.
@adri: The memory leak issue is now tracking at #69.
I think I can stably reproduce this "Invalid memory address or nil pointer dereference" issue with this test:
func TestClient_close_before_readRequest(t *testing.T) {
// Test closing client before writeRequest
//
// Note: closing connection could cause client.conn nil
// writing after close could cause error
// create a temp dummy fastcgi application server
p, err := newAppServer("client.test.sock", func(w http.ResponseWriter, r *http.Request) {
t.Logf("accessing FastCGI process")
fmt.Fprintf(w, "hello world")
})
if err != nil {
t.Fatalf("Unable to start server: %s", err.Error())
}
defer p.Close()
// create reusable client factory
c, err := gofast.SimpleClientFactory(
gofast.SimpleConnFactory(p.Network(), p.Address()),
)()
if err != nil {
t.Fatalf("Unable for client factory to connect to server: %s", err.Error())
}
// Mock creating a gofast.Request that still
// pending to write a body.
pr, pw := io.Pipe()
r, err := http.NewRequest("POST", "/add", pr)
r.Header.Add("Content-Length", "11")
if err != nil {
t.Errorf("unexpected error: %#v", err.Error())
}
writeBody := func() {
c.Close()
pw.Write([]byte("hello world"))
}
req := gofast.NewRequest(r)
// handle the result
go writeBody()
log.Printf("checkpoint 1")
resp, err := c.Do(req)
if err != nil {
t.Fatalf("web server: unable to connect to process request: %s", err.Error())
return
}
w := httptest.NewRecorder()
errBuffer := new(bytes.Buffer)
resp.WriteTo(w, errBuffer)
if want, have := "", w.Body.String(); want != have {
t.Errorf("expected %v, got %v", want, have)
}
}
Still working on way to resolve it.