go
go copied to clipboard
x/tools/gopls: Possible deadlock occurs between `*cmdClient.filesMu` and `*Server.diagnosticsMu`
gopls version
golang/tools tag: gopls/v0.9.5
go env
go 1.18
What did you do?
In my use, it only happened once that there was no response for a long time, so I started to investigate.
I added pprof for gopls, and added the replacement into gopls/go.mod:replace golang.org/x/tools => ../../tools.
Then I found out a deadlock in references, here is the full stack:(I mark the **Step to highlight the key layer. Goroutine 1 and 38088 provide key information.)
goroutine 35138 [running]:
runtime/pprof.writeGoroutineStacks({0x1c86e20, 0xc0b59d22a0})
/usr/local/go/src/runtime/pprof/pprof.go:694 +0x70
runtime/pprof.writeGoroutine({0x1c86e20?, 0xc0b59d22a0?}, 0x0?)
/usr/local/go/src/runtime/pprof/pprof.go:683 +0x2b
runtime/pprof.(*Profile).WriteTo(0x19a12e0?, {0x1c86e20?, 0xc0b59d22a0?}, 0xc?)
/usr/local/go/src/runtime/pprof/pprof.go:332 +0x14b
net/http/pprof.handler.ServeHTTP({0xc00305bf01, 0x9}, {0x1c8c9c0, 0xc0b59d22a0}, 0x736f686c61636f6c?)
/usr/local/go/src/net/http/pprof/pprof.go:253 +0x4a5
net/http/pprof.Index({0x1c8c9c0?, 0xc0b59d22a0}, 0xc0243ce600)
/usr/local/go/src/net/http/pprof/pprof.go:371 +0x13e
net/http.HandlerFunc.ServeHTTP(0xa6491b8?, {0x1c8c9c0?, 0xc0b59d22a0?}, 0x100eb05?)
/usr/local/go/src/net/http/server.go:2084 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc00305bf0b?, {0x1c8c9c0, 0xc0b59d22a0}, 0xc0243ce600)
/usr/local/go/src/net/http/server.go:2462 +0x149
net/http.serverHandler.ServeHTTP({0xc05e4fbdd0?}, {0x1c8c9c0, 0xc0b59d22a0}, 0xc0243ce600)
/usr/local/go/src/net/http/server.go:2916 +0x43b
net/http.(*conn).serve(0xc05c142820, {0x1c8d188, 0xc0004b7dd0})
/usr/local/go/src/net/http/server.go:1966 +0x5d7
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:3071 +0x4db
goroutine 1 [semacquire]:
sync.runtime_SemacquireMutex(0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/sema.go:71 +0x25
sync.(*Mutex).lockSlow(0xc0004b8518)
/usr/local/go/src/sync/mutex.go:162 +0x165
sync.(*Mutex).Lock(...)
/usr/local/go/src/sync/mutex.go:81
golang.org/x/tools/internal/lsp.(*Server).mustPublishDiagnostics(0xc0004b8480, {0xc03f258af0, 0x63})
**(Step 2.a. `AddFile` pending on `s.diagnosticsMu`)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:392 +0x67
golang.org/x/tools/internal/lsp.(*Server).processModifications(0xc0004b8480, {0x1c8d188, 0xc0005a26f0}, {0xc08ad274f0, 0x1, 0x1}, 0x0, 0xc040be41e0)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/text_synchronization.go:313 +0x4dd
golang.org/x/tools/internal/lsp.(*Server).didModifyFiles(0xc0004b8480, {0x1c8d188, 0xc0005a26f0}, {0xc08ad274f0?, 0x1, 0x1}, 0x0)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/text_synchronization.go:225 +0x58d
golang.org/x/tools/internal/lsp.(*Server).didOpen(0xc0004b8480, {0x1c8d188, 0xc0005a26f0}, 0xc030104e40)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/text_synchronization.go:94 +0x27a
golang.org/x/tools/internal/lsp.(*Server).DidOpen(0xc00022fe00?, {0x1c8d188?, 0xc0005a26f0?}, 0xc03f258a80?)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/server_gen.go:92 +0x25
golang.org/x/tools/internal/lsp/cmd.(*connection).AddFile(0xc00000e690, {0x1c8d188, 0xc0005a26f0}, {0xc03f258a80, 0x63})
**(Step 1.a. `AddFile` locked `c.Client.filesMu`)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cmd/cmd.go:581 +0x208
golang.org/x/tools/internal/lsp/cmd.(*references).Run.func1(0x0?, 0xc038ec17d8, 0xc038ec17f8, 0x100ee67?, 0xc00000e690, {0x1c8d188, 0xc0005a26f0}, 0xc000572cd0)
**(I add more "files" for "AddFile" function here to reproduce the deadlock) /Users/lyc/goworkspace/src/github.com/golang/tools/gopls/internal/lsp/cmd/references.go:122 +0x8e5
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/tool/tool.go:168 +0x57f
golang.org/x/tools/internal/lsp/cmd.(*Application).Run(0xc000396640, {0x1c8d118?, 0xc00003c0b0?}, {0xc0000321c0?, 0x2, 0x2})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cmd/cmd.go:227 +0x408
golang.org/x/tools/internal/tool.Run({0x1c8d118, 0xc00003c0b0}, 0xc0005a4180, {0x1c8dce8?, 0xc000396640}, {0xc0000321c0, 0x2, 0x2})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/tool/tool.go:168 +0x57f
golang.org/x/tools/internal/tool.Main({0x1c8d118, 0xc00003c0b0}, {0x1c8dce8?, 0xc000396640?}, {0xc0000321c0, 0x2, 0x2})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/tool/tool.go:92 +0x145
main.main()
/Users/lyc/goworkspace/src/github.com/golang/tools/gopls/main.go:30 +0xa5
goroutine 129 [semacquire]:
sync.runtime_SemacquireMutex(0xc0002e9560?, 0xd0?, 0x0?)
/usr/local/go/src/runtime/sema.go:71 +0x25
sync.(*Mutex).lockSlow(0xc0004b8518)
/usr/local/go/src/sync/mutex.go:162 +0x165
sync.(*Mutex).Lock(...)
/usr/local/go/src/sync/mutex.go:81
golang.org/x/tools/internal/lsp.(*Server).storeDiagnostics(0xc0004b8480, {0x1c95a10, 0xc0000c0500}, {0xc0002b4190, 0x50}, 0x1?, {0x21e7d28, 0x0, 0x0?})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:414 +0xc5
golang.org/x/tools/internal/lsp.(*Server).diagnose(0xc0004b8480, {0x1c8d8c0, 0xc00006e900}, {0x1c95a10, 0xc0000c0500}, 0x0)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:239 +0x706
golang.org/x/tools/internal/lsp.(*Server).diagnoseDetached(0xc0000d82c0?, {0x1c95a10, 0xc0000c0500})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:120 +0x78
golang.org/x/tools/internal/lsp.(*Server).addFolders.func3()
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/general.go:286 +0x45
created by golang.org/x/tools/internal/lsp.(*Server).addFolders
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/general.go:285 +0xafb
goroutine 119 [IO wait]:
internal/poll.runtime_pollWait(0xa649b18, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc0000b5c80?, 0x4?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Accept(0xc0000b5c80)
/usr/local/go/src/internal/poll/fd_unix.go:614 +0x22c
net.(*netFD).accept(0xc0000b5c80)
/usr/local/go/src/net/fd_unix.go:172 +0x35
net.(*TCPListener).accept(0xc0003fc6a8)
/usr/local/go/src/net/tcpsock_posix.go:139 +0x28
net.(*TCPListener).Accept(0xc0003fc6a8)
/usr/local/go/src/net/tcpsock.go:288 +0x3d
net/http.(*Server).Serve(0xc00019a1c0, {0x1c8c7b0, 0xc0003fc6a8})
/usr/local/go/src/net/http/server.go:3039 +0x385
net/http.(*Server).ListenAndServe(0xc00019a1c0)
/usr/local/go/src/net/http/server.go:2968 +0x7d
created by golang.org/x/tools/internal/lsp/cmd.init.0
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cmd/pprof.go:16 +0xd8
goroutine 38135 [semacquire]:
sync.runtime_Semacquire(0xc030070d40?)
/usr/local/go/src/runtime/sema.go:56 +0x25
sync.(*WaitGroup).Wait(0xc022d4cf38?)
/usr/local/go/src/sync/waitgroup.go:136 +0x52
golang.org/x/tools/internal/lsp.(*Server).diagnoseSnapshots(0xc0004b8480, 0xc03b836f10?, 0x0)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:133 +0x1f6
golang.org/x/tools/internal/lsp.(*Server).processModifications.func1()
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/text_synchronization.go:319 +0x37
created by golang.org/x/tools/internal/lsp.(*Server).processModifications
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/text_synchronization.go:318 +0x3db
goroutine 17230 [semacquire, 1 minutes]:
sync.runtime_Semacquire(0x10?)
/usr/local/go/src/runtime/sema.go:56 +0x25
sync.(*WaitGroup).Wait(0x1778ee6?)
/usr/local/go/src/sync/waitgroup.go:136 +0x52
golang.org/x/tools/internal/lsp/cache.(*snapshot).destroy(0xc0000c0500, {0x1a39021, 0x16})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cache/snapshot.go:206 +0x92
golang.org/x/tools/internal/lsp/cache.(*View).destroy.func1()
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cache/snapshot.go:200 +0x6d
created by golang.org/x/tools/internal/lsp/cache.(*View).destroy
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cache/snapshot.go:198 +0xc5
goroutine 38357 [IO wait]:
internal/poll.runtime_pollWait(0xa6491b8, 0x72)
/usr/local/go/src/runtime/netpoll.go:302 +0x89
internal/poll.(*pollDesc).wait(0xc05e502080?, 0xc05e4fbde1?, 0x0)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:83 +0x32
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
internal/poll.(*FD).Read(0xc05e502080, {0xc05e4fbde1, 0x1, 0x1})
/usr/local/go/src/internal/poll/fd_unix.go:167 +0x25a
net.(*netFD).Read(0xc05e502080, {0xc05e4fbde1?, 0xe?, 0x0?})
/usr/local/go/src/net/fd_posix.go:55 +0x29
net.(*conn).Read(0xc05de7f910, {0xc05e4fbde1?, 0x1785745?, 0xc0301c96c0?})
/usr/local/go/src/net/net.go:183 +0x45
net/http.(*connReader).backgroundRead(0xc05e4fbdd0)
/usr/local/go/src/net/http/server.go:672 +0x3f
created by net/http.(*connReader).startBackgroundRead
/usr/local/go/src/net/http/server.go:668 +0xca
goroutine 38088 [semacquire]:
sync.runtime_SemacquireMutex(0xc0248af928?, 0x9a?, 0xc0248af978?)
/usr/local/go/src/runtime/sema.go:71 +0x25
sync.(*Mutex).lockSlow(0xc00022fe30)
/usr/local/go/src/sync/mutex.go:162 +0x165
sync.(*Mutex).Lock(...)
/usr/local/go/src/sync/mutex.go:81
golang.org/x/tools/internal/lsp/cmd.(*cmdClient).PublishDiagnostics(0xc00022fe00, {0x1c8d188, 0xc031a5a8d0}, 0xc031a5aa20)
**(Step 2.b. `PublishDiagnostics` pending on `c.filesMu`)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cmd/cmd.go:511 +0xe5
golang.org/x/tools/internal/lsp.(*Server).publishDiagnostics(0xc0004b8480, {0x1c8d188, 0xc031dd7b90}, 0x0, {0x1c95a10, 0xc0156b5e00})
**(Step 1.b. `publishDiagnostics` locked `s.diagnosticsMu`)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:599 +0x58e
golang.org/x/tools/internal/lsp.(*Server).diagnoseSnapshot(0xc0004b8480, {0x1c95a10, 0xc0156b5e00}, {0xc016be8330, 0x1, 0x1}, 0xa7?)
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:151 +0x22b
golang.org/x/tools/internal/lsp.(*Server).diagnoseSnapshots.func1({0x1c95a10?, 0xc0156b5e00?}, {0xc016be8330?, 0xc0000d2820?, 0xc0233a4fb8?})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:130 +0x85
created by golang.org/x/tools/internal/lsp.(*Server).diagnoseSnapshots
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/diagnostics.go:128 +0x90
goroutine 38136 [semacquire]:
sync.runtime_Semacquire(0x10?)
/usr/local/go/src/runtime/sema.go:56 +0x25
sync.(*WaitGroup).Wait(0x21bfca0?)
/usr/local/go/src/sync/waitgroup.go:136 +0x52
golang.org/x/tools/internal/lsp/cache.(*snapshot).destroy(0xc0156b5e00, {0x1a39021, 0x16})
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cache/snapshot.go:206 +0x92
golang.org/x/tools/internal/lsp/cache.(*View).destroy.func1()
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cache/snapshot.go:200 +0x6d
created by golang.org/x/tools/internal/lsp/cache.(*View).destroy
/Users/lyc/goworkspace/src/github.com/golang/tools/internal/lsp/cache/snapshot.go:198 +0xc5
The four steps marked above:
Step 1.a. AddFile locked c.Client.filesMu & Step 1.b. publishDiagnostics locked s.diagnosticsMu
Step 2.a. AddFile pending on s.diagnosticsMu & Step 2.b. PublishDiagnostics pending on c.filesMu
The deadlock happend in goroutine 1 and 38088. Goroutine 38088 created from go1: AddFile -> DidOpen -> didOpen -> didModifyFiles -> processModifications:318. BYW goroutine 38088 is created in last AddFile, but goroutine 1 is pending in currrent AddFile.
The original tools/internal/lsp/cmd/references.go has more than one AddFile call both from args[0] and locations below. So I add mach more different locations to recurrent this. (Maybe there are other better ways?)
I think such a problem can happen just by calling this AddFile many times quickly.
In summary, before the goroutine 38088 starts, although the go is used to start async to ensure that the current AddFile releases fileMu through defer, but the new process has to grab fileMu again later. So if another AddFile starts soon, it will deadlock.
What did you expect to see?
Get result quickly.
What did you see instead?
Endless waiting.
Editor and settings
Logs
Thank you for the detailed analysis! We will fix.
Aside: do you use the command-line interface often? For what use-case, if you don't mind sharing?
I developed a tool to scan dead code based on gopls - definition , implementation and references. But I am very sorry that it cannot be open source at present. Maybe it can be open source in the future.
As for the case, it only happens once for me. Maybe we can infer from the code whether it is possible to have this problem? As you know, the deadlock problem is not easy to reproduce, which is why I mentioned the possibility in this issue topic.
Thanks.
Change https://go.dev/cl/446856 mentions this issue: gopls/internal/lsp/cmd: fix deadlock when opening a file