Dragonfly2
syncProgress hangs when grpc context closes
Bug report:
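The download request occasionally hangs with full CPU usage because syncProgress
runs into a busy for-loop when the context closes. In the listing below, the
ctx.Done() case (line 134) has no body, so once the context is done the select
returns immediately on every iteration and the loop never exits.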
pprof:
File: agent
Type: cpu
Time: Aug 1, 2024 at 11:53pm (CST)
Duration: 30.19s, Total samples = 59.73s (197.82%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 57010ms, 95.45% of 59730ms total
Dropped 122 nodes (cum <= 298.65ms)
Showing top 10 nodes out of 21
flat flat% sum% cum cum%
20670ms 34.61% 34.61% 20670ms 34.61% runtime.procyield
12200ms 20.43% 55.03% 38720ms 64.83% runtime.lock2
6130ms 10.26% 65.29% 56600ms 94.76% runtime.selectgo
4170ms 6.98% 72.28% 6550ms 10.97% runtime.unlock2
3580ms 5.99% 78.27% 3580ms 5.99% runtime.osyield
3460ms 5.79% 84.06% 3460ms 5.79% runtime.futex
2590ms 4.34% 88.40% 41330ms 69.19% runtime.sellock
2460ms 4.12% 92.52% 59590ms 99.77% d7y.io/dragonfly/v2/client/daemon/peer.(*fileTask).syncProgress
1100ms 1.84% 94.36% 7670ms 12.84% runtime.selunlock
650ms 1.09% 95.45% 650ms 1.09% runtime.cheaprand (inline)
(pprof) list syncProgress
Total: 59.73s
ROUTINE ======================== d7y.io/dragonfly/v2/client/daemon/peer.(*fileTask).syncProgress in /Users/root/go/pkg/mod/git.garena.com/shopee/search_recommend/engine/data-deliver/third-party/dragonfly2/[email protected]/client/daemon/peer/peertask_file.go
2.46s 59.59s (flat, cum) 99.77% of Total
. . 123:func (f *fileTask) syncProgress() {
. . 124: defer f.span.End()
. . 125: for {
170ms 56.78s 126: select {
120ms 120ms 127: case <-f.peerTaskConductor.successCh:
. . 128: f.storeToOutput()
. . 129: return
40ms 40ms 130: case <-f.peerTaskConductor.failCh:
. . 131: f.span.RecordError(fmt.Errorf(f.peerTaskConductor.failedReason))
. . 132: f.sendFailProgress(f.peerTaskConductor.failedCode, f.peerTaskConductor.failedReason)
. . 133: return
1.98s 2.48s 134: case <-f.ctx.Done():
150ms 170ms 135: case piece := <-f.pieceCh:
. . 136: if piece.Finished {
. . 137: continue
. . 138: }
. . 139: pg := &FileTaskProgress{
. . 140: State: &ProgressState{
https://github.com/dragonflyoss/Dragonfly2/blob/97f21cfbf5f37f131c4f34d6f3efb0410d1447f5/client/daemon/peer/peertask_file.go#L123-L162
Expected behavior:
syncProgress should return when the gRPC context closes.
How to reproduce it:
Environment:
- Dragonfly version: v2.1.0-4349e27
- OS: ubuntu
- Kernel (e.g. uname -a):
- Others: