service/dap: fix closing closed channel bug
Our previous implementation of waiting on resumeRequestLoop was not safe, because the channel could be closed more than once. This change fixes the bug by instead using two channels to track the two conditions we are managing: whether the command has started running and whether the command has completed (on error or otherwise).
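For illustration, here is a minimal standalone sketch of the two-channel idea (hypothetical names, not the actual delve code): each condition gets its own channel, and each channel is closed by exactly one goroutine.

```go
package main

import (
	"fmt"
	"time"
)

// runCommand sketches the two-channel approach: one channel per condition,
// each closed by exactly one goroutine.
func runCommand(doCommand func(notifyRunning func()) error) {
	commandRunning := make(chan struct{}) // closed when the command signals it has started
	commandDone := make(chan struct{})    // closed when the command returns (on error or otherwise)

	go func() {
		defer close(commandDone) // only this goroutine closes commandDone
		_ = doCommand(func() {
			close(commandRunning) // only the notify callback closes commandRunning
		})
	}()

	// The request loop waits on whichever condition it needs; neither channel
	// can be closed twice, unlike a single shared channel.
	select {
	case <-commandRunning:
		fmt.Println("command is running")
	case <-commandDone:
		fmt.Println("command returned before signaling that it started")
	}
	<-commandDone // always wait for completion before returning
}

func main() {
	runCommand(func(notifyRunning func()) error {
		notifyRunning()                   // signal that setup completed
		time.Sleep(10 * time.Millisecond) // pretend to run the command
		return nil
	})
}
```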
Fixes go-delve/delve#2757
So what changed? The auto-resume logic added signal redirection with an additional goroutine, so now the new goroutine could close the channel when Command signaled setup completion, while the per-request goroutine closed the same channel at the same time because Command returned quickly? Do I have that right?
Yes, this does happen because of the auto-resume logic. Since we add a separate goroutine to wait on resumeNotify, when we actually call s.debugger.Command, the original goroutine and that new goroutine could both try to close the channel at the same time.
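A self-contained repro of that failure mode (just the shape of the race, not delve's code): two goroutines racing to close the same channel will panic with "close of closed channel".

```go
package main

import "sync"

func main() {
	allowNextStateChange := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(2)

	// Stands in for the auto-resume goroutine waiting on resumeNotify.
	go func() {
		defer wg.Done()
		close(allowNextStateChange)
	}()

	// Stands in for the per-request goroutine unblocking the request loop
	// because Command returned quickly.
	go func() {
		defer wg.Done()
		close(allowNextStateChange)
	}()

	// Whichever close runs second panics: "close of closed channel".
	wg.Wait()
}
```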
The channel that is closed from two different goroutines is allowNextStateChange. At one point we discussed that, with the introduction of changeStateMu, we might not need to block the request loop on allowNextStateChange at all. Have you considered that approach instead of adding another channel?
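For what it's worth, here is a generic sketch (not delve's changeStateMu logic, and not necessarily what this comment proposes) of how a mutex-guarded flag can make closing a single shared channel idempotent, so a second goroutine reaching the close is harmless:

```go
package main

import (
	"fmt"
	"sync"
)

// stateGate sketches a mutex-guarded, close-once signal.
type stateGate struct {
	mu     sync.Mutex
	closed bool
	ch     chan struct{}
}

func newStateGate() *stateGate {
	return &stateGate{ch: make(chan struct{})}
}

// signal closes the channel at most once, however many goroutines call it.
func (g *stateGate) signal() {
	g.mu.Lock()
	defer g.mu.Unlock()
	if !g.closed {
		g.closed = true
		close(g.ch)
	}
}

func main() {
	g := newStateGate()
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.signal() // safe even if both goroutines get here
		}()
	}
	wg.Wait()
	<-g.ch
	fmt.Println("state change allowed exactly once")
}
```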
Just curious about the status of this, @polinasok @suzmue.
Should this PR still be considered for the next release?
@derekparker I haven't looked at this in a bit, so I will go ahead and rebase it.
I'm not sure it is a high priority that we need to get in ASAP. I don't think we have seen any reports of users actually encountering this bug, so I think it is OK to push it to the next release so @polinasok can come back and review it.
@suzmue this still needs a rebase. Would we still like to try targeting this for 1.8.4?
@derekparker Thanks for bumping this. I rebased the change, but need to take a little bit of time to get the context back on this change to address the remaining comments. Will do this by the end of the week.
We just hit this again. I will take a look shortly.
@suzmue @polinasok is this change still active or should I close it out?