Fixed Concurrency Issue: channel communication of `RetryDo`
Summary
This PR fixes a concurrency-related bug in your codebase.
Description
While triaging your project, our bug-fixing tool generated the following message(s):
In File: retry.go, In function RetryDo$1$1, there is a send operation. But no matching receive operation is found on that channel from any other goroutine.
After triaging, we found that there is a send operation on the channel panicSignal in the deferred function. In the RetryDo() method, there is a select statement with two cases: a receive on timer.C and a receive on panicSignal. This looks correct except for the following corner case.
The timer may fire on timer.C at the same moment that the send on panicSignal becomes ready. In that case, both receive cases can proceed and select chooses one of them at random. From the Go language specification:
If one or more of the communications can proceed, a single one that can proceed is chosen via a uniform pseudo-random selection. Otherwise, if there is a default case, that case is chosen. If there is no default case, the "select" statement blocks until at least one of the communications can proceed.
So, if select chooses the timer.C case, nothing ever receives from panicSignal. Because the channel is unbuffered, the send in the deferred function needs another goroutine to receive from it, so it blocks forever and the goroutine is leaked.
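The blocked send can be reproduced with a small sketch. The code below is a hypothetical reduction of the pattern, not the actual RetryDo from retry.go: the names leakyWait and countLeaked, and the delays, are assumptions made for illustration. The worker is made to panic after the timeout has already fired, which stands in for the case where select picks timer.C.

```go
package main

import (
	"runtime"
	"time"
)

// leakyWait is a hypothetical stand-in for the pattern described above.
// The worker reports a panic on an unbuffered channel from a deferred
// function, while the caller waits on either that channel or a timer.
func leakyWait(timeout, workDelay time.Duration) (timedOut bool) {
	panicSignal := make(chan interface{}) // unbuffered: a send needs a receiver

	go func() {
		defer func() {
			if r := recover(); r != nil {
				// Blocks forever once the caller has taken the timer case.
				panicSignal <- r
			}
		}()
		time.Sleep(workDelay)
		panic("work failed")
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-timer.C:
		return true
	case <-panicSignal:
		return false
	}
}

// countLeaked runs leakyWait with a timeout shorter than the work delay and
// reports how many extra goroutines are still alive afterwards.
func countLeaked() int {
	before := runtime.NumGoroutine()
	leakyWait(10*time.Millisecond, 50*time.Millisecond)
	// Give the worker time to panic and reach the blocked send.
	time.Sleep(200 * time.Millisecond)
	return runtime.NumGoroutine() - before
}
```

Calling countLeaked() should report at least one goroutine that never exits, since it is parked on the panicSignal send with no receiver left.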
Solution / Suggested Changes
As a solution, we introduce a new channel called quit. In the deferred function, the send on panicSignal is placed inside a select together with a receive on quit. In the RetryDo() method, when the select chooses the timeout case, we signal quit. The select in the deferred function then takes the quit case, so the goroutine never blocks on the panicSignal send.
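A minimal sketch of this approach, under the same assumptions as before: safeWait and countRemaining are hypothetical names, and this is not the exact patch applied to retry.go. One deliberate difference from the description is that the sketch closes quit instead of sending a value on it, so the caller cannot block if the worker never reaches its select.

```go
package main

import (
	"runtime"
	"time"
)

// safeWait is a hypothetical stand-in for the fixed pattern: the deferred
// send on panicSignal is guarded by a receive on quit.
func safeWait(timeout, workDelay time.Duration) (timedOut bool) {
	panicSignal := make(chan interface{})
	quit := make(chan struct{})

	go func() {
		defer func() {
			if r := recover(); r != nil {
				select {
				// The caller is still listening: deliver the panic value.
				case panicSignal <- r:
				// The caller timed out: drop the report and exit.
				case <-quit:
				}
			}
		}()
		time.Sleep(workDelay)
		panic("work failed")
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-timer.C:
		// Closing quit releases the worker whenever it reaches its select.
		close(quit)
		return true
	case <-panicSignal:
		return false
	}
}

// countRemaining runs safeWait with a timeout shorter than the work delay and
// reports how many extra goroutines are still alive afterwards.
func countRemaining() int {
	before := runtime.NumGoroutine()
	safeWait(10*time.Millisecond, 50*time.Millisecond)
	// Give the worker time to panic, observe quit, and exit.
	time.Sleep(200 * time.Millisecond)
	return runtime.NumGoroutine() - before
}
```

With the guard in place, countRemaining() should find no leftover goroutine, because the worker returns through the quit case after the timeout.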
Previously Found & Fixed
Below is a list of open-source projects where this same bug was found and fixed:
- https://www.github.com/kubernetes/kubeadm/pull/2983
CLA Requirements
This section is only relevant if your project requires contributors to sign a Contributor License Agreement (CLA) for external contributions.
All contributed commits are already automatically signed off.
The meaning of a signoff depends on the project, but it typically certifies that the committer has the rights to submit this work under the same license and agrees to a Developer Certificate of Origin (see https://developercertificate.org/ for more information).
Sponsorship and Support
This work was done by the security researchers from OpenRefactory and is supported by the Open Source Security Foundation (OpenSSF): Project Alpha-Omega. Alpha-Omega is a project partnering with open source software project maintainers to systematically find new, as-yet-undiscovered vulnerabilities in open source code and get them fixed, in order to improve global software supply chain security.
The bug was found by running the Intelligent Code Repair (iCR) tool by OpenRefactory and then manually triaging the results.