sentry-go icon indicating copy to clipboard operation
sentry-go copied to clipboard

Flaky TestContextifyFrames

Open rhcarvalho opened this issue 5 years ago • 3 comments

Running tests several times in a row (-count=N) reviews a failure in TestContextifyFrames .

$ go test
PASS
ok      github.com/getsentry/sentry-go  0.422s
$ go test -count=5000 -failfast
--- FAIL: TestContextifyFrames (0.00s)
    integrations_test.go:197:
        got: []string(nil)
        want: []string{")", "", "// NOTE: if you modify this file, you are also responsible for updating LoC position in Stacktrace tests", "", "func Trace() *Stacktrace {"}
    integrations_test.go:204:
        got: ""
        want: "\treturn NewStacktrace()"
    integrations_test.go:205:
        got: []string(nil)
        want: []string{"}", "", "func RedPkgErrorsRanger() error {", "\treturn BluePkgErrorsRanger()", "}"}
FAIL
exit status 1
FAIL    github.com/getsentry/sentry-go  1.080s

Pending investigation.

rhcarvalho avatar Dec 20 '19 15:12 rhcarvalho

Global state seems to play a role here.

Running:

go test -count=150000 -failfast -run '^TestContextifyFrames$' 

Could not reproduce the failure, while running all tests reproduces it with a notable consistency.

rhcarvalho avatar Dec 29 '19 23:12 rhcarvalho

In sourceReader.readContextLines we always do the same regardless of the error from ioutil.ReadFile.

https://github.com/getsentry/sentry-go/blob/a1aa34ce36838f6349c403fd2b195a642421fe67/sourcereader.go#L27-L30

Sometimes the error can be transient:

diff --git a/sourcereader.go b/sourcereader.go
index 2d35768..f2f552f 100644
--- a/sourcereader.go
+++ b/sourcereader.go
@@ -2,7 +2,10 @@ package sentry

 import (
        "bytes"
+       "fmt"
        "io/ioutil"
+       "os"
+       "strings"
        "sync"
 )

@@ -26,6 +29,9 @@ func (sr *sourceReader) readContextLines(filename string, line, context int) (>
        if !ok {
                data, err := ioutil.ReadFile(filename)
                if err != nil {
+                       if !strings.Contains(err.Error(), "no such file or directory") {
+                               fmt.Fprintln(os.Stderr, "->", err)
+                       }
                        sr.cache[filename] = nil
                        return nil, 0
                }
$ go test -count=5000 -failfast
-> open /[...]/sentry-go/errors_test.go: too many open files
--- FAIL: TestContextifyFrames (0.00s)
[...]

rhcarvalho avatar Dec 30 '19 00:12 rhcarvalho

It is possible to instrument tests to print a list of open file descriptors on failure:

diff --git a/integrations_test.go b/integrations_test.go
index 86719e3..1dd763d 100644
--- a/integrations_test.go
+++ b/integrations_test.go
@@ -1,9 +1,13 @@
 package sentry
 
 import (
+	"os"
+	"os/exec"
 	"path/filepath"
 	"regexp"
+	"strconv"
 	"testing"
+	"time"
 )
 
 func TestTransformStringsIntoRegexps(t *testing.T) {
@@ -209,6 +213,32 @@ func TestContextifyFrames(t *testing.T) {
 		"\treturn BluePkgErrorsRanger()",
 		"}",
 	})
+
+	if t.Failed() {
+		lsof := func() error {
+			b, err := exec.Command("lsof", "-p", strconv.Itoa(os.Getpid())).CombinedOutput()
+			if err != nil {
+				return err
+			}
+			t.Logf("lsof:\n%s", b)
+			return nil
+		}
+		start := time.Now()
+		ticker := time.NewTicker(200 * time.Millisecond)
+		defer ticker.Stop()
+		abort := time.After(10 * time.Second)
+		for {
+			select {
+			case <-abort:
+				t.Fatalf("lsof: aborted after %v", time.Since(start))
+			case <-ticker.C:
+				err := lsof()
+				if err == nil {
+					t.Fatalf("lsof: success after %v", time.Since(start))
+				}
+			}
+		}
+	}
 }
 
 func TestContextifyFramesNonexistingFilesShouldNotDropFrames(t *testing.T) {

After some investigation, I believe the root cause of the test failure is not within TestContextifyFrames itself, it just happens to be visible there. Running many tests creates a lot of pending HTTP requests that take up file descriptors until they run out.

Problems:

  • [ ] tests should clean up resources when they're done with their Client values: pending Client.Close() implementation #111.
  • [ ] tests should not open connections to external servers: use net/http/httptest.Server and point test DSNs to it where appropriate.

rhcarvalho avatar Dec 30 '19 17:12 rhcarvalho

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

github-actions[bot] avatar Dec 07 '22 09:12 github-actions[bot]