revive icon indicating copy to clipboard operation
revive copied to clipboard

Different results when removing the GOCACHE and pkgs directories

Open zimmski opened this issue 3 years ago • 14 comments

Describe the bug

This is more of a "i need a better idea on how to debug this" bug reports because i do have a reproducer, but it involves our monorepo (which i can't just send out).

The bug of this issue is that the output of revive is not deterministic for one specific problem. The following Revive config enables the rule that finds the reported problem.

[rule.range-val-address]
        Disabled = false

Which leads to the following problem in the reproducer i am currently reducing:

tt.go:20:15: suspicious assignment of 'p'. range-loop variables always have the same address

The problem appears deterministically in our CI but it does not appear locally on my machine right away. I need to do some steps to reproduce it but then it is again deterministically reporting the problem.

revive --config config.toml tt.go # Does not report a problem.
rm -rf $GOCACHE pkg/
revive --config ok.toml tt.go # Suddenly reports a problem.
go install $our-project/...
revive --config config.toml tt.go # Does not report a problem.

First i thought it is a race, but when i compile Revive with the race detector on, there is no race anymore with the latest changes. So i ran out of ideas on what the problem is.

One idea i had was that this must have something to do with how packages are loaded to resolve types because i reduce the code down to:

func prepareTestsForValidation(tests []*model.ProjectTest) {
        for _, test := range tests {
                for _, p := range test.Result.Problems {
                        problem := &p.Problem
                        problem.Message = "abc"
                }
        }
}

"model" is an import identifier. When i copy the types directly into the same file as the reduce code, Revive does not report a problem anymore.

Expected behavior

I would expected to always have the same problem.

Desktop:

  • OS: SUSE LEAP 15.3 locally and Ubuntu 20.04 in the CI which runs in Kubernetes+latest Docker
  • Version of Go: 1.18.0

zimmski avatar Apr 10 '22 10:04 zimmski

Same with Go 1.17.8

zimmski avatar Apr 10 '22 10:04 zimmski

OK i think i have a reproducer now at least for the reported problem. It does disappear when one includes all code in one file.

zimmski avatar Apr 10 '22 10:04 zimmski

go.mod

module example.com/m

go 1.18

model/mode.go

package model

// ProjectTest foobar
type ProjectTest struct {
	Result *ProjectTestResult
}

// ProjectTestResult foobar
type ProjectTestResult struct {
	Problems []*AnalyzeProblem
}

// AnalyzeProblem foobar
type AnalyzeProblem struct {
	Problem
}

// Problem foobar
type Problem struct {
	Message string
}

valid/valid.go

package valid

import (
	"example.com/m/model"
)

// PrepareTestsForValidation foobar
func PrepareTestsForValidation(tests []*model.ProjectTest) {
	for _, test := range tests {
		for _, p := range test.Result.Problems {
			problem := &p.Problem

			problem.Message = "abc"
		}
	}
}

Doing Revive on this code:

revive -config ~/symflower/ok.toml ./...
valid/valid.go:11:15: suspicious assignment of 'p'. range-loop variables always have the same address

But when valid/valid.go is merging everything into one file:

package valid

// ProjectTest foobar
type ProjectTest struct {
	Result *ProjectTestResult
}

// ProjectTestResult foobar
type ProjectTestResult struct {
	Problems []*AnalyzeProblem
}

// AnalyzeProblem foobar
type AnalyzeProblem struct {
	Problem
}

// Problem foobar
type Problem struct {
	Message string
}

// PrepareTestsForValidation foobar
func PrepareTestsForValidation(tests []*ProjectTest) {
	for _, test := range tests {
		for _, p := range test.Result.Problems {
			problem := &p.Problem

			problem.Message = "abc"
		}
	}
}

There is no problem anymore:

revive -config ~/symflower/ok.toml ./...

zimmski avatar Apr 10 '22 10:04 zimmski

Narrowed this down to ".TypeCheck()" throwing an error which is mostly never checked in the repository.

The error is something like panic: tt.go:3:8: could not import xyz/model (can't find import: xyz/model)

So the reason for throwing that error is that the type information is not present? My guess is that the type information is loaded from the binary-files which makes it clearer why deleting the GOCACHE/pkgs directories lead to a reported problem.

If that is true, two things need fixing(?)

  • Errors of ".TypeCheck()" should be checked and reported
  • Type-loading should happen even if GOCACHE/pkgs is empty <- i see that "gcexportdata" is used(?), i don't know how that API works TBH.

zimmski avatar Apr 10 '22 19:04 zimmski

This issue may seem to also affect us.

In CI we are running both go test and revive (via golangci-lint linter suite) in parallel, because doing so substantially reduces build time. It seems go test updates caches which revive simultaneously uses, because we sometimes get non-deterministic complaints from revive in several locations, but only from revive, never from any other linter in the suite.

If we run revive after go test finishes, there is never a complaint. IDK if running in parallel supported, but this works for other linters and helps reducing build time.

hrissan avatar Jun 16 '22 14:06 hrissan

@zimmski the root cause is that revive loaded the type info by each package, so it can't load the model object since there are in different packages. I will try to fix it by searching the type info in all packages instead of current file package. But it still can't solve the issue when some the model packages are outside the lint project.

How do you think about this solution? @mgechev @chavacava

git-hulk avatar Jul 13 '22 05:07 git-hulk

Sounds great, looking forward to your fix :-) Ping me when i should test it with our CI.

zimmski avatar Jul 13 '22 09:07 zimmski

Type information of each package is calculated lazily. The calculation relies on the GO's type package (go/types.Config.Check) My understanding of @git-hulk proposition: to make the type calculation greedy (instead of lazy) and to make all type information accessible from any package. Is my understanding right?

chavacava avatar Jul 13 '22 12:07 chavacava

Type information of each package is calculated lazily. The calculation relies on the GO's type package (go/types.Config.Check) My understanding of @git-hulk proposition: to make the type calculation greedy (instead of lazy) and to make all type information accessible from any package. Is my understanding right?

Yes, but I found that it still didn't work after loading the package first, need do more investigations on this issue.

git-hulk avatar Jul 13 '22 16:07 git-hulk

EDIT: Disregard. In my case the nondeterminism was caused by my not using the --uniq-by-line=false golangci-lint flag.

~For anyone using golangci-lint with both revive and ireturn linters enabled, I've found that this combination induces this non-determinism.~

~Using golangci-lint v1.47.1~

~Repro steps:~

golangci-lint run --no-config --disable-all -E revive -E ireturn --out-format github-actions | sort > 1.txt
golangci-lint run --no-config --disable-all -E revive -E ireturn --out-format github-actions | sort > 2.txt
diff 1.txt 2.txt

armsnyder avatar Jul 23 '22 07:07 armsnyder

This issue should can be fixed after #716 since it'll load all packages.

git-hulk avatar Jul 26 '22 06:07 git-hulk

This issue should can be fixed after #716 since it'll load all packages.

Yes. The packages management will change completely and it is possible #716 is no longer an issue.

chavacava avatar Jul 26 '22 09:07 chavacava

Cool, can do many checks after this PR since we have enough type infos. I'm wondering if we can introduce go/packages when try to fix this issue because this will break many test cases.

Anyway, I like this change, also can have a hand on fixing those testdata if you need.

git-hulk avatar Jul 26 '22 11:07 git-hulk

Yes, type info allows to implement some fancy checks but nothing is free... type info calculation takes time. The idea (as is the case in the current version of revive) is to only trigger type info calculation if there are activated rules depending on that info.

Anyway, I like this change, also can have a hand on fixing those testdata if you need.

Thanks! At some point I will ask for help on fixing some things like testcases.

chavacava avatar Jul 26 '22 11:07 chavacava