Improve `Go` dependency parsing
Overview
The current lockfile generator creates a go.sum file and parses the output for dependencies. However the go.sum contents is not really a lockfile and analysis results are skewed due to these additional versions even if they aren't used to build the application.
Update the go.sum parser to ignore hashes for /go.mod. These only indicate that a package was considered and not that the package is actually used. There will still be too many dependencies but should have improved results.
Newer go projects supporting go directive 1.17 and higher can use the go.mod parser.
Acceptance Criteria
- [ ] Update the go.sum parser to ignore hashes for /go.mod.
- [ ] Replace the Go lockfile generator with a go.mod parser which only works for files with a go directive of 1.17 or higher.
How do you envision this to be done? Our lockfile analysis cannot run arbitrary commands. While we could use this for lockfile generation, I don't see us being able to use this for lockfile analysis.
Generally speaking it doesn't sound like this output is ever intended to land in a file, which is a prerequisite for lockfile parsing.
While we could use this for lockfile generation, I don't see us being able to use this for lockfile analysis.
That's exactly the plan. The Go lockfile generator currently uses go get -d. The plan is to update that generator to use go list instead.
Generally speaking it doesn't sound like this output is ever intended to land in a file
That is true, but the output contains enough information to generate a virtual go.sum file, just like our pip generator does with pip install --dry-run to requirements.txt
Yeah I apologize I must have been completely blind when reading this yesterday. It states clearly multiple times that this refers to lockfile generation, so not sure why I thought this had anything to do with our normal lockfile analysis.
I've been doing some research on this... Go is complicated and the behavior has changed a lot... Even go list has issues because it does not respect module pruning.
Proposal
To improve our handling, I think we can do a few things:
- Update the
go.sumparser to ignore hashes for/go.mod. These only indicate that a package was considered and not that the package is actually used. - Replace the Go lockfile generator with a
go.modparser which only works for files with agodirective of 1.17 or higher.
The go.sum parser may still include too many dependencies, but it should be much better. And for projects that can use the go.mod parser instead, they should get a dependency list that accurately reflects go's lazy module loading and module pruning behavior