pulumi icon indicating copy to clipboard operation
pulumi copied to clipboard

Consider explicit `go build` instead of `go run` for running Go programs

Open t0yv0 opened this issue 3 years ago • 1 comments

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

Consider removing go run style code execution in favor of go build followed by running the generated program.

As of pulumi 3.37.2 pulumi-language-go has two modes of executing user code, with go run . being by far the most common:

  • go run .
  • ./$binary if .runtime.binary option is set in Pulumi.yaml

The execution happens every time Pulumi interacts with the user program (preview, up, etc), every command ends up calling into this.

The suggestion is to replace go run . with two steps, managing a local binary file explicitly:

go build -o $program ./$program

Benefits

  1. Faster execution. go build skips linking when unnecessary, while go run always relinks. This can be tested with go build -n and go run -n where the -n previews commands instead of executing them. Linking costs are significant if projects use large dependencies such as azure-native. These projects would receive a speed up.

  2. Better diagnostics. Currently go run hijacks exit codes, see for example https://github.com/golang/go/issues/17813 ; namely it returns exit code 0 for ok, 1 for program error, and 2 for compilation error. If the program exits with a non-0 exit code, instead of returning that code it prints it "exit code 3" for example. Not using go run would make it easy for Pulumi to distinguish between compile and runtime errors and print them nicely, and suppress redundant "exit code 3" messages from output (original discussion in https://github.com/pulumi/pulumi/pull/10265).

  3. Better execution speed information in the metrics. We would be able to put separate spans and measure compilation/linking vs raw execution time to better identify where slowness is introduced.

Downsides

Pulumi would need to introduce and manage an executable file somewhere. This could be pwd or it could be in tempdir or some other dir managed by Pulumi. There are downsides to additional complexity here:

  • files in pwd will be visible to the users and might need updates to .gitignore and so on to work with source control

  • files in out of band folders will leak disk space, it is unclear when Pulumi can safely delete them; perhaps eventually this would warrant something like docker system prune to remove all of those

Benchmarking speedup potential

$ pulumi new azure-native-go
$ go build -o proj # initial build

$ hyperfine -i 'go run .'                                                                                                     
Benchmark 1: go run .
  Time (mean ± σ):      1.276 s ±  0.036 s    [User: 1.732 s, System: 0.963 s]
  Range (min … max):    1.233 s …  1.333 s    10 runs
 
  Warning: Ignoring non-zero exit code.

$ hyperfine 'go build -o proj' 
Benchmark 1: go build -o proj
  Time (mean ± σ):     228.1 ms ±   2.5 ms    [User: 762.7 ms, System: 782.0 ms]
  Range (min … max):   224.5 ms … 233.2 ms    12 runs
 
$ hyperfine -i './proj' -n 20   
Benchmark 1: 20
  Time (mean ± σ):      12.3 ms ±   1.7 ms    [User: 7.0 ms, System: 3.3 ms]
  Range (min … max):    11.6 ms …  34.1 ms    175 runs
 
  Warning: Ignoring non-zero exit code.
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. 

So the new method would result in a warm start (after initial build) of 250ms for a Pulumi project using azure-native compared to the 1.2s baseline with go run. It's common for users to run several operations that don't result in relinking the binary so these savings can be meaningful.

Affected area/feature

Go language support

t0yv0 avatar Aug 10 '22 20:08 t0yv0

I can't recall what issue I hit but I'm sure I had another reason for wanting to look at using go build instead of run... Will have to rack my brains, but +1 to this idea anyway

Frassle avatar Aug 10 '22 21:08 Frassle

one of the concerns with this is when a directory is not readable as we are outputting a new file.

if a company has a policy for securing docker images by disabling write permissions to directories, they may be unable to upgrade pulumi-language-go without it breaking.

dixler avatar Aug 12 '22 19:08 dixler

if a company has a policy for securing docker images by disabling write permissions to directories, they may be unable to upgrade pulumi-language-go without it breaking.

Isn't tmp always writable, "go run" itself needs to write to a temporary file to work, so I think we can assume that's safe.

Frassle avatar Aug 12 '22 21:08 Frassle

Also there is~/.pulumi, which currently manages plugins. Could be something like ~/.pulumi/go-compiler-cache/$(hash-of-current-folder)

t0yv0 avatar Aug 12 '22 21:08 t0yv0

Could be something like ~/.pulumi/go-compiler-cache/$(hash-of-current-folder)

I had suggested to just use temp (as that's what go run does) but this is also an interesting idea. Might be worth looking into what's need to get build caching from the go compiler?

Frassle avatar Aug 12 '22 21:08 Frassle

The advantage of ~/.pulumi is then we can do pulumi prune command to get rid of that. I don't know when we'd know automatically that it is a good time to delete these binaries; these are not really temp files we know when to delete. In the limit, it can leak temp disk space if we never delete nor provide any means to do so.

We wouldn't need anything special from the compiler here just:

go build -o ~/.pulumi/go-compiler-cache/$(hash-of-current-folder)/$(name-of-program)
~/.pulumi/go-compiler-cache/$(hash-of-current-folder)/$(name-of-program) # execute

t0yv0 avatar Aug 12 '22 21:08 t0yv0

if a company has a policy for securing docker images by disabling write permissions to directories, they may be unable to upgrade pulumi-language-go without it breaking.

Isn't tmp always writable, "go run" itself needs to write to a temporary file to work, so I think we can assume that's safe.

/tmp exists on windows?

dixler avatar Aug 12 '22 22:08 dixler

There are cross-platform Go APIs e.g. https://gobyexample.com/temporary-files-and-directories

t0yv0 avatar Aug 15 '22 14:08 t0yv0

The advantage of ~/.pulumi is then we can do pulumi prune command to get rid of that. I don't know when we'd know automatically that it is a good time to delete these binaries; these are not really temp files we know when to delete. In the limit, it can leak temp disk space if we never delete nor provide any means to do so.

We wouldn't need anything special from the compiler here just:

go build -o ~/.pulumi/go-compiler-cache/$(hash-of-current-folder)/$(name-of-program)
~/.pulumi/go-compiler-cache/$(hash-of-current-folder)/$(name-of-program) # execute

It sounds like we're creating our own build system for Go. I can make tempdirs in $HOME/.pulumi/go/ for now as I'm more concerned with unblocking and delivering #10227. Maybe we can add caching in later?

dixler avatar Aug 15 '22 16:08 dixler

It sounds like we're creating our own build system for Go.

I don't think that's fair. The concern is disk space leak from retained compiled Go programs (especially on servers running autonomously). There does not seem to be a good point where Pulumi should delete the compiled file, correct? There are two approaches:

  • write to TEMP, periodically clean up TEMP
  • write to a Pulumi-managed place in the file system, add either clean this up as part of Pulumi commands or add a new command to clean it up
    • this could be a new folder like ~/.pulumi/go-compiler-cache
    • this could tie into ~/.pulumi/workspaces or ~/.pulumi/stacks, perhaps with clean up on pulumi stack destroy

I guess using write-and-forget to TEMP is the least work and is acceptable if we believe the OS periodically frees up space by deleting files in TEMP. If we take this route, we'd just need to point this out in CHANGELOG, so if users start experiencing issues with disk space leaks we can identify this behavior change as a source.

t0yv0 avatar Aug 15 '22 16:08 t0yv0

The advantage of ~/.pulumi is then we can do pulumi prune command to get rid of that. I don't know when we'd know automatically that it is a good time to delete these binaries; these are not really temp files we know when to delete. In the limit, it can leak temp disk space if we never delete nor provide any means to do so.

We wouldn't need anything special from the compiler here just:

go build -o ~/.pulumi/go-compiler-cache/$(hash-of-current-folder)/$(name-of-program)
~/.pulumi/go-compiler-cache/$(hash-of-current-folder)/$(name-of-program) # execute

I have an approach that doesn't hash the current folder and instead uses the name of the project as the binary name. This should avoid creating too many binaries as each project has 1 cache entry that gets reused.

~/.pulumi/go-compiler-cache/$(name-of-project) # go program binary path

dixler avatar Aug 22 '22 15:08 dixler

I have an approach that doesn't hash the current folder and instead uses the name of the project as the binary name. This should avoid creating too many binaries as each project has 1 cache entry that gets reused.

I think this is an interesting approach, but I would err on shipping just the simple write the TEMP still for the first pass at this. Once we're happy that "go build" vs "go run" isn't causing any issues, we can start having some smarts like this added as feature flags (PULUMI_GO_CACHE_BUILDS?), that eventually become the default.

Frassle avatar Aug 22 '22 15:08 Frassle

After discussing with @Frassle, scoped this down and moved https://github.com/pulumi/pulumi/issues/10473 out to go as a separate change. @dixler apologies for a lot of back-and-forth on the reqs here.

t0yv0 avatar Aug 23 '22 16:08 t0yv0

@AaronFriel has a comment requesting feature flag to opt-in (disabled by default). @AaronFriel can you elaborate why do you feel we need that? The risks with caching builds are scoped out now so the only risk is changes in exit code handling.

t0yv0 avatar Aug 23 '22 18:08 t0yv0

The risks with caching builds are scoped out now so the only risk is changes in exit code handling.

Do these changes introduce any change, operationally, compared to go run? I want to ensure that if we have heavy users of automation API who might be running pulumi up on web requests. If they have processes in place to address disk usage from temporary binaries, go module caching, or unknown unknowns, I would prefer we don't surprise their ops teams with a change in behavior and instead roll it out as opt-in at first, then as a new default.

Delaying turning on this perf change by a release or two while we get a customer to evaluate has a marginal downside (delaying a perf win), but it may mitigate a large risk of unknown unknowns causing user pain. I would feel comfortable knowing for sure that someone had tested this before rolling it out.

(FWIW: I feel the same about e.g.: changing our TypeScript compiler settings. I think there are large wins possible via swc, esbuild, etc., but I would want to make them all opt in by default, no matter how rigorously we test ourselves.)

AaronFriel avatar Aug 23 '22 21:08 AaronFriel