
Introduce support for CLI plugin hooks


- What I did

Introduce a "hooks" mechanism for CLI plugins, which allows plugins to declare interest in being invoked after the execution of a docker CLI command.

This works by having plugins declare, in the plugin metadata, which commands they want to be invoked for; after a command executes, the CLI checks this metadata and invokes the relevant plugins. If a plugin declares support for hooks, it MUST then implement a "hook" subcommand, which prints a message/template to its stdout.

Using such a system, a plugin to help new docker users could, for example, declare intent to hook into docker build, and return a templated message instructing the user how to run their newly-built image.

A templating approach was chosen to allow plugin messages to include information from the original command execution (such as the tag used for the image in build), so that messages can provide more useful information (such as printing "Run your image with docker run my-image" after a docker build -t my-image .).

Helpers are provided in the form of manager.TemplateReplaceFlagValue("tag") for this case.

The templating approach also keeps sensitive data from leaving the CLI process.
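
For illustration, given the {docker-cli-plugin-hook-flag-tag} placeholder format shown in the Security Callouts below, such a helper could be as simple as building the marker string. This is a hypothetical sketch, not necessarily the actual implementation:

// TemplateReplaceFlagValue builds the marker that the CLI later replaces
// with the value the user passed to the given flag. Hypothetical sketch,
// inferred from the `{docker-cli-plugin-hook-flag-tag}` placeholder format
// shown in the Security Callouts section below.
func TemplateReplaceFlagValue(flag string) string {
	return "{docker-cli-plugin-hook-flag-" + flag + "}"
}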

Example Plugin
package main

import (
	"context"
	"encoding/json"
	"io"

	"github.com/docker/cli/cli-plugins/manager"
	"github.com/docker/cli/cli-plugins/plugin"
	"github.com/docker/cli/cli/command"
	"github.com/spf13/cobra"
	// "utils" is assumed to be a plugin-local helper package whose Adapt
	// turns a func(context.Context, []string) error into a cobra RunE.
)
func main() {
	plugin.Run(func(dockerCli command.Cli) *cobra.Command {
		cmd := RootCommand(dockerCli)
		originalPreRun := cmd.PersistentPreRunE
		cmd.PersistentPreRunE = func(cmd *cobra.Command, args []string) error {
			if err := plugin.PersistentPreRunE(cmd, args); err != nil {
				//nolint: wrapcheck
				return err
			}
			if originalPreRun != nil {
				return originalPreRun(cmd, args)
			}
			return nil
		}
		return cmd
	},
		manager.Metadata{
			SchemaVersion: "0.1.0",
			Vendor:        "Docker Inc.",
			Version:       "0.1",
			HookCommands:  []string{"build"},  // declare intent to hook into `build` subcommand invocations
		})
}

func RootCommand(dockerCli command.Cli) *cobra.Command {
	rootCmd := &cobra.Command{
		Short:            "Docker Hints",
		Use:              "hints",
		TraverseChildren: true,
	}

	rootCmd.AddCommand(hookCommand(dockerCli))
	return rootCmd
}

func hookCommand(dockerCli command.Cli) *cobra.Command {
	hookCmd := &cobra.Command{
		Use:   manager.HookSubcommandName,  // register hook command that the CLI will invoke
		Short: "runs the plugins hooks",
		RunE: utils.Adapt(func(ctx context.Context, args []string) error {
			runHooks(dockerCli.Out(), []byte(args[0]))
			return nil
		}),
		Args: cobra.ExactArgs(1),
	}

	return hookCmd
}

func runHooks(out io.Writer, input []byte) {
	var c manager.HookPluginData
	_ = json.Unmarshal(input, &c)

	hint := "Run this image with `docker run " + manager.TemplateReplaceFlagValue("tag") + "`" // template helper
	returnType := manager.HookMessage{
		Template: hint,
	}
	enc := json.NewEncoder(out)
	enc.SetEscapeHTML(false)
	enc.SetIndent("", "     ")
	_ = enc.Encode(returnType)
}

Which results in: [screenshot]


Performance Callouts

The logic introduced in this PR to load/loop over plugins to invoke and collect their hook output only kicks in when the CLI binary is being invoked in an interactive terminal. As such, when invoking the CLI from a script, there is no performance impact:

⭑ hyperfine --prepare 'sudo sh -c "sync; echo 1 > /proc/sys/vm/drop_caches"' 'docker -v' './docker -v'
Benchmark 1: docker -v
  Time (mean ± σ):      19.3 ms ±   0.5 ms    [User: 4.9 ms, System: 13.2 ms]
  Range (min … max):    18.6 ms …  20.3 ms    12 runs

Benchmark 2: ./docker -v
  Time (mean ± σ):      18.9 ms ±   0.6 ms    [User: 4.7 ms, System: 11.5 ms]
  Range (min … max):    17.6 ms …  20.2 ms    40 runs

Summary
  ./docker -v ran
    1.02 ± 0.04 times faster than docker -v

However, this doesn't apply when running normal commands from a terminal:

⭑ hyperfine --prepare 'sudo sh -c "sync; echo 1 > /proc/sys/vm/drop_caches"' 'script --return --quiet -c "docker -v" /dev/null' 'script --return --quiet -c "./docker -v" /dev/null'
Benchmark 1: script --return --quiet -c "docker -v" /dev/null
  Time (mean ± σ):      34.2 ms ±   1.1 ms    [User: 4.6 ms, System: 15.3 ms]
  Range (min … max):    31.9 ms …  36.4 ms    29 runs

Benchmark 2: script --return --quiet -c "./docker -v" /dev/null
  Time (mean ± σ):      78.2 ms ±   9.3 ms    [User: 40.2 ms, System: 70.6 ms]
  Range (min … max):    72.2 ms … 118.3 ms    22 runs

Summary
  script --return --quiet -c "docker -v" /dev/null ran
    2.29 ± 0.28 times faster than script --return --quiet -c "./docker -v" /dev/null

In this scenario, on my machine I see an average impact of ~44ms. The main culprit is the time it takes for the CLI to load/retrieve the metadata of all installed CLI plugins – and since the plugins are loaded concurrently, this cost will always be roughly that of the slowest CLI plugin. If I remove all other CLI plugins and re-run the benchmarks, we get:

⭑ hyperfine --prepare 'sudo sh -c "sync; echo 1 > /proc/sys/vm/drop_caches"' 'script --return --quiet -c "docker -v" /dev/null' 'script --return --quiet -c "./docker -v" /dev/null'
Benchmark 1: script --return --quiet -c "docker -v" /dev/null
  Time (mean ± σ):      32.5 ms ±   1.2 ms    [User: 5.2 ms, System: 13.6 ms]
  Range (min … max):    30.1 ms …  35.1 ms    29 runs

Benchmark 2: script --return --quiet -c "./docker -v" /dev/null
  Time (mean ± σ):      42.9 ms ±   2.0 ms    [User: 7.0 ms, System: 18.9 ms]
  Range (min … max):    39.6 ms …  51.6 ms    31 runs

Summary
  script --return --quiet -c "docker -v" /dev/null ran
    1.32 ± 0.08 times faster than script --return --quiet -c "./docker -v" /dev/null

which is a more reasonable average of ~10ms.

In my case, the worst offenders were Compose and Buildx, which take around 35ms to load.

I thought this impact was significant enough to call out – whether we need to address it for an initial iteration is a separate question, since the scope of the impact is mitigated by the fact that it only affects invocations from an interactive terminal; in general, automated workflows where performance matters most won't be affected.

The simplest mitigation is caching plugin metadata on-disk, which should be fairly straightforward.
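
For illustration, a rough sketch of what such an on-disk cache could look like – the names and file layout here are hypothetical, not part of this PR:

package metacache

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// cachedPluginMeta is a hypothetical on-disk record of a plugin's hook
// metadata, keyed by the binary's modification time so the entry can be
// invalidated when the plugin is updated.
type cachedPluginMeta struct {
	ModTime      int64    `json:"modTime"`
	HookCommands []string `json:"hookCommands"`
}

// load reads the cache file, returning an empty map when it doesn't exist,
// so a cold start just falls back to executing the plugins once.
func load(dir string) map[string]cachedPluginMeta {
	cache := map[string]cachedPluginMeta{}
	if data, err := os.ReadFile(filepath.Join(dir, "plugin-meta-cache.json")); err == nil {
		_ = json.Unmarshal(data, &cache)
	}
	return cache
}

// fresh reports whether a cached entry still matches the binary on disk.
func fresh(entry cachedPluginMeta, pluginPath string) bool {
	fi, err := os.Stat(pluginPath)
	return err == nil && fi.ModTime().Unix() == entry.ModTime
}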

Security Callouts

The current approach works by letting plugins declare a list of strings which are the CLI commands (such as build, run, etc.) whose execution the plugin wants to be hooked into. The CLI then only invokes each plugin for the commands the plugin has declared support for.

Since a plugin might want to be invoked for more than one command, some information must be passed to the plugin when it is invoked so that it can decide what information to add – for now, this is codified in the HookPluginData type and contains only a string naming the command being executed. Richer output is made possible by a templating approach, wherein the plugin returns a template string with special values that the CLI then processes – e.g.

$ docker build -t my-image .

# a plugin returning
You can run the image you just built with `docker run {docker-cli-plugin-hook-flag-tag}`

# gets parsed into
You can run the image you just built with `docker run my-image`

This allows for plugins to provide useful/contextual messages, without passing any potentially sensitive information to any other binary.
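
For illustration, on the CLI side the expansion can be as simple as substituting each marker with the corresponding flag value from the parsed command. A minimal sketch of the idea (the function name and pflag-based lookup are assumptions, not the actual implementation):

package hooks

import (
	"strings"

	"github.com/spf13/pflag"
)

// expandHookTemplate replaces `{docker-cli-plugin-hook-flag-<name>}` markers
// with the values the user actually passed on the command line, so nothing
// sensitive ever has to be handed to the plugin process.
func expandHookTemplate(template string, flags *pflag.FlagSet) string {
	expanded := template
	flags.Visit(func(f *pflag.Flag) { // Visit only iterates flags that were set
		marker := "{docker-cli-plugin-hook-flag-" + f.Name + "}"
		expanded = strings.ReplaceAll(expanded, marker, f.Value.String())
	})
	return expanded
}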

Possible future hardening includes always printing an explicit indication when a plugin hook is invoked, so that the user is aware, or prompting the user on first execution to confirm that they are okay with a plugin hooking into a command.


- A picture of a cute animal (not mandatory but encouraged)

[image]

laurazard avatar Jun 26 '23 16:06 laurazard

Codecov Report

Merging #4376 (39b6c40) into master (ba64608) will increase coverage by 61.53%. Report is 47 commits behind head on master. The diff coverage is 31.15%.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #4376       +/-   ##
===========================================
+ Coverage        0   61.53%   +61.53%     
===========================================
  Files           0      292      +292     
  Lines           0    20378    +20378     
===========================================
+ Hits            0    12539    +12539     
- Misses          0     6945     +6945     
- Partials        0      894      +894     

codecov-commenter avatar Jul 22 '23 23:07 codecov-commenter

@thaJeztah @vvoland @krissetto @Benehiko I rewrote the hook templates using Go Templates, PTAL if you get a sec :)

laurazard avatar Mar 12 '24 16:03 laurazard

~ $ time /tmp/docker-pr4376 version

real    0m0.162s
user    0m0.183s
sys     0m0.102s
~ $ time /tmp/docker-pr4376 version

real    0m0.209s
user    0m0.164s
sys     0m0.086s
~ $ time /tmp/docker-pr4376 version

real    0m0.145s
user    0m0.188s
sys     0m0.060s

~ $ time /tmp/docker-pr4376 version
real    0m0.153s
user    0m0.187s
sys     0m0.066s

~ $ time /tmp/docker-26.0.0-rc2 version

real    0m0.019s
user    0m0.008s
sys     0m0.006s
~ $ time /tmp/docker-26.0.0-rc2 version

real    0m0.021s
user    0m0.010s
sys     0m0.005s

~ $ time /tmp/docker-26.0.0-rc2 version

real    0m0.019s
user    0m0.002s
sys     0m0.013s

~ $ time /tmp/docker-26.0.0-rc2 version

real    0m0.020s
user    0m0.006s
sys     0m0.009s


 $ time /tmp/docker-26.0.0-rc2 version

real    0m0.284s
user    0m0.014s
sys     0m0.010s
 $ sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches'
 $ time /tmp/docker-26.0.0-rc2 version

real    0m0.318s
user    0m0.009s
sys     0m0.013s


 $ sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches'
 $ time /tmp/docker-pr4376 version

real    0m8.053s
user    0m0.315s
sys     0m0.282s
 $ sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches'
 $ time /tmp/docker-pr4376 version

real    0m8.151s
user    0m0.237s
sys     0m0.320s

Note that hyperfine does not show correct numbers, as this feature is disabled when not attached to a terminal in the current PR.

This was with the plugin bundle that ships with Docker Desktop (9 plugins), as the overhead clearly depends on the number of plugins in the discovery directories.

tonistiigi avatar Mar 12 '24 19:03 tonistiigi

I called that out in the performance section of the PR description, and took steps to avoid that (running with a fake terminal), no?

Anyway, I'm looking at this from my phone, but there's significant deviation from the numbers in the PR description – it's probably due to me pushing a version using Go Templates vs the previous regex approach. I'll post a comment showing the comparison once I'm back at my keyboard.

Edit: if you're just testing docker -v, then I guess the difference would just be the plugins loaded – probably more than what I had at the time of testing. I called that out in the performance section, iirc; for me, buildx and compose were the worst.

Edit2: is that 8 seconds? What the hell 😩

laurazard avatar Mar 12 '24 19:03 laurazard

it's probably due to me pushing a version using Go Templates vs the previous regex approach.

I haven't debugged this, but that seems unlikely. I don't have any actual hooks configured for anything, so I guess there is no parsing – I'm just running empty defaults (where I think we should not have performance regressions). Some CPU overhead should be negligible compared to the overhead of running each plugin process for every command. The plugin itself shouldn't matter beyond bigger binaries taking more time to load into memory before they can respond that they don't implement hooks.

is that 8 seconds? What the hell 😩

This is with the cache cleared to mimic an initial run. The OS would likely have the plugin binaries already in memory for the other, repeated tests.

tonistiigi avatar Mar 12 '24 20:03 tonistiigi

Alright, I'm back at a keyboard now:

Note that hyperfine does not show correct numbers, as this feature is disabled when not attached to a terminal in the current PR.

Right, that's why in the PR description I used script to fake a terminal – I specifically called this out in the Performance Callouts section above.


Still, I can't replicate the 8 seconds that you ran into. In my environment:

 ⭑ time ./build/docker version
./build/docker version  0.06s user 0.12s system 126% cpu 0.141 total

⭑ time ./build/docker version
./build/docker version  0.07s user 0.06s system 188% cpu 0.067 total

 ⭑ sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches'
⭑ time ./build/docker version

./build/docker version  0.06s user 0.17s system 119% cpu 0.194 total

It's a bit slower to be sure, but nowhere near your case.

laurazard avatar Mar 12 '24 21:03 laurazard

script --return --quiet -c "docker -v" /dev/null ran

docker -v does not call into this code path. Your latest comment uses docker version, which does.

Still, I can't replicate the 8 seconds that you ran into. In my environment:

This is from Codespaces and can be replicated there. Just make sure to install Docker Desktop to get the plugin bundle, as only compose & buildx are installed by default.

8 seconds is indeed unexpected, and it doesn't fit linear scaling from the plugin count (considering the rc2 baseline takes 300ms in the same mode), but I think any double-digit-percentage (= measurable) performance regression should be avoided.

I added some debug output on that drop_caches run. It's clearly still coming from that function. I think previously I had one extra plugin left over, and that causes the 8s -> 6s difference.

03:41:47.268036 > RunPluginHooks
03:41:47.284216 < listPluginCandidates done 
03:41:47.284630 > newPlugin /usr/lib/docker/cli-plugins/docker-scout
03:41:47.284628 > newPlugin /usr/lib/docker/cli-plugins/docker-extension
03:41:47.284761 > newPlugin /usr/lib/docker/cli-plugins/docker-feedback
03:41:47.284873 > newPlugin /usr/lib/docker/cli-plugins/docker-init
03:41:47.284909 > newPlugin /usr/lib/docker/cli-plugins/docker-compose
03:41:47.284935 > newPlugin /usr/lib/docker/cli-plugins/docker-dev
03:41:47.285002 > newPlugin /usr/lib/docker/cli-plugins/docker-sbom
03:41:47.285086 > newPlugin /usr/lib/docker/cli-plugins/docker-buildx
03:41:47.285099 > newPlugin /usr/lib/docker/cli-plugins/docker-debug
03:41:51.179446 < newPlugin /usr/lib/docker/cli-plugins/docker-feedback done
03:41:51.300104 < newPlugin /usr/lib/docker/cli-plugins/docker-dev done
03:41:51.357775 < newPlugin /usr/lib/docker/cli-plugins/docker-init done
03:41:51.401657 < newPlugin /usr/lib/docker/cli-plugins/docker-extension done
03:41:51.820787 < newPlugin /usr/lib/docker/cli-plugins/docker-debug done
03:41:52.002857 < newPlugin /usr/lib/docker/cli-plugins/docker-sbom done
03:41:52.955011 < newPlugin /usr/lib/docker/cli-plugins/docker-buildx done
03:41:52.959200 < newPlugin /usr/lib/docker/cli-plugins/docker-compose done
03:41:52.963600 < newPlugin /usr/lib/docker/cli-plugins/docker-scout done
03:41:52.963630 > ListPlugins done
03:41:52.963644 < RunPluginHooks 5.695607869s

real    0m6.049s
user    0m0.214s
sys     0m0.222s

tonistiigi avatar Mar 13 '24 03:03 tonistiigi

Replying here to provide a coherent response to the concerns brought up by @tonistiigi:

To properly test performance, I made the following change to cmd/docker/docker.go, removing the interactive-terminal gate so that the hooks always run:

diff --git a/cmd/docker/docker.go b/cmd/docker/docker.go
index 431b1e027..ca9142151 100644
--- a/cmd/docker/docker.go
+++ b/cmd/docker/docker.go
@@ -330,8 +330,8 @@ func runDocker(dockerCli *command.DockerCli) error {
 
        // If the command is being executed in an interactive terminal,
        // run the plugin hooks (but don't throw an error if something misbehaves)
-       if dockerCli.Out().IsTerminal() && subCommand != nil {
+       if subCommand != nil {
                _ = pluginmanager.RunPluginHooks(dockerCli, cmd, subCommand)
        }
 
        return nil

I then built two binaries from the same code, one with and one without hooks. After that, I benchmarked with:

hyperfine --prepare 'sudo sh -c "sync; echo 1 > /proc/sys/vm/drop_caches"' './build/docker version' './build/docker-with-hooks version'

So, let's test this:

This was with the plugins bundle that ships with Docker Desktop (9) as it clearly depends on the number of plugins in the discovery directories.

We'll test with a number of different plugins:

  buildx: Docker Buildx (Docker Inc.)
  compose: Docker Compose (Docker Inc.)
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
  hints: Docker Hints (Docker Inc.)
  long: Docker Long Running Example Plugin (Docker Inc.)

Benchmark 1: ./build/docker version
  Time (mean ± σ):      28.9 ms ±   7.2 ms    [User: 3.5 ms, System: 11.9 ms]
  Range (min … max):    25.5 ms …  61.1 ms    23 runs

Benchmark 2: ./build/docker-with-hooks version
  Time (mean ± σ):      66.8 ms ±   3.6 ms    [User: 32.8 ms, System: 65.8 ms]
  Range (min … max):    64.1 ms …  78.1 ms    28 runs

Summary
  ./build/docker version ran
    2.31 ± 0.59 times faster than ./build/docker-with-hooks version

So that's ~40ms difference. With more plugins:

  buildx: Docker Buildx (Docker Inc.)
  compose: Docker Compose (Docker Inc.)
  dev: Docker Dev Environments (Docker Inc.)
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
  hints: Docker Hints (Docker Inc.)
  init: Creates Docker-related starter files for your project (Docker Inc.)
  long: Docker Long Running Example Plugin (Docker Inc.)


Benchmark 1: ./build/docker version
  Time (mean ± σ):      30.0 ms ±   5.8 ms    [User: 4.4 ms, System: 11.7 ms]
  Range (min … max):    24.8 ms …  52.3 ms    21 runs

Benchmark 2: ./build/docker-with-hooks version
  Time (mean ± σ):      71.9 ms ±   8.1 ms    [User: 38.3 ms, System: 83.8 ms]
  Range (min … max):    66.6 ms … 103.4 ms    22 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (103.4 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--prepare' option which can be used to clear caches. If you did not use a cache-clearing command with '--prepare', you can either try that or consider using the '--warmup' option to fill those caches before the actual benchmark.

Summary
  ./build/docker version ran
    2.40 ± 0.54 times faster than ./build/docker-with-hooks version

Huh, that's about 40ms too. Let's try more:

  buildx: Docker Buildx (Docker Inc.)
  compose: Docker Compose (Docker Inc.)
  dev: Docker Dev Environments (Docker Inc.)
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
  hints: Docker Hints (Docker Inc.)
  init: Creates Docker-related starter files for your project (Docker Inc.)
  long: Docker Long Running Example Plugin (Docker Inc.)
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)


Benchmark 1: ./build/docker version
  Time (mean ± σ):      27.1 ms ±   1.8 ms    [User: 4.1 ms, System: 10.9 ms]
  Range (min … max):    25.2 ms …  35.7 ms    30 runs

Benchmark 2: ./build/docker-with-hooks version
  Time (mean ± σ):      71.9 ms ±   4.2 ms    [User: 45.8 ms, System: 90.9 ms]
  Range (min … max):    66.4 ms …  85.0 ms    31 runs

Summary
  ./build/docker version ran
    2.66 ± 0.24 times faster than ./build/docker-with-hooks version

It's still about the same. It seems like the performance penalty does not depend on the number of plugins, as we thought (from the original PR text):

or more specifically, this will always be the same as the slowest CLI plugin.

Adding scout:

  buildx: Docker Buildx (Docker Inc.)
  compose: Docker Compose (Docker Inc.)
  debug: Get a shell into any image or container. (Docker Inc.)
  dev: Docker Dev Environments (Docker Inc.)
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
  hints: Docker Hints (Docker Inc.)
  init: Creates Docker-related starter files for your project (Docker Inc.)
  long: Docker Long Running Example Plugin (Docker Inc.)
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
  scout: Docker Scout (Docker Inc.)


Benchmark 1: ./build/docker version
  Time (mean ± σ):      27.3 ms ±   1.9 ms    [User: 3.8 ms, System: 11.0 ms]
  Range (min … max):    25.5 ms …  36.0 ms    26 runs

Benchmark 2: ./build/docker-with-hooks version
  Time (mean ± σ):      88.6 ms ±  10.6 ms    [User: 72.4 ms, System: 137.5 ms]
  Range (min … max):    81.0 ms … 134.6 ms    23 runs

Summary
  ./build/docker version ran
    3.25 ± 0.45 times faster than ./build/docker-with-hooks version

That's ~60ms, about 20ms more than without scout. I guess that's because

~/.docker/cli-plugins/docker-scout docker-cli-plugin-metadata  0.05s user 0.01s system 116% cpu 0.058 total

whereas other plugins are a good bit faster:

~/.docker/cli-plugins/docker-buildx  0.03s user 0.01s system 104% cpu 0.039 total
~/.docker/cli-plugins/docker-compose  0.02s user 0.02s system 110% cpu 0.037 total
~/.docker/cli-plugins/docker-sbom  0.01s user 0.02s system 121% cpu 0.022 total

This checks out with what we expect.

Currently, the performance penalty is as we expect, barring any strange edge cases. As called out in the PR description, a mitigation for this (which we're planning to work on after merging this PR) is to cache plugin metadata so that we don't have to execute every plugin to see whether it supports hooks.

I'll also be adding a very naive timeout to the plugin scanning logic, to make sure we cap it and never let it run longer than 80 or 100ms.
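
A naive version of that cap could look something like this (all names here are hypothetical) – run the scan concurrently and drop the hook output if it doesn't come back in time:

package hooks

import (
	"context"
	"time"
)

// hookMessage mirrors the shape of the plugin's templated reply.
type hookMessage struct {
	Template string
}

// runWithCap runs the plugin scan in a goroutine and simply drops the hook
// output if it takes longer than the cap: hooks are best-effort hints, so a
// slow or misbehaving plugin should never delay the command itself.
func runWithCap(ctx context.Context, scan func() []hookMessage) []hookMessage {
	ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
	defer cancel()

	results := make(chan []hookMessage, 1)
	go func() { results <- scan() }()

	select {
	case msgs := <-results:
		return msgs
	case <-ctx.Done():
		return nil // cap exceeded: skip hook output entirely
	}
}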

laurazard avatar Mar 18 '24 18:03 laurazard

It's still about the same. It seems like the performance penalty does not depend on the number of plugins, as we thought (from the original PR text):

You can see in your output how user/system gradually grows with every added plugin, spinning the CPU more each time. In the last example, the resource usage is 14x the baseline (9x before scout is added).

As these plugins all run in their own goroutine, the real time depends on CPU count, the OS's exec implementation (including malware-detection mechanisms), prewarmed caches, system load, and other factors.

Docker is run by millions of users, with various os/arch/perf configurations. Afaik Desktop does not come with a free 10+ core machine. If you compare absolute numbers, then obviously users with low-perf, low-CPU-count machines will see the impact more than those with a lot of resources to spare (and the CLI is released to some really low-powered machines). To compare absolute numbers, you would need a big test matrix of all the OSes (maybe even kernels & infra providers), different ranges of CPU counts and perf generations, memory pressure, invocation frequency, etc. But we don't need to compare absolute numbers, just note that previously we needed resources to run 1 binary and now we need resources to run 10 (and more with any future update or extra plugin the user has added). Unless you can defy some laws of physics, the latter will be slower for users.

Adding scout:

The scout binary is more than 4x bigger than the small plugins (dev/debug/init/feedback etc.). Minimally, extra resources are needed to load these binaries into memory, but it is not impossible that there is some init() work that can be optimized as well (other bottlenecks for spawning binaries in parallel are likely OS-specific). Compared to buildx, which is only slightly smaller, scout's metadata command takes 25ms vs 15ms. How much impact a bigger binary or init() function has when loading a big bundle of binaries may not be universal across configurations. This does show another weakness of this approach, though – should any other binary make a mistake or just be unnecessarily bloated, it slows down the whole docker experience for any command.

Currently, the performance penalty is as we expect,

That's another position where we differ. To me, any measurable performance regression from spinning the CPU needlessly should be avoided. Comparing this to other historic regressions I've fixed or reported in the past, I think this is the biggest one yet (at least if high multi-core machines are excluded).

tonistiigi avatar Mar 19 '24 17:03 tonistiigi

This (current) performance discussion is moot, as we'll be making some implementation changes that allow us to avoid scanning/executing all the plugins, but for discussion's sake:

You can see in your output how user/system gradually grows with every added plugin

Yes, obviously CPU time scales with the number of plugins. I'd expected that we were discussing real time before, especially since we're talking about the Desktop plugin bundle that most people would be running on their main development computers, and not in CI/etc. Otherwise, if we're discussing CI environments, why are you testing with more than 2 plugins? We can discuss either case, but discussing the worst-case scenario of "extremely restricted environment, but for some reason with Docker Desktop + all the bundled plugins installed" doesn't seem very productive.

Docker is run by millions of users, with various os/arch/perf configurations. Afaik Desktop does not come with a free 10+ core machine. If you compare absolute numbers, then obviously users with low-perf, low-CPU-count machines will see the impact more than those with a lot of resources to spare (and the CLI is released to some really low-powered machines).

Yes, it is. That still does not mean that we can never incur a performance hit, especially when this is done as a step toward improving overall performance.

This does show another weakness of this approach, though – should any other binary make a mistake or just be unnecessarily bloated, it slows down the whole docker experience for any command.

Maybe I wasn't clear in my previous reply, but this sentence – I'll also be adding a very naive timeout to the plugin scanning logic, to make sure we cap it and never let it run longer than 80 or 100ms – refers to the plugin execution time as well.

To me, any measurable performance regression from spinning the CPU needlessly should be avoided

I agree. Which is why we're implementing this – so we can remove the wrapper and the "needless cpu spinning" that it does.


Regardless, I'll amend the PR soon – I was discussing with @thaJeztah and we were thinking about instead storing whether a plugin supports hooks in the CLI's config file, which would negate the need to cycle over/execute all the plugins to check for support.

laurazard avatar Mar 19 '24 17:03 laurazard

Pushed a change reworking the plugin hook listing+scanning logic – instead of scanning the plugin folders and executing every candidate to check their metadata for hook support, we now expect plugins with hooks to be configured in the CLI's config.json. For example:

{
  "auths": {},
  "plugins": {
    "hints": {
      "hooks": "pull,build"
    }
  }
}

The new logic looks only for plugins configured this way, which significantly reduces execution time (and makes it so that performance does not get worse with the number of plugins bundled).
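
Conceptually, the lookup now only has to read config.json instead of executing every plugin binary. A simplified sketch of the idea (the types and function names are assumptions, not the actual CLI code):

package hooks

import (
	"encoding/json"
	"os"
	"strings"
)

// configFile mirrors just the part of config.json shown above:
// per-plugin string attributes, including the comma-separated "hooks" list.
type configFile struct {
	Plugins map[string]map[string]string `json:"plugins"`
}

// pluginsHooking returns the names of plugins whose "hooks" entry includes
// the command that was just executed; only those binaries get invoked.
func pluginsHooking(path, command string) ([]string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg configFile
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	var names []string
	for name, attrs := range cfg.Plugins {
		for _, hooked := range strings.Split(attrs["hooks"], ",") {
			if strings.TrimSpace(hooked) == command {
				names = append(names, name)
			}
		}
	}
	return names, nil
}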

Applying the above patch (executes plugins even if not attached to a tty):

Benchmark 1: ./build/docker version
  Time (mean ± σ):       9.3 ms ±   1.0 ms    [User: 3.7 ms, System: 3.1 ms]
  Range (min … max):     6.8 ms …  13.7 ms    211 runs

Benchmark 2: ./build/docker-without-hooks version
  Time (mean ± σ):       9.4 ms ±   2.0 ms    [User: 3.6 ms, System: 3.2 ms]
  Range (min … max):     6.8 ms …  40.6 ms    288 runs

Summary
  ./build/docker version ran
    1.01 ± 0.24 times faster than ./build/docker-without-hooks version

That is for docker version, which no plugin registers a hook for. Measuring with the example plugin enabled, and with RunPluginHooks instrumented to time its execution:

./build/docker pull alpine
Using default tag: latest
latest: Pulling from library/alpine
Digest: sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
Status: Image is up to date for alpine:latest
docker.io/library/alpine:latest

Next steps:
    - Run this image with `docker run alpine`
RunPluginHooks took 12.409638ms
...
RunPluginHooks took 8.642533ms
...
RunPluginHooks took 14.853688ms

from end to end, including calling the hook plugin, which is a lot more reasonable.
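
For reference, the instrumentation is just a timer around the call inside runDocker, along these lines (a sketch, not part of the PR):

// Hypothetical timing instrumentation inside runDocker, around the call
// shown in the diff earlier in this thread:
start := time.Now()
_ = pluginmanager.RunPluginHooks(dockerCli, cmd, subCommand)
fmt.Fprintf(dockerCli.Err(), "RunPluginHooks took %s\n", time.Since(start))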

I still want to do a quick run-through of some codebases to check that no plugin is going to break if we do this.

laurazard avatar Mar 20 '24 02:03 laurazard