Slow DockerBuildImage task due to gradle construction of fileHashes.bin

Open mcayland opened this issue 5 years ago • 4 comments

I've been working on a project to move our CI system over to k8s and have started to experience really slow build times using a DockerBuildImage task with gradle-docker-plugin:6.4.0 and gradle 5.6.2.

My setup is similar to https://github.com/docker-java/docker-java/issues/1409 in that I have a single git repository which contains a handful of projects (most of them frontend projects in node) which looks like this:

projectDir
    - .dockerignore (excludes node_modules)
    - docker
        - Dockerfile
    - node-app-1
        - node_modules
        - app1.js

    - node-app-2
        - node_modules
        - app2.js

    ....

    - node-app-9
        - node_modules
        - app9.js

The Dockerfile simply takes the node build artifacts from each app subdirectory and copies them into the final distributable image. The process is driven by a simple gradle project using a DockerBuildImage task that looks like this:

task buildProd (type: com.bmuschko.gradle.docker.tasks.image.DockerBuildImage) {
    inputDir = project.file("${projectDir}")
    dockerFile = file("docker/Dockerfile")

    images.add("myimage:latest")
}

The problem I find is that the DockerBuildImage task takes a huge amount of time to run compared to running docker build on its own:

  • Docker Build: 4s
  • DockerBuildImage Gradle task: 6m 20s (!)

After doing some digging I discovered that the majority of time appears to be spent creating .gradle/5.6.2/fileHashes/fileHashes.bin which comes to about 4MB for that particular image. Re-running the DockerBuildImage task from the same container (rather than a fresh builder image) reduces the DockerBuildImage task to a much more reasonable 29s.

Reading the gradle documentation at https://docs.gradle.org/current/userguide/more_about_tasks.html#sec:up_to_date_checks, I think the problem is that DockerBuildImage's inputDir property is annotated with @InputDirectory at https://github.com/bmuschko/gradle-docker-plugin/blob/master/src/main/groovy/com/bmuschko/gradle/docker/tasks/image/DockerBuildImage.groovy#L48. This registers the entire directory tree as task inputs, so gradle spends several minutes scanning a tree containing tens of thousands of files to generate fileHashes.bin before the build starts. Unfortunately, since this is done directly by gradle, it is not possible to eliminate the directories we are not interested in via .dockerignore first.
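One way to check how much of the scanned tree is node_modules is to count the files under the directory that inputDir points at, with and without that exclusion. The following is a minimal sketch for build.gradle that uses only Gradle's built-in fileTree; the task name countContextFiles is made up for illustration, and the counts only approximate what gradle actually snapshots:

```groovy
// Hypothetical diagnostic task (the name is illustrative, not part of the plugin).
task countContextFiles {
    doLast {
        // Every file under the directory that inputDir points at.
        def allFiles = fileTree(projectDir).files.size()

        // The same tree with node_modules left out, roughly what .dockerignore keeps.
        def withoutNodeModules = fileTree(projectDir) {
            exclude '**/node_modules/**'
        }.files.size()

        println "Files under ${projectDir}: ${allFiles}"
        println "Excluding node_modules: ${withoutNodeModules}"
    }
}
```

If the first number dwarfs the second, that is consistent with the hashing time being dominated by files that docker build itself would have ignored.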

Given that inputDir was designed to mark the entire build context as gradle task inputs, I'm wondering if the best solution is to introduce a new contextDir property which is basically the same as inputDir but without the @InputDirectory annotation? contextDir could then be set to the base of the project, which mirrors the existing docker build behaviour and in this case should substantially reduce the DockerBuildImage time.
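For comparison, a workaround that needs no plugin change might be to stage a trimmed copy of the context and point inputDir at that instead. This is an untested sketch under several assumptions: the task name stageDockerContext and the docker-context staging directory are invented here, the build directory is assumed to be the default build, and whether it is actually faster depends on gradle skipping excluded directories while it walks the tree:

```groovy
// Sketch only: task name and staging directory are assumptions, not plugin conventions.
task stageDockerContext(type: Sync) {
    from(projectDir) {
        // Mirror the .dockerignore rule so node_modules is left out of the staged context.
        exclude '**/node_modules/**'
        // Keep gradle's own directories (including the staging target) out of the copy.
        exclude '.gradle/**', 'build/**'
    }
    into "${buildDir}/docker-context"
}

task buildProd(type: com.bmuschko.gradle.docker.tasks.image.DockerBuildImage) {
    dependsOn stageDockerContext
    // @InputDirectory now only covers the staged files.
    inputDir = file("${buildDir}/docker-context")
    dockerFile = file("${buildDir}/docker-context/docker/Dockerfile")

    images.add("myimage:latest")
}
```

The trade-off is an extra copy of the context on disk, and the exclude patterns have to be kept in step with .dockerignore by hand.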

mcayland, Sep 04 '20 06:09

Following up on this theory I tried a test on my local workstation with the following quick and dirty patch:

diff --git a/src/main/groovy/com/bmuschko/gradle/docker/tasks/image/DockerBuildImage.groovy b/src/main/groovy/com/bmuschko/gradle/docker/tasks/image/DockerBuildImage.groovy
index 2a4a6d62..17cff9d4 100644
--- a/src/main/groovy/com/bmuschko/gradle/docker/tasks/image/DockerBuildImage.groovy
+++ b/src/main/groovy/com/bmuschko/gradle/docker/tasks/image/DockerBuildImage.groovy
@@ -49,9 +49,17 @@ class DockerBuildImage extends AbstractDockerRemoteApiTask implements RegistryCr
      * Input directory containing the build context. Defaults to "$buildDir/docker".
      */
     @InputDirectory
+    @Optional
     @PathSensitive(PathSensitivity.RELATIVE)
     final DirectoryProperty inputDir = project.objects.directoryProperty()
 
+    /**
+     * Directory containing the build context (without adding contents as task inputs)
+     */
+    @Input
+    @Optional
+    final Property<File> contextDir = project.objects.property(File)
+
     /**
      * The Dockerfile to use to build the image.  If null, will use 'Dockerfile' in the
      * build context, i.e. "$inputDir/Dockerfile".
@@ -196,7 +204,9 @@ class DockerBuildImage extends AbstractDockerRemoteApiTask implements RegistryCr
     final Property<String> imageId = project.objects.property(String)
 
     DockerBuildImage() {
-        inputDir.set(project.layout.buildDirectory.dir('docker'))
+        //inputDir.set(project.layout.buildDirectory.dir('docker'))
+        // FIXME: when inputDir not set this throws "Directory specified for property 'inputDir' does not exist"
+        inputDir.set(project.layout.buildDirectory)
         images.empty()
         noCache.set(false)
         remove.set(false)
@@ -225,16 +235,24 @@ class DockerBuildImage extends AbstractDockerRemoteApiTask implements RegistryCr
 
     @Override
     void runRemoteCommand() {
-        logger.quiet "Building image using context '${inputDir.get().asFile}'."
         BuildImageCmd buildImageCmd
+        File baseDir
+
+        if (contextDir.getOrNull()) {
+            baseDir = contextDir.get()
+        } else {
+            baseDir = inputDir.get().asFile
+        }
+
+        logger.quiet "Building image using context '${baseDir}'."
 
         if (dockerFile.getOrNull()) {
             logger.quiet "Using Dockerfile '${dockerFile.get().asFile}'"
             buildImageCmd = dockerClient.buildImageCmd()
-                    .withBaseDirectory(inputDir.get().asFile)
+                    .withBaseDirectory(baseDir)
                     .withDockerfile(dockerFile.get().asFile)
         } else {
-            buildImageCmd = dockerClient.buildImageCmd(inputDir.get().asFile)
+            buildImageCmd = dockerClient.buildImageCmd(baseDir)
         }
 
         if (images.getOrNull()) {

and in my build.gradle:

task buildProd (type: com.bmuschko.gradle.docker.tasks.image.DockerBuildImage) {
    contextDir = project.file("${projectDir}")
    dockerFile = file("docker/Dockerfile")

    images.add("myimage:latest")
}

Whilst my local workstation has a much higher spec than the server above (faster CPU, more RAM, SSD etc.), I was still able to see a significant speed increase with this change:

  • Docker Build: 2s
  • Vanilla DockerBuildImage Gradle task: 20s
  • Patched DockerBuildImage Gradle task: 5s

mcayland, Sep 04 '20 16:09

Could you please put together a pull request if you'd like to see this change?

bmuschko, Nov 07 '20 04:11

Of course - I just wanted to make sure that you were happy with the analysis (and suggested fallback behaviour) before spending the time on it.

I'm fairly confident this is the main reason people have noted performance issues with the plugin when compared with executing docker build directly.

mcayland, Nov 07 '20 11:11

Honestly, I haven't looked at the proposed changes very closely yet. It's going to be easier to review as a PR.

bmuschko, Nov 07 '20 15:11