box Optimising rebuild conditions beyond layer cache

The layer cache is great, but at times it's not smart enough, or it's opaque about why it decided to rebuild something, and there is only so far we can dig into it. Sometimes it's much easier to use a tagging scheme based on the git revision, check whether the image ID recorded for that tag is present locally, and decide from that whether the build should run at all.
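To make the tagging scheme concrete, here is a minimal sketch of a helper that derives a tag from the git revision. It is an assumption about what such a script could look like, not the contents of the project's actual tag script; the function names `format_tag` and `image_tag` are made up for illustration. Note that in this sketch untracked files also count as a dirty tree.

```shell
# Hypothetical tag helper: prints "<branch>-<short sha>", plus "-WIP"
# when the working tree has uncommitted changes.

# format_tag BRANCH SHORT_SHA DIRTY
# DIRTY is any non-empty string when the tree is dirty, empty otherwise.
format_tag() {
    # Slashes are not allowed in image tags, so replace them with dashes.
    branch=$(printf '%s' "$1" | tr '/' '-')
    if [ -n "$3" ]; then
        printf '%s-%s-WIP\n' "$branch" "$2"
    else
        printf '%s-%s\n' "$branch" "$2"
    fi
}

# image_tag: collect the inputs from git in the current directory.
# "git status --porcelain" prints nothing when the tree is clean.
image_tag() {
    format_tag \
        "$(git rev-parse --abbrev-ref HEAD)" \
        "$(git rev-parse --short HEAD)" \
        "$(git status --porcelain)"
}
```

Keeping the formatting separate from the git calls means the tag format can be checked without a repository; calling `image_tag` inside a checkout prints the tag for the current revision.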

I have a project where the Makefile looks like this:

IMAGE_TAG := $(shell ./tools/image-tag)
IMAGE_NAME := quay.io/weaveworks/launch-generator
UPTODATE := .uptodate/$(IMAGE_TAG)

.PHONY: $(UPTODATE)

image: build

## When the image is built, we output the ID of the image to `.uptodate/$(IMAGE_TAG)`
$(UPTODATE):
	docker image build --tag=$(IMAGE_NAME) --tag=$(IMAGE_NAME):$(IMAGE_TAG) --build-arg=version_tag=$(IMAGE_TAG) .
	docker image inspect -f '{{.Id}}' $(IMAGE_NAME):$(IMAGE_TAG) > $@

## We need to quickly decide if docker build needs to run at all or not
## - always build if `$(UPTODATE)` file is missing
## - always build if `$(IMAGE_TAG)` ends with `-WIP`, i.e. there are uncommitted changes
## - otherwise:
##   - if `$(IMAGE_NAME):$(IMAGE_TAG)` exists already
##     - if `$(UPTODATE)` file exists, then check its contents against the ID of `$(IMAGE_NAME):$(IMAGE_TAG)`
##     - else run the build, because the image can be stale with respect to the source tree
##   - else run the build anyway
build: Makefile Dockerfile src/*.js package.json
	@mkdir -p $(dir $(UPTODATE))
	@if ! [ -e $(UPTODATE) ] ; then \
	  $(MAKE) $(UPTODATE) ; \
	else \
	  if echo "$(IMAGE_NAME):$(IMAGE_TAG)" | grep -q -e '-WIP$$' ; then \
	    $(MAKE) $(UPTODATE) ; \
	  else \
	    if [ $$(docker image ls -q $(IMAGE_NAME):$(IMAGE_TAG) | wc -l) -eq 1 ] ; then \
	      if [ -e $(UPTODATE) ] ; then \
	        if ! [ $$(docker image inspect -f '{{.Id}}' $(IMAGE_NAME):$(IMAGE_TAG)) = $$(cat $(UPTODATE)) ] ; then \
	          $(MAKE) $(UPTODATE) ; \
	        fi \
	      else \
	        $(MAKE) $(UPTODATE) ; \
	      fi \
	    else \
	      $(MAKE) $(UPTODATE) ; \
	    fi \
	  fi \
	fi

local: image
	./run-locally.sh $(IMAGE_NAME):$(IMAGE_TAG)

test: image run-unit-tests.sh run-integration-tests.sh .jshintrc
	./run-unit-tests.sh $(IMAGE_NAME):$(IMAGE_TAG)
	./run-integration-tests.sh $(IMAGE_NAME):$(IMAGE_TAG)

clean:
	rm -r -f .uptodate
	docker image ls -q $(IMAGE_NAME) | sort | uniq | xargs docker image rm -f

And a rather simple Dockerfile:

FROM node:6-onbuild

ARG version_tag

ENV VERSION_TAG=${version_tag}

EXPOSE 8080

This Makefile is not amazing, but it can decide very quickly whether the build needs to run at all. I could probably convert this to use box, but I'm not 100% sure how, or to what extent I'd be able to simplify the Makefile. It's possible that some of this can already be done with box, but there may be a need for helpers that allow either running local commands or referencing image IDs. Or maybe I'm missing the point entirely?
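For reference, the nested conditionals in the `build` target reduce to four early exits. The following is a minimal sketch of that decision as a standalone shell function, so the kind of helper being asked for is easier to see. The name `needs_build` and its argument order are hypothetical; the caller is assumed to look up the current image ID itself.

```shell
# needs_build TAG STAMP_FILE CURRENT_ID
# Returns 0 when a build should run, 1 when it can be skipped.
# CURRENT_ID is the local image ID for the tag, or empty if no such image exists.
needs_build() {
    tag=$1
    stamp=$2
    current_id=$3
    # No stamp file: nothing is recorded about a previous build.
    [ -e "$stamp" ] || return 0
    # Tag ends with -WIP: uncommitted changes, the tag does not pin the source.
    case $tag in *-WIP) return 0 ;; esac
    # No local image under this tag.
    [ -n "$current_id" ] || return 0
    # Local image differs from the one recorded by the last build.
    [ "$current_id" = "$(cat "$stamp")" ] || return 0
    return 1
}

# Example wiring (IMAGE and TAG are placeholders set by the caller):
#   id=$(docker image inspect -f '{{.Id}}' "$IMAGE:$TAG" 2>/dev/null || true)
#   if needs_build "$TAG" ".uptodate/$TAG" "$id"; then make ".uptodate/$TAG"; fi
```

The function only compares strings and checks for a file, so the one call to docker happens in the caller, which is what keeps the skip path fast.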

Apr 04 '17 16:04 errordeveloper

What if #202 returned a reference which you could later use with a compose DSL keyword that accepts a list of layer IDs? There would also be a function to query the existing image store for layer IDs.

I've been mulling over how to do image rebase in the DSL and this is what I have so far. Seems like paths are converging around this functionality.

Apr 04 '17 17:04 erikh

ref1 = layer { run "ls" }
ref2 = layer { run "apt-get update" }
imgrefs = getrefs("golang:latest")

compose [*imgrefs, ref1, ref2]

Apr 04 '17 17:04 erikh

ok I re-read this and I think I see what you're getting at, although I'm not certain how to accomplish it. You want to seed the cache after using from to initialize it?

overmount gives us a lot more access to the cache's contents so perhaps we can do something once that has deeper integration. Maybe a re-targetable from?

Apr 12 '17 15:04 erikh