Debugging a build description is painful
This is more of a feature request.
When I work on a build description, I find myself repeating these steps over and over:
- kubectl delete -f build.yaml --ignore-not-found=true && kubectl apply -f build.yaml
- Do the usual kung-fu to read the logs
- Realize there's an error. Not a syntax error but something I forgot or didn't understand when writing the yaml descriptor.
- Change something
- kubectl delete -f build.yaml
- Go to 1
What would be super nice is being able to run a build "locally" with only Docker installed, using some kind of wrapper that reproduces the logic of the build CRD without all the complexity added by Kubernetes. No steps, no initialization containers, direct printing of the logs on stdout...
Something exactly like https://github.com/GoogleCloudPlatform/container-builder-local
Most CI/CD systems fall short of providing a local story that allows faster debugging. And by "local", I don't mean using minikube or D4D. Those are not local. Those are remote clusters that are not too far away.
Having access to the workspace locally, both while the build is running and after it has finished, is a huge help too.
I think this bug might be conflating two things:
- a bug: "descriptor errors are hard to debug"
- a feature request: "I want a local off-cluster executor"
For the first, #8 should help a bit, since we can reject invalid build configs with custom hand-written error messages. If you have examples of specific build descriptor errors that were frustrating to debug, let me know and we'll make sure they're better in the new validation scheme.
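For illustration, here's a hedged sketch of the kind of non-syntax mistake that only surfaces at runtime today (the names and images are made up): the step below uses a ${IMAGE} placeholder as if a template parameter were in scope, but no template is referenced, so the string is passed through literally and the failure only shows up deep in the step's logs rather than at apply time.

```yaml
# Hypothetical example of a descriptor mistake that today only surfaces at runtime.
apiVersion: build.knative.dev/v1alpha1
kind: Build
metadata:
  name: broken-build
spec:
  source:
    git:
      url: https://github.com/example/repo.git
      revision: master
  steps:
  - name: build-and-push
    image: gcr.io/kaniko-project/executor
    args:
    - --dockerfile=/workspace/Dockerfile
    - --destination=${IMAGE}   # no template is used, so ${IMAGE} is never substituted
```

This is the sort of case where an up-front rejection with a clear message would save a whole delete/apply/read-logs cycle.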
For the second, I don't think we plan to provide a Build CRD executor that doesn't run on and take advantage of Kubernetes, because that would require its own custom implementation separate from Kubernetes, inevitably with its own divergent set of bugs, which defeats the whole point. In the case of container-builder-local, the code in that package is literally the same code that runs on the GCB worker VM, so the implementation doesn't diverge (...mostly...).
One strong option would be to integrate with Skaffold so that you can quickly iterate on your build definitions, with source from your own local workspace. That would definitely give a tighter feedback loop than having to push to Git or GCS, and since the Build is executing on a Kubernetes cluster (incl. Minikube), you've got the same underlying execution layer.
AFAIK Skaffold doesn't have a facility to persist the cluster's workspace after the build executes (or fails), so maybe it only solves half of the problem. I'm sure we could come up with something though, if it's a significant problem.
How is that different from reimplementing the whole logic? Skaffold would have to do that, right?
Perhaps I have an incomplete understanding of Skaffold's model, but I had assumed it would execute the build on the cluster (local or remote) that it uses to execute the rest of its workload.
I believe this is how CBI (another CRD for describing container builds) proposes to integrate with Skaffold. Please correct me if I'm wrong.
@ImJasonH You are right, Skaffold could launch a patched version of the build using kubectl apply
This would:
- Remove the need to push to git or GCS. (How do we pass the sources to the build then?)
- Could do the log-fu for us. \o/
- Could clean up the build for us afterwards
It's clearly missing the retrieval of the workspace, though. How would I do that right now? Can I access the workspace after the pod has completed or failed?
It's not possible today, but I could imagine some changes that would enable it. Maybe some option to specify the volume that backs /workspace, instead of the emptyDir it is today? Then you could specify a persistent volume that you can inspect afterwards.
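A rough sketch of what that could look like, assuming a hypothetical workspaceVolume field that does not exist in the Build API today:

```yaml
# Hypothetical sketch only: /workspace is always an emptyDir today.
# This imagines an option to back it with a PersistentVolumeClaim so the
# contents survive after the build completes or fails.
apiVersion: build.knative.dev/v1alpha1
kind: Build
metadata:
  name: debuggable-build
spec:
  # Imaginary field, not part of the current API.
  workspaceVolume:
    persistentVolumeClaim:
      claimName: build-workspace-pvc
  steps:
  - name: compile
    image: golang:1.10
    workingDir: /workspace
    command: ["go", "build", "./..."]
```

You could then mount build-workspace-pvc into a throwaway debug pod (and kubectl cp out of it) to poke around the workspace after the fact.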
I think the Skaffold approach would still need to upload the local workspace to GCS (possibly incrementally!) to fetch it in the build. It looks like Skaffold requires this today anyway for, e.g., kaniko builds.
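For reference, a hedged sketch of what such a wrapper might apply after tarring up and uploading the local workspace, assuming the GCS Archive source type behaves as described above (bucket, object, and image names are placeholders):

```yaml
# Sketch: a Skaffold-style wrapper uploads the local workspace as an archive,
# then applies a Build that fetches it via the GCS source.
apiVersion: build.knative.dev/v1alpha1
kind: Build
metadata:
  name: local-workspace-build
spec:
  source:
    gcs:
      type: Archive
      location: gs://my-staging-bucket/workspace-snapshot.tar.gz
  steps:
  - name: build
    image: gcr.io/kaniko-project/executor
    args:
    - --dockerfile=/workspace/Dockerfile
    - --destination=gcr.io/my-project/app:debug
```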
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to Knative Productivity Slack channel or knative/test-infra.
/lifecycle stale