buildkit icon indicating copy to clipboard operation
buildkit copied to clipboard

[Feature Request] --progress=verbose

Open ruro opened this issue 5 years ago • 7 comments

A similar issue was already raised in the past (#526), but it got closed after the --progress option was implemented. In my opinion, the current --progress option conflates "tty support" and "output verbosity". As a result, there is currently no way to get a "verbose" human-readable output format.


For example, running docker build --no-cache --progress=tty with

this Dockerfile

FROM alpine:latest as base

FROM base as a
RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done

FROM base as b
RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done

FROM base
COPY --from=a . ./
COPY --from=b . ./
shows this log during the actual building process
[+] Building 10.1s (4/8)                                                                                                                                                      
 => [internal] load build definition from Dockerfile                                                                                                                     0.9s
 => => transferring dockerfile: 263B                                                                                                                                     0.0s
 => [internal] load .dockerignore                                                                                                                                        1.3s
 => => transferring context: 2B                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/alpine:latest                                                                                                         0.0s
 => CACHED [base 1/1] FROM docker.io/library/alpine:latest                                                                                                               0.0s
 => [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done                                                                                                       8.6s
 => => # b2                                                                                                                                                                  
 => => # b3                                                                                                                                                                  
 => => # b4                                                                                                                                                                  
 => => # b5                                                                                                                                                                  
 => => # b6                                                                                                                                                                  
 => => # b7                                                                                                                                                                  
 => [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done                                                                                                       8.6s
 => => # a1                                                                                                                                                                  
 => => # a2                                                                                                                                                                  
 => => # a3                                                                                                                                                                  
 => => # a4                                                                                                                                                                  
 => => # a5                                                                                                                                                                  
 => => # a6                                                                                                                                                                  
but then collapses the output to this shorter log after the build is complete.
[+] Building 19.5s (9/9) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                     0.9s
 => => transferring dockerfile: 263B                                                                                                                                     0.0s
 => [internal] load .dockerignore                                                                                                                                        1.3s
 => => transferring context: 2B                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/alpine:latest                                                                                                         0.0s
 => CACHED [base 1/1] FROM docker.io/library/alpine:latest                                                                                                               0.0s
 => [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done                                                                                                      12.5s
 => [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done                                                                                                      13.3s
 => [stage-3 1/2] COPY --from=a . ./                                                                                                                                     1.4s 
 => [stage-3 2/2] COPY --from=b . ./                                                                                                                                     1.3s 
 => exporting to image                                                                                                                                                   1.1s 
 => => exporting layers                                                                                                                                                  1.1s 
 => => writing image sha256:afaa2ffa1a1dc5f77d93851187c3ebd623599a7ae5cab7ac9a969c2e5b1a0487                                                                             0.0s 

This behavior is undesirable, since most developers (in my experience) want to be able to look at the full logs during the build and after the build. The currently "recommended" way to get the full logs is to use --progress=plain.

However, running docker build --no-cache --progress=plain with the same Dockerfile results in

this unreadable mess, where the logs from different parallel stages are mixed together.

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.8s

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 263B done
#1 DONE 1.1s

#3 [internal] load metadata for docker.io/library/alpine:latest
#3 DONE 0.0s

#4 [base 1/1] FROM docker.io/library/alpine:latest
#4 CACHED

#5 [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done
#5 3.135 a1
#5 4.136 a2
#5 5.137 a3
#5 6.137 a4
#5 7.138 a5
#5 8.139 a6
#5 9.139 a7
#5 ...

#7 [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done
#7 3.062 b1
#7 4.062 b2
#7 5.063 b3
#7 6.063 b4
#7 7.064 b5
#7 8.065 b6
#7 9.066 b7
#7 10.07 b8
#7 ...

#5 [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done
#5 10.14 a8
#5 11.14 a9
#5 12.14 a10
#5 ...

#7 [b 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done
#7 11.07 b9
#7 12.07 b10
#7 DONE 13.1s

#5 [a 1/1] RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done
#5 DONE 13.9s

#6 [stage-3 1/2] COPY --from=a . ./
#6 DONE 1.2s

#8 [stage-3 2/2] COPY --from=b . ./
#8 DONE 2.4s

#9 exporting to image
#9 exporting layers
#9 exporting layers 1.4s done
#9 writing image sha256:1a7aca4ff2346858fe26dbc7fac58687c369622baf2ab8ad849ac8bcbe47a9b1 0.0s done
#9 DONE 1.4s

So in short, --progress=tty only shows the last 6 lines from each build and clears the logs after the build is done and --progress=plain can't handle parallel builds properly and is overall actually quite ugly. In my opinion, we really need a --progress=verbose option, which would act exactly like --progress=tty, but the logs will not be limited to 6 lines and they will not get erased after the build step is complete.

I might be wrong, but I can't imagine that this behavior would be hard to implement, since you just need to turn off the log "truncation" at 7 lines and after the build step finishes.

ruro avatar Jul 05 '20 01:07 ruro

I might be wrong, but I can't imagine that this behavior would be hard to implement,

This actually seems impossible to implement. There are two modes of writing the progress: to the terminal and to the text stream. If we write to text stream then there is no way to inject anything in between the already written rows. So the only way is to block or to do (kind-of) smart switching like we do today. Otoh on tty mode we only have the screen coordinates to work with, we can replace the lines in current screen but we can't write anything outside of it or have content flowing.

I think the closest that you are asking and that would be useful would be to just keep the current tty mode but once the build is over and it reprints the summary, print all the logs as well and not hide them.

tonistiigi avatar Jul 06 '20 18:07 tonistiigi

Also, some tips are to pipe the plain progress to sort if you don't care about it blocking. What people have asked before is to also make the vertex switching configurable so you can set ENV that define how much you want the stream to block or switch.

tonistiigi avatar Jul 06 '20 18:07 tonistiigi

@tonistiigi I am not sure, that your assertion about how the tty mode currently works is true (at least it appears to work differently to me). For example, consider that when the build process starts, the terminal is not cleared and the only line that appears below the shell prompt is

[+] Building 0.0s (0/2)

Then, as the build progresses, lines are constantly added after this line. Meanwhile this line is constantly updated to keep track of the time, ending with

[+] Building 19.5s (9/9) FINISHED

The same is true about the "scrolling" logs. At the beginning, the logs are only 1 line high and are then growing until they reach 6 lines. So clearly, the size of the "virtual" terminal is constantly changing throughout the build. I see no theoretical reason, why the logs couldn't just continue growing past the current 6 lines.

Unfortunately, your suggestion about sorting the output of --progress=plain doesn't quite work.

  1. Piping to sort or even sort -n/-h doesn't work because the # sign before the number forces alphabetical sorting, which puts #10 between #1 and #2.
  2. All the empty lines get sorted to the top.
  3. The lines inside each numbered block also get sorted.
  4. The blocks are not contiguous and are instead broken up by #123 ...

Theoretically, I could write some program, that would properly parse the plain output and format it in the way I described in my original post, but it isn't really a feasible solution IMO as that would require me to ship a parser/printer program everywhere I want to build docker images, which is kind of ridiculous to be honest.

I really think, that having a workable human-readable output format is really the job of buildkit. If buildkit wants to eventually become the default docker build implementation, you will at least need to have a comparable quality of output and parallel builds are not a good enough excuse IMO.

ruro avatar Jul 06 '20 19:07 ruro

are then growing until they reach 6 lines

The 6 limit is of course arbitrary. I'm just saying that it can't be ever-growing. The way it keeps the 6 line limit is monitoring your current screen size, finding out if there is space for the logs, finding the lines where to write the logs are on your screen, clearing these lines, and then rewriting with new content.

@tiborvass I think I saw you using sort, did you set some special flags. Obviously this is just a workaround and not perfect.

tonistiigi avatar Jul 06 '20 20:07 tonistiigi

Ah. After some experimenting with the current --progress=tty I now understand what you mean. The current tty output mode is indeed dependent on the tty size. For example, if there are more lines in the output than the height of your terminal, then the output doesn't scroll, but instead gets truncated.

Dockerfile with really long output
FROM alpine:latest as base   

FROM base as a00 RUN touch a00
FROM a00 as a01 RUN touch a01
FROM a01 as a02 RUN touch a02
FROM a02 as a03 RUN touch a03
FROM a03 as a04 RUN touch a04
FROM a04 as a05 RUN touch a05
FROM a05 as a06 RUN touch a06
FROM a06 as a07 RUN touch a07
FROM a07 as a08 RUN touch a08
FROM a08 as a09 RUN touch a09
FROM a09 as a10 RUN touch a10
FROM a10 as a11 RUN touch a11
FROM a11 as a12 RUN touch a12
FROM a12 as a13 RUN touch a13
FROM a13 as a14 RUN touch a14
FROM a14 as a15 RUN touch a15
FROM a15 as b00 RUN touch b00
FROM b00 as b01 RUN touch b01
FROM b01 as b02 RUN touch b02
FROM b02 as b03 RUN touch b03
FROM b03 as b04 RUN touch b04
FROM b04 as b05 RUN touch b05
FROM b05 as b06 RUN touch b06
FROM b06 as b07 RUN touch b07
FROM b07 as b08 RUN touch b08
FROM b08 as b09 RUN touch b09
FROM b09 as b10 RUN touch b10
FROM b10 as b11 RUN touch b11
FROM b11 as b12 RUN touch b12
FROM b12 as b13 RUN touch b13
FROM b13 as b14 RUN touch b14
FROM b14 as b15 RUN touch b15
FROM b15 as c00 RUN touch c00
FROM c00 as c01 RUN touch c01
FROM c01 as c02 RUN touch c02
FROM c02 as c03 RUN touch c03
FROM c03 as c04 RUN touch c04
FROM c04 as c05 RUN touch c05
FROM c05 as c06 RUN touch c06
FROM c06 as c07 RUN touch c07  
FROM c07 as c08 RUN touch c08
FROM c08 as c09 RUN touch c09
FROM c09 as c10 RUN touch c10
FROM c10 as c11 RUN touch c11
FROM c11 as c12 RUN touch c12
FROM c12 as c13 RUN touch c13
FROM c13 as c14 RUN touch c14
FROM c14 as c15 RUN touch c15
FROM c15 as d00 RUN touch d00
FROM d00 as d01 RUN touch d01
FROM d01 as d02 RUN touch d02
FROM d02 as d03 RUN touch d03
FROM d03 as d04 RUN touch d04
FROM d04 as d05 RUN touch d05
FROM d05 as d06 RUN touch d06
FROM d06 as d07 RUN touch d07
FROM d07 as d08 RUN touch d08
FROM d08 as d09 RUN touch d09
FROM d09 as d10 RUN touch d10
FROM d10 as d11 RUN touch d11
FROM d11 as d12 RUN touch d12
FROM d12 as d13 RUN touch d13  
FROM d13 as d14 RUN touch d14
FROM d14 as final RUN touch d15

Damn, what a bummer. I am once again confronted by the fact that sh/bash/term are such a huge load of horrible legacy nightmares. And this limitation really seems to be quite deep in the animal brain of VT100. 😩

Hmm, okay. Can we think of some workaround/compromise involving partial "blocking/buffering" with --progress=plain during parallel stages? The current breaking of blocks with ... is really unusable IMO.

I guess, I am thinking of something like the current output format of docker build, but in the event, that there are parallel build steps, the logs are printed in their order of appearance in the Dockerfile.

Maybe something along the lines of this

(Assuming same Dockerfile as in original post)

Sending build context to Docker daemon  2.048kB
 ---> DONE 4.4s

Step [base 1/1] : FROM docker.io/library/alpine:latest
 ---> e7d92cdc71fe CACHED

Step [a 1/1] : RUN for i in $(seq 1 10); do sleep 1; echo "a$i"; done   // Step [b 1/1] is also concurrently running at this point 
a1                                                                      // But its output is getting buffered and will only
a2                                                                      // be displayed after Step [a 1/1] completes
a3
a4
a5
a6
a7
a8
a9
a10
 ---> cc80f0af4e2e DONE 14.0s

Step [b 1/1] : RUN for i in $(seq 1 10); do sleep 1; echo "b$i"; done
b1
b2
b3
b4
b5
b6
b7
b8
b9
b10
 ---> d82096840c4c DONE 15.3s

Step [stage-3 1/2] : COPY --from=a . ./
 ---> 00b49df6d0f4

Step [stage-3 2/2] : COPY --from=b . ./
 ---> a003b71c5ad6

Successfully built a003b71c5ad6

ruro avatar Jul 06 '20 21:07 ruro

I wonder if anyone was able to solve the mixed up stdout that github actions log streams when buildkit runs with --progress=plain ? it builds stages of dockerfile in parallel and writes logs in a mix as well, so it's impossible to understand what is what. is there a solution for that? (since i see this thread is still open, maybe a WIP?)

bss-dmitry-shmakov avatar Feb 22 '24 16:02 bss-dmitry-shmakov