docker-py

Option to report progress while building context

Open hamiltont opened this issue 11 years ago • 9 comments

When building an exceptionally large context, there is no way to report progress. I'd like to add a progress=False option to the util's make context method, so that people can call the function manually and set progress=True if they want it.

To see the problem in action, use this repo (note the docker branch) and run toolset/run-tests.py --install server --docker --test '' (you will need Python 2.7 and the requirements listed here). Building the context can sometimes take up to 15 minutes (note: this is on an NFS filesystem, which is notoriously slow when tarring a lot of small files) and normally takes about 5 minutes (on my local SSD).

hamiltont avatar Oct 25 '14 22:10 hamiltont

+1; I would like to see an option like this too. Even if it just gave back results per step, instead of everything at the end, that would be great.

cglewis avatar Oct 26 '14 17:10 cglewis

Can't you use stream=True and get the streaming output as it is created?

defunctzombie avatar Nov 09 '14 00:11 defunctzombie

@defunctzombie that option reports the output for all the Dockerfile commands. However, running docker build involves two steps: 1) tar the current directory, 2) run docker build. I'm asking for a way to report the progress of the "tar" step, which can take quite a while in a large directory. Here's a brief explanation of why it tars, stolen from here:

the client is tar/compressing the directory (and all subdirectories) where you executed docker build. Yeah, that's right. If you execute this in your root directory, your whole drive will get tar'd and sent to the docker daemon. Caveat something. Generally that's a mistake you only make once. Anyways, the build gets run by the daemon, not the client, so the daemon needs the whole directory that included (hopefully) the Dockerfile and any other local files needed for the build. That's the context.
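
To make the request concrete, here is a minimal sketch of what progress reporting during the tar step could look like, using only the standard library. The names tar_with_progress and on_progress are hypothetical and not part of docker-py, and the sketch ignores .dockerignore, empty directories and symlinked directories.

```python
import os
import tarfile
import tempfile


def tar_with_progress(root, on_progress=None):
    """Tar every file under `root` into a temporary file.

    `on_progress(done, total, name)` is a hypothetical callback that is
    called once after each file has been added to the archive.
    """
    # Collect the paths first so the total is known before tarring starts.
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirpath, filename))
    paths.sort()

    fileobj = tempfile.TemporaryFile()
    with tarfile.open(mode="w", fileobj=fileobj) as tar:
        for done, path in enumerate(paths, start=1):
            arcname = os.path.relpath(path, root)
            tar.add(path, arcname=arcname, recursive=False)
            if on_progress is not None:
                on_progress(done, len(paths), arcname)

    # Rewind so the caller can read the archive from the beginning.
    fileobj.seek(0)
    return fileobj
```

As far as I know, recent docker-py versions accept such a prebuilt archive through the fileobj and custom_context=True arguments of build, but I have not tested this sketch against a live daemon.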

hamiltont avatar Nov 09 '14 02:11 hamiltont

I also ran into this lack of feedback. It would be nice to have an option. +1

vyivanov avatar Sep 26 '15 16:09 vyivanov

I monkey patched docker/api/build.py to add progress info via the class below. This class wraps a file object and reports how much of it has been read. Once this class is defined, you can see progress by adding context = StreamPrinter(context) before we make the _post call.

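# Imports and logger that the class below relies on
import datetime
import logging

log = logging.getLogger(__name__)

class StreamPrinter(object):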
    '''
    Wrap a file object and print out read progress. Used in conjunction with
    the 
    '''
    def __init__(self, file):
        self.read_so_far = 0
        self.starttime = datetime.datetime.utcnow()
        self.file = file

    def __iter__(self, *args, **kwargs):
        self.starttime = datetime.datetime.utcnow()
        return self

    def chk_print(self, fn):
        # Floor division keeps whole megabytes on Python 3 as well as Python 2.
        old_megs = self.read_so_far // 1024 // 1024
        out = fn()
        self.read_so_far += len(out)
        new_megs = self.read_so_far // 1024 // 1024
        # Log only when another whole megabyte has been read.
        if old_megs != new_megs:
            log.debug('Sent %s megabytes. Time elapsed: %s', new_megs,
                datetime.datetime.utcnow() - self.starttime)
        return out

    def __next__(self):
        return self.chk_print(fn = lambda: next(self.file))

    # Python 2 looks up next(); Python 3 looks up __next__().
    next = __next__

    def read(self, *args, **kwargs):
        return self.chk_print(fn = lambda: self.file.read(*args, **kwargs))

This is still a bit messy and I'm not sure that this is the best approach, which is why I didn't make a pull request. But thought others may find it useful.

speedplane avatar Feb 09 '16 10:02 speedplane

+1

danielwhatmuff avatar Feb 22 '18 12:02 danielwhatmuff

+1

StoneJia avatar May 01 '18 22:05 StoneJia

https://gabrieldemarmiesse.github.io/python-on-whales/ can help. Disclaimer: I made this package

gabrieldemarmiesse avatar Nov 08 '20 15:11 gabrieldemarmiesse

docker-py provides a "low-level" API which can be used to show progress. It spits out a stream of statuses for each layer, like this:

{
    "status": "Downloading",
    "progressDetail": {
        "current": 944511,
        "total": 30916861
    },
    "progress": "[=>                                                 ]  944.5kB/30.92MB",
    "id": "dfe669f82390"
}
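
As a rough sketch of how one such status could be turned into a percentage: percent_done is a hypothetical helper, not part of docker-py, and it assumes that some statuses carry no byte counts in progressDetail.

```python
def percent_done(status_line):
    """Return the completion percentage for one decoded status dict.

    Returns None when the status carries no usable byte counts.
    """
    detail = status_line.get("progressDetail") or {}
    current = detail.get("current")
    total = detail.get("total")
    if current is None or not total:
        return None
    # Clamp in case the reported current value overshoots the total.
    return min(100.0, 100.0 * current / total)


line = {
    "status": "Downloading",
    "progressDetail": {"current": 944511, "total": 30916861},
    "id": "dfe669f82390",
}
print(percent_done(line))
```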

I wrote a progress bar for these layers - sharing it here if you find it useful.

# Progress bar for docker image pull through docker-py

import docker
from rich.progress import Progress

tasks = {}

# Show task progress (red for download, green for extract)
def show_progress(line, progress):
    if line.get('status') == 'Downloading':
        task_key = f'[red][Download {line["id"]}]'
    elif line.get('status') == 'Extracting':
        task_key = f'[green][Extract  {line["id"]}]'
    else:
        # skip other statuses
        return

    # progressDetail may lack byte counts, so skip lines without them
    detail = line.get('progressDetail') or {}
    if 'current' not in detail or 'total' not in detail:
        return

    # Create the task on first sight, then always update it
    if task_key not in tasks:
        tasks[task_key] = progress.add_task(task_key, total=detail['total'])
    progress.update(tasks[task_key], completed=detail['current'])

def image_pull(image_name):
    print(f'Pulling image: {image_name}')
    with Progress() as progress:
        client = docker.from_env()
        resp = client.api.pull(image_name, stream=True, decode=True)
        for line in resp:
            show_progress(line, progress)

if __name__ == '__main__':
    # Pull a large image
    IMAGE_NAME = 'bitnami/pytorch'
    image_pull(IMAGE_NAME)

The output shows a pink progress bar for each layer being downloaded and a green one for each layer being extracted.


hmurari avatar Feb 03 '23 00:02 hmurari