How to use pangeo-stacks images with a dask-labextension layout in binder repos?
So I'm trying to work on https://github.com/pangeo-data/pangeo-tutorial-agu-2018/issues/14.
I decided to use the pangeo/pangeo-notebook-onbuild:2019.04.19 Docker image, as found in several recent Pangeo deployments. This seems to work; however, I've lost the dask-labextension layout, and I'm not sure what I should do.
Looking at https://github.com/pangeo-data/pangeo-cloud-federation/tree/staging/deployments/nasa/image/binder or https://github.com/pangeo-data/pangeo-cloud-federation/tree/staging/deployments/ocean/image/binder, there seem to be some postBuild config files, but no dask-labextension layout.
So what is the correct configuration to get a basic Pangeo notebook image with a nice dask-labextension layout?
/cc @ian-r-rose who might know
@guillaumeeb here is a demo I wrote to show how to set up a new layout that works on binder. It takes a bit of work, but is doable. The layout is stored in a jupyterlab-workspace file, which you distribute with the binder repo.
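For reference, the start file in such a setup just needs to import the saved workspace and then hand off to binder's default command. A minimal sketch of that idea (the workspace filename and paths below are illustrative, not taken from the demo):

```python
#!/usr/bin/env python3
# Illustrative binder `start` wrapper written in Python (a start file can be any
# executable). The workspace filename below is a made-up example.
import os
import subprocess
import sys

WORKSPACE = "binder/layout.jupyterlab-workspace"  # hypothetical path

# Load the saved JupyterLab layout into the user's workspace store.
subprocess.check_call(["jupyter", "lab", "workspaces", "import", WORKSPACE])

# Hand control back to whatever command binder/repo2docker wanted to run.
os.execvp(sys.argv[1], sys.argv[1:])
```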
I don't think our current onbuild setup supports the start file syntax, something that is currently baked into how we use the jupyterlab-workspace features.
Q for @yuvipanda - were there challenges getting the start file entrypoint to work or is this a feature we could implement?
Q for @ian-r-rose - have you heard talk of repo2docker supporting the workspace spec as a known configuration file? This may be an interesting proposal that would eliminate the need for the start file in this use case.
@jhamman I have not heard any talk of that, but it's a neat idea. There is currently no formal spec for workspace files (though it would be nice to have one), so it would be up to the user to provide a well-formed one for their particular binder setup. But it would certainly help in cutting down on the boilerplate start script flimflam (which, as we have seen, is pretty error-prone).
@jhamman we can totally support 'start' in onbuild. I didn't implement it mostly to get an MVP out fast. The way to do that would be:
- Implement our own Entrypoint that is called all the time
- If we have a custom start file, it'll call that. If not, it'll just fall back to the default command being called.
Basically, we need to re-implement https://github.com/jupyter/repo2docker/blob/master/repo2docker/buildpacks/repo2docker-entrypoint in r2d_overlay.py.
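Roughly, something along these lines (a sketch only, with assumed helper names and paths; not the actual r2d_overlay.py code):

```python
#!/usr/bin/env python3
# Sketch of the proposed entrypoint logic; helper names and paths are assumptions.
import os
import sys


def binder_path(name):
    # repo2docker-style lookup: prefer binder/<name>, fall back to the repo root.
    for prefix in ("binder", "."):
        candidate = os.path.join(prefix, name)
        if os.path.exists(candidate):
            return candidate
    return None


def main():
    start = binder_path("start")
    if start:
        # Custom start file: make sure it is executable, then let it take over.
        # (It is expected to end with `exec "$@"`, as repo2docker documents.)
        os.chmod(start, 0o755)
        os.execv(os.path.abspath(start), [start] + sys.argv[1:])
    else:
        # No start file: fall back to running the default command unchanged.
        os.execvp(sys.argv[1], sys.argv[1:])


if __name__ == "__main__":
    main()
```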
IMO, the bug is possibly in workspaces needing base_url. See https://github.com/jupyterlab/jupyterlab/issues/5977 for more details. Changing that would fix the start-related issues, and also make this much more robust in a lot of use cases. Based on https://github.com/jupyterlab/jupyterlab/issues/5977#issuecomment-465864078 it's unclear why it is needed :)
Hey, I am having a hack at this. See https://github.com/scollis/pangeo-stacks/blob/addstart/onbuild/r2d_overlay.py#L112
One thing I don't understand (I am a Docker noob) is where to put it here: https://github.com/pangeo-data/pangeo-stacks/blob/4c90b98836c66403ab81ca837ce979ec9628a232/onbuild/Dockerfile#L15
Is it `ENTRYPOINT RUN /usr/local/bin/r2d_overlay.py start`?
@scollis something like that! One addition to your start script would be to make sure it works when there's no 'start' script present. In that case, it should default to calling /usr/local/bin/repo2docker-entrypoint (https://github.com/jupyter/repo2docker/blob/master/repo2docker/buildpacks/base.py#L182) which will default to what repo2docker does.
You should probably also just pass the path directly instead of passing it as an arg to /bin/bash.
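In the Python overlay that difference looks roughly like this (the path is illustrative):

```python
import subprocess

start = "/home/jovyan/binder/start"  # illustrative fully qualified path

# Wrapping the script in a shell forces bash and ignores its shebang:
subprocess.check_call(["/bin/bash", start])

# Invoking the (chmod +x'ed) path directly lets the script's own shebang decide:
subprocess.check_call([start])
```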
Thank you for working on this!
@yuvipanda if a start script is present, should it run it and then run /usr/local/bin/repo2docker-entrypoint?
@scollis I think it should only run repo2docker-entrypoint if a start script is not present...
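In other words, it is one branch or the other, never both. A compact sketch of that decision (the start path here is an assumption; the real overlay resolves it more carefully):

```python
import os
import sys

R2D_ENTRYPOINT = "/usr/local/bin/repo2docker-entrypoint"
START = "binder/start"  # simplified assumption for this sketch

if os.path.exists(START):
    # Start script present: it takes over (and should exec "$@" itself).
    os.execv(os.path.abspath(START), [START] + sys.argv[1:])
else:
    # No start script: behave exactly like a plain repo2docker image.
    os.execv(R2D_ENTRYPOINT, [R2D_ENTRYPOINT] + sys.argv[1:])
```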
Awesome. I am at ORNL and just about to leave, so I'm pushing a Docker image to Docker Hub now; I don't think the hotel wifi can handle a 30 GB upload once I'm back :D
@yuvipanda "You should probably also just pass the path directly instead of passing it as an arg to /bin/bash."
I am copying what is done in postbuild..
so you are saying I should do
```python
# Snippet for r2d_overlay.py -- `become`, `NB_UID`, and `binder_path` are
# defined earlier in that file.
import os
import subprocess


@become(NB_UID)
def apply_start():
    st_path = binder_path('start')
    if os.path.exists(st_path):
        return [
            f'chmod +x {st_path}',
            # since st_path is a fully qualified path, no need to add a ./
            f'{st_path}'
        ]


# Enable additional actions in the future
applicators = [apply_start]

for applicator in applicators:
    commands = applicator()
    if commands:
        for command in commands:
            subprocess.check_call(
                command, shell=True, preexec_fn=applicator._pre_exec
            )
```
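(The snippet relies on helpers defined earlier in r2d_overlay.py. For context, the `become` decorator plausibly looks something like the following; this is an assumption, not the actual implementation:)

```python
import os


def become(uid):
    # Attach a preexec_fn that drops the child process to the given uid.
    # Plausible shape only; the real decorator in r2d_overlay.py may differ.
    def decorator(func):
        func._pre_exec = lambda: os.setuid(uid)
        return func
    return decorator
```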