feat(lang): Add proposal for custom base image

Open gaocegege opened this issue 3 years ago • 12 comments

Preview: https://github.com/gaocegege/envd/blob/base/docs/proposals/20220707-base-image.md

Signed-off-by: Ce Gao [email protected]

gaocegege avatar Jul 07 '22 10:07 gaocegege

Some ideas:

We can provide a way for users to override the base function in their own environment, with some methods marked as internal (starting with _) for now.

For example

def base(os, language):
    _base_img = Image("ubuntu:20.04")
    # Execute a command on top of the base image
    install_htop = _base_img.run("sudo apt install htop")

    # Like an LLB merge: create a new state to serve as the base image
    _base.image = _base.merge([install_htop, ...])

    # Later steps will use _base.image as the base

Users can use load(...) to override the base function provided by envd while maintaining compatibility.

Users can also optimize the cache logic, for example:

    if cache_image_exists():
        _base.image = Image(cached_image)
    else:
        # build from scratch
        ...
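The cache check above could be modeled roughly as follows. This is only a sketch under stated assumptions: the `Image` class here is a hypothetical stand-in (not envd's real type), and `resolve_base` and its `cache_image_exists` callback are names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for an image reference; not envd's real type."""
    ref: str
    commands: Tuple[str, ...] = ()

    def run(self, command: str) -> "Image":
        # Return a new state with the command recorded on top of this image.
        return Image(self.ref, self.commands + (command,))


def resolve_base(cached_ref: str, cache_image_exists: Callable[[str], bool]) -> Image:
    """Use the cached image when it exists, otherwise build from scratch."""
    if cache_image_exists(cached_ref):
        return Image(cached_ref)
    # Build from scratch: same steps as the base() example in this comment.
    return Image("ubuntu:20.04").run("sudo apt install htop")


hit = resolve_base("my-cache:latest", lambda ref: True)
miss = resolve_base("my-cache:latest", lambda ref: False)
```

Passing the existence check in as a callback keeps the sketch independent of any particular registry client.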

VoVAllen avatar Jul 08 '22 03:07 VoVAllen

@VoVAllen

I am wondering how to support different bases on different targets. For example, I want to use nvidia/cuda:runtime for serving, and nvidia/cuda:devel for development.

gaocegege avatar Jul 08 '22 04:07 gaocegege

Also, there is some ambiguity in the current API design. config.ssh(mode='disable') doesn't sound good.

Other options:

  • Introduce install.dev_packages(..)
  • Add dev as the argument to the base function, for example, base(os="ubuntu20.04", dev=default_dev_packages())

VoVAllen avatar Jul 08 '22 04:07 VoVAllen

Also, there is some ambiguity in the current API design. config.ssh(mode='disable') doesn't sound good.

Other options:

  • Introduce install.dev_packages(..)
  • Add dev as the argument to the base function, for example, base(os="ubuntu20.04", dev=default_dev_packages())

Yep, the main challenge here is how envd can tell whether it should install the dev dependencies. The approach you proposed LGTM.

gaocegege avatar Jul 08 '22 04:07 gaocegege

@VoVAllen

I am wondering how to support different bases on different targets. For example, I want to use nvidia/cuda:runtime for serving, and nvidia/cuda:devel for development.

We probably want to extend the base function.

def base(os, requirements):
    # requirements is a limited set maintained by envd; only certain packages will be supported
    # We need a clear line between the requirements here and other `install` packages
    ...

base("ubuntu20.04", default_dev() + cuda("11.1", cudnn="8"))  # default_dev() + cuda() is just list concatenation
# envd can then use a CUDA base image and install the other packages via apt install
# Users can override the base function with their own. The implementation can be flexible

Then for serving and dev

def build():
    base("ubuntu20.04", default_dev() + cuda("11.1", cudnn="8"))

def serving():
    base("ubuntu20.04", default_dev() + cuda("11.1", cudnn="8"))

VoVAllen avatar Jul 08 '22 05:07 VoVAllen

I cannot understand the design. Could you please give me an example?

The development image uses nvidia/cuda:devel, and the serving image uses nvidia/cuda:runtime.

gaocegege avatar Jul 08 '22 05:07 gaocegege

def dev():
    base("ubuntu20.04", default_dev() + cuda("11.1", cudnn="8", version="devel"))

def serving():
    base("ubuntu20.04", cuda("11.1", cudnn="8", version="runtime"))

envd can then handle the logic inside the base function. For the dev function, envd will use an NVIDIA base image if it finds the ubuntu20.04 + CUDA dependency, and install the other default_dev() requirements with apt install or another command. Alternatively, base can contain cache logic that uses other pre-built images that already include all the requirements described here. The implementation can be flexible.
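A minimal sketch of that resolution step, assuming requirements are plain lists so that `default_dev() + cuda(...)` is ordinary list concatenation. The `Requirement` type, the package names in `default_dev()`, and the image tag format are all hypothetical and chosen only for illustration; real nvidia/cuda tags may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Requirement:
    kind: str          # "apt" or "cuda" (hypothetical categories)
    name: str
    flavor: str = ""   # "devel" or "runtime" for CUDA requirements


def default_dev() -> List[Requirement]:
    # Hypothetical default development packages.
    return [Requirement("apt", "git"), Requirement("apt", "curl")]


def cuda(cuda_version: str, cudnn: str, version: str) -> List[Requirement]:
    # `version` mirrors the keyword used in the thread ("devel" or "runtime").
    return [Requirement("cuda", f"{cuda_version}-cudnn{cudnn}", version)]


def base(os: str, requirements: List[Requirement]) -> Tuple[str, List[str]]:
    """Pick a base image and the packages still to be installed via apt."""
    cuda_reqs = [r for r in requirements if r.kind == "cuda"]
    apt_packages = [r.name for r in requirements if r.kind == "apt"]
    if cuda_reqs:
        c = cuda_reqs[0]
        # Illustrative tag format only; not guaranteed to match real tags.
        image = f"nvidia/cuda:{c.name}-{c.flavor}-{os}"
    else:
        image = os  # placeholder: no CUDA requirement, use the plain OS name
    return image, apt_packages


dev_image, dev_packages = base(
    "ubuntu20.04", default_dev() + cuda("11.1", cudnn="8", version="devel"))
serving_image, serving_packages = base(
    "ubuntu20.04", cuda("11.1", cudnn="8", version="runtime"))
```

With this shape, the dev target gets the devel image plus apt packages, while the serving target gets the runtime image and nothing extra.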

VoVAllen avatar Jul 08 '22 05:07 VoVAllen

Also, users may not want to include config.envd when building the serving image.

VoVAllen avatar Jul 08 '22 08:07 VoVAllen

def dev():
    base("ubuntu20.04", default_dev() + cuda("11.1", cudnn="8", version="devel"))

def serving():
    base("ubuntu20.04", cuda("11.1", cudnn="8", version="runtime"))

envd can then handle the logic inside the base function. For the dev function, envd will use an NVIDIA base image if it finds the ubuntu20.04 + CUDA dependency, and install the other default_dev() requirements with apt install or another command. Alternatively, base can contain cache logic that uses other pre-built images that already include all the requirements described here. The implementation can be flexible.

Users may want to specify a custom image instead of using the cuda function here. For example, users may want to use my-own-org/cuda:devel in envd. I proposed base(image=""). I am wondering how to support this case with your suggestion.
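One way the two styles might coexist is to let base accept either an os name or an explicit image. The sketch below is an assumption, not the proposal's actual design: the `OS_IMAGES` table, the `dev_packages` parameter, and the rule that `os` and `image` are mutually exclusive are all invented for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical mapping from envd-style OS names to image references.
OS_IMAGES: Dict[str, str] = {"ubuntu20.04": "ubuntu:20.04"}


def base(os: Optional[str] = None,
         image: Optional[str] = None,
         dev_packages: Sequence[str] = ()) -> Tuple[str, List[str]]:
    """Return the chosen base image and the dev packages to install on it."""
    if os and image:
        raise ValueError("pass either os or image, not both")
    if not os and not image:
        raise ValueError("one of os or image is required")
    # A custom image is used verbatim; an os name is looked up in the table.
    chosen = image if image else OS_IMAGES[os]
    return chosen, list(dev_packages)


custom = base(image="my-own-org/cuda:devel")
managed = base(os="ubuntu20.04", dev_packages=["git"])
```

Whether extra packages can be installed on a custom image depends on what that image ships, which is the concern raised in the next comment.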

gaocegege avatar Jul 12 '22 01:07 gaocegege

As for this and https://github.com/tensorchord/envd/issues/581: Maybe we should use our own custom Python installer for images that may not have apt or another package manager? @gaocegege WDYT

aseaday avatar Jul 12 '22 02:07 aseaday

Yep I think so.

As for this and #581: Maybe we should use our own custom Python installer for images that may not have apt or another package manager? @gaocegege WDYT

gaocegege avatar Jul 12 '22 03:07 gaocegege

My concern is the composition of multiple dependencies. For example, it's not efficient to build images on top of every available CUDA image to satisfy user requirements, so I'm thinking of constructing the base image dynamically. If the user only needs to specify one image, your proposal is totally fine.
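To make the idea of dynamic construction concrete, here is a rough sketch: pick the pre-built image that already covers the most requirements, then install only what is left. The catalog, the image name `my-org/cuda-python`, and the `plan_base` function are hypothetical; this is one possible selection rule, not a description of envd's implementation.

```python
from typing import Dict, FrozenSet, Set, Tuple


def plan_base(required: Set[str],
              catalog: Dict[str, FrozenSet[str]],
              fallback: str) -> Tuple[str, Set[str]]:
    """Choose the image covering the most requirements; return what remains."""
    best_image, best_provided = fallback, frozenset()
    for image, provided in catalog.items():
        # Only consider images that provide nothing beyond what was requested.
        if provided <= required and len(provided) > len(best_provided):
            best_image, best_provided = image, provided
    return best_image, required - best_provided


# Hypothetical catalog of pre-built images and what each one provides.
catalog = {
    "nvidia/cuda:devel": frozenset({"cuda"}),
    "my-org/cuda-python": frozenset({"cuda", "python"}),
}

gpu_plan = plan_base({"cuda", "python", "git"}, catalog, "ubuntu:20.04")
cpu_plan = plan_base({"git"}, catalog, "ubuntu:20.04")
```

This avoids pre-building an image for every combination: only a few common bases are kept, and the remainder is installed at build time.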

VoVAllen avatar Jul 12 '22 03:07 VoVAllen