
Run collaborator in docker

Open dmitryagapov opened this issue 4 years ago • 1 comments

nvidia-container-runtime must be installed so that containers can access GPUs; see https://docs.docker.com/config/containers/resource_constraints/#gpu

  1. Add the GPG key and package repository for nvidia-container-runtime
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
  2. Install nvidia-container-runtime
sudo apt-get install nvidia-container-runtime
  3. Ensure the nvidia-container-runtime-hook is accessible from $PATH
which nvidia-container-runtime-hook
  4. Restart the Docker daemon
sudo service docker restart
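
As a quick sanity check after the steps above, something like the following sketch could confirm that the hook is on `$PATH` and that a container can see the GPUs. The helper names `have_cmd`, `report_cmd` and `gpu_smoke_test` are made up for illustration; the `docker run --gpus all ubuntu nvidia-smi` command is the one shown on the linked Docker page and needs Docker 19.03 or later.

```shell
# Hypothetical helper: succeed if the command named in $1 is reachable from $PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print whether the command named in $1 was found.
report_cmd() {
  if have_cmd "$1"; then
    echo "$1 found"
  else
    echo "$1 not found in PATH"
  fi
}

# GPU smoke test: runs nvidia-smi inside a throwaway container.
# Defined here but not invoked, since it needs Docker and a GPU.
gpu_smoke_test() {
  docker run --rm --gpus all ubuntu nvidia-smi
}

report_cmd nvidia-container-runtime-hook
```

If the hook is reported as found, calling `gpu_smoke_test` after the Docker restart should print the usual `nvidia-smi` table from inside the container.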

Docker proxy: To use Docker behind a proxy, define the proxy settings in director_config.yaml and envoy_config.yaml:

#director_config.yaml
settings:
  listen_host: localhost
  listen_port: 50050
  sample_shape: [ '300', '400', '3' ]
  target_shape: [ '300', '400' ]
  envoy_health_check_period: 5  # in seconds
  docker:
    env:
      http_proxy:
      https_proxy:
      no_proxy:
    buildargs:
      HTTP_PROXY:
      HTTPS_PROXY:
      NO_PROXY:

#envoy_config.yaml
params:
  cuda_devices: [ 0, 2 ]
  docker:
    env:
      http_proxy:
      https_proxy:
      no_proxy:
    buildargs:
      HTTP_PROXY:
      HTTPS_PROXY:
      NO_PROXY:

optional_plugin_components:
  cuda_device_monitor:
    template: openfl.plugins.processing_units_monitor.pynvml_monitor.PynvmlCUDADeviceMonitor
    settings: [ ]

shard_descriptor:
  template: kvasir_shard_descriptor.KvasirShardDescriptor
  params:
    data_folder: kvasir_data
    rank_worldsize: 1,10
    enforce_image_hw: '300,400'
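
For orientation, the `env` and `buildargs` keys above resemble the container environment variables and build arguments that Docker itself accepts. The sketch below assumes (the issue does not confirm this) that the values are forwarded to Docker unchanged, and shows how the equivalent plain Docker flags could be assembled from whichever proxy variables are set in the shell. `proxy_flags` is a hypothetical helper, not part of OpenFL.

```shell
# Hypothetical helper: print one docker flag per proxy variable that is set.
# $1 is the flag to emit (-e or --build-arg); the remaining arguments are
# variable names. Values are assumed to contain no spaces (true for proxy URLs).
proxy_flags() {
  flag=$1
  shift
  out=""
  for name in "$@"; do
    # Look up the variable whose name is stored in $name; empty if unset.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      out="$out $flag $name=$value"
    fi
  done
  # Drop the leading space before printing.
  printf '%s\n' "${out# }"
}

# Example use (image name is a placeholder):
#   docker build $(proxy_flags --build-arg HTTP_PROXY HTTPS_PROXY NO_PROXY) -t my-image .
#   docker run   $(proxy_flags -e http_proxy https_proxy no_proxy) my-image
```

Variables that are unset or empty are skipped, so the same call works on hosts with and without a proxy.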

Manage Docker as a non-root user: https://docs.docker.com/engine/install/linux-postinstall/
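
The linked page boils down to adding the user to the `docker` group. A minimal sketch, with the hypothetical helpers `has_group` and `add_user_to_docker_group`, might look like this; the `groupadd` and `usermod` commands are the ones from the Docker post-install instructions, and a new login (or `newgrp docker`) is needed before the membership takes effect.

```shell
# Hypothetical helper: succeed if the space-separated group list in $1
# contains the group named in $2.
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Post-install steps from the Docker documentation.
# Defined here but not invoked, since they need sudo.
add_user_to_docker_group() {
  sudo groupadd docker            # may report that the group already exists
  sudo usermod -aG docker "$USER"
}

if has_group "$(id -nG)" docker; then
  echo "current user is in the docker group"
else
  echo "current user is not in the docker group"
fi
```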

dmitryagapov, Dec 16 '21 08:12

@dmitryagapov @alexey-gruzdev Can a tag be added to PRs like this to reflect that the feature is experimental or needs a design review before merge? WIP is used for PRs that aren't ready for review yet, but this seems to belong in a different category.

psfoley, Feb 16 '22 17:02