Validate ClearML and Prefect services are working with 0.4.0+
- This is a meta issue to track work for the 0.4.2 milestone
ClearML is simple to test but Prefect is much more challenging.
This needs to be checked as well: #878
ClearML seems to need the GCC compiler when installing the daemon agent in a conda-store env. Also, the built-in qhub agent called services is failing during job init and needs debugging (see the error below). Access and the dashboard are working as expected.
Running Task 682367ffa7ac4fa9adec0c134d62be21 inside default docker: arguments: []
2022-05-13 17:40:21
Executing: ['docker', 'run', '-t', '-l', 'clearml-worker-id=clearml-services:service:682367ffa7ac4fa9adec0c134d62be21', '-l', 'clearml-parent-worker-id=clearml-services', '-e', 'NVIDIA_VISIBLE_DEVICES=none', '-e', 'CLEARML_WORKER_ID=clearml-services:service:682367ffa7ac4fa9adec0c134d62be21', '-e', 'CLEARML_DOCKER_IMAGE=', '-e', 'CLEARML_TASK_ID=682367ffa7ac4fa9adec0c134d62be21', '-v', '/tmp/.clearml_agent.8j40rjqv.cfg:/root/clearml.conf', '-v', '/root/.clearml/apt-cache:/var/cache/apt/archives', '-v', '/root/.clearml/pip-cache:/root/.cache/pip', '-v', '/root/.clearml/pip-download-cache:/root/.clearml/pip-download-cache', '-v', '/root/.clearml/cache:/clearml_agent_cache', '-v', '/root/.clearml/vcs-cache:/root/.clearml/vcs-cache', '--rm', '', 'bash', '-c', 'echo \'Binary::apt::APT::Keep-Downloaded-Packages "true";\' > /etc/apt/apt.conf.d/docker-clean ; chown -R root /root/.cache/pip ; export DEBIAN_FRONTEND=noninteractive ; export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL libsm6 libxext6 libxrender-dev libglib2.0-0" ; [ ! -z $(which git) ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL git" ; declare LOCAL_PYTHON ; [ ! -z $LOCAL_PYTHON ] || for i in {15..5}; do which python3.$i && python3.$i -m pip --version && export LOCAL_PYTHON=$(which python3.$i) && break ; done ; [ ! -z $LOCAL_PYTHON ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL python3-pip" ; [ -z "$CLEARML_APT_INSTALL" ] || (apt-get update -y ; apt-get install -y $CLEARML_APT_INSTALL) ; [ ! -z $LOCAL_PYTHON ] || export LOCAL_PYTHON=python3 ; $LOCAL_PYTHON -m pip install -U "pip<20.2" ; $LOCAL_PYTHON -m pip install -U clearml-agent ; cp /root/clearml.conf /root/default_clearml.conf ; NVIDIA_VISIBLE_DEVICES=none $LOCAL_PYTHON -u -m clearml_agent execute --full-monitoring --id 682367ffa7ac4fa9adec0c134d62be21']
2022-05-13 17:40:26
docker: invalid reference format.
See 'docker run --help'.
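One possible reading of the log above, offered as a hypothesis rather than a confirmed diagnosis: the logged command contains an empty string where the image name would normally go (between '--rm' and 'bash'), and CLEARML_DOCKER_IMAGE is empty as well. Docker reports "invalid reference format" when the image reference is not a valid name. The Python sketch below only inspects an argument list for empty entries. `find_empty_args` is a hypothetical helper, not part of ClearML or Docker, and the command is an abbreviated copy of the logged one.

```python
def find_empty_args(argv):
    """Return the indices of empty-string arguments in a command list."""
    return [i for i, arg in enumerate(argv) if arg == ""]


# Abbreviated form of the logged command; most -l/-e/-v options are omitted.
cmd = [
    "docker", "run", "-t",
    "-e", "CLEARML_DOCKER_IMAGE=",
    "--rm",
    "",  # empty positional argument, as seen in the log
    "bash", "-c", "echo hi",
]

for i in find_empty_args(cmd):
    # Report the neighbour before the empty slot to help locate it in the log.
    print(f"argument {i} is empty (preceded by {cmd[i - 1]!r})")
```

If this reading is right, the thing to check would be where the services agent gets its default docker image, since nothing in the log indicates an image was set for the task.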
At this point, many on the team believe it is time to remove these integrations from Nebari. They can still be used if properly included in the helm_extension section of the nebari-config.yaml.
Scheduled for removal in the March release.
@iameskild - Can we close this issue as complete?