Docker fails due to missing Docker socket
When trying to run invoke commands, I ran into:
# invoke print_env
12:02:00 - INFO hdbg.py init_logger:1018 > cmd='/venv/bin/invoke print_env'
12:02:00 - WARN hserver.py _raise_invalid_host:777 Don't recognize host: host_os_name=Linux, am_host_os_name=None
[sudo] password for ubuntu:
sudo: a password is required
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Run 'docker run --help' for more information
[sudo] password for ubuntu:
sudo: a password is required
Traceback (most recent call last):
File "/venv/bin/invoke", line 8, in <module>
sys.exit(program.run())
^^^^^^^^^^^^^
File "/venv/lib/python3.12/site-packages/invoke/program.py", line 398, in run
self.execute()
File "/venv/lib/python3.12/site-packages/invoke/program.py", line 583, in execute
executor.execute(*self.tasks)
File "/venv/lib/python3.12/site-packages/invoke/executor.py", line 140, in execute
result = call.task(*args, **call.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/venv/lib/python3.12/site-packages/invoke/tasks.py", line 138, in __call__
result = self.body(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/helpers/lib_tasks_print.py", line 92, in print_env
henv.env_to_str(
File "/app/helpers/henv.py", line 543, in env_to_str
msg += get_system_signature()[0] + "\n"
^^^^^^^^^^^^^^^^^^^^^^
File "/app/helpers/henv.py", line 505, in get_system_signature
txt_tmp = hserver.get_docker_info()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/helpers/hserver.py", line 683, in get_docker_info
docker_needs_sudo_ = docker_needs_sudo()
^^^^^^^^^^^^^^^^^^^
File "/app/helpers/hserver.py", line 631, in docker_needs_sudo
assert False, "Failed to run docker"
^^^^^
AssertionError: Failed to run docker
This happened after commit b36a90c on master.
In helpers/hserver.py, docker_needs_sudo() asserts if Docker can be run neither directly nor through sudo.
# Taken from `helpers/hserver.py`
def docker_needs_sudo() -> bool:
    """
    Return whether Docker commands need to be run with sudo.
    """
    if not has_docker():
        return False
    # Another way to check is to see if your user is in the docker group:
    # > groups | grep docker
    rc = os.system("docker run hello-world 2>&1 >/dev/null")
    if rc == 0:
        return False
    #
    rc = os.system("sudo docker run hello-world 2>&1 >/dev/null")
    if rc == 0:
        return True
    assert False, "Failed to run docker"
Previously, when the `[sudo] password for ubuntu:` prompt appeared, we could skip it with Ctrl+C and continue. With the new logic, Docker must run either directly or through sudo; otherwise the assertion is raised early.
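The prompt-then-assert behavior described above could be avoided by probing Docker non-interactively. The following is a minimal sketch, not the code in helpers/hserver.py: `probe_docker_sudo` and `_run_quiet` are hypothetical names, and it assumes `sudo -n` (sudo's non-interactive flag, which fails instead of prompting). It returns None rather than asserting when neither attempt works.

```python
import subprocess
from typing import Callable, Optional, Sequence


def _run_quiet(cmd: Sequence[str]) -> int:
    """Run a command with all streams silenced and return its exit code."""
    try:
        proc = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # The executable (docker or sudo) is not installed.
        return 127
    return proc.returncode


def probe_docker_sudo(
    run: Callable[[Sequence[str]], int] = _run_quiet,
) -> Optional[bool]:
    """
    Hypothetical variant of the check: False if Docker works directly,
    True if it works through password-less sudo, None if neither works.
    """
    if run(["docker", "run", "--rm", "hello-world"]) == 0:
        return False
    # `-n` makes sudo fail immediately instead of prompting for a password.
    if run(["sudo", "-n", "docker", "run", "--rm", "hello-world"]) == 0:
        return True
    return None
```

A caller could treat None as "Docker is not usable here" and skip the Docker section of the report instead of aborting the whole invoke target.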
FYI: @sonniki @gpsaggese
Good decision to file an issue @aangelo9 . We need to bypass this check for people not working on the server by inserting hserver.is_external_dev() somewhere, probably upstream from this function. @aangelo9 can you investigate a bit and make a proposal (in a PR) where would be the most fitting place to put it?
Understood. I will draft a PR.
I've been rewriting all that logic and I can't test on all the external devices.
IIUC, my take is that either one can run docker without sudo or if it needs sudo, it needs to be password-less.
We call docker from everywhere in the codebase and one can't skip or enter the password every single time.
So my solution is to ask contributors to:
- add their users to the sudoers
- make sudo password-less
This is the set up we have everywhere.
This means that we should update the documentation and not change the code.
I can help improve the documentation.
Corollary: the problem is that we ignored a problem in the set-up since it was new and then this problem came back to bite us.
So my solution is to ask contributors to: add their users to the sudoers make sudo password less This is the set up we have everywhere. This means that we should update the documentation and not change the code. I can help document how to improve the documentation.
Let's do that then, only let's try to give this priority since it's blocking some interns from running Linter and other invokes that require Docker. Unfortunately, I don't have enough understanding of how this issue should be solved in the setup to be able to guide here.
I have done a bit more investigating, and I think that the current code does not check for a Linux VM. I have done:
# Add user to Docker group
sudo usermod -aG docker $USER
# Add user to sudoers
sudo visudo
# Add at the bottom $USER ALL=(ALL) NOPASSWD:ALL
I used this temp fix to try to access the container:
def docker_needs_sudo() -> bool:
    """
    Return whether Docker commands need to be run with sudo.
    """
    if os.path.exists("/.dockerenv"):
        # We're inside a Docker container, so skip the check.
        return False
    ...
I found out that on Linux the Docker daemon socket is never mounted, hence the DinD check fails.
/var/run/docker.sock:/var/run/docker.sock
rc = os.system("docker run hello-world 2>&1 >/dev/null")
if rc == 0:
    return False
#
rc = os.system("sudo docker run hello-world 2>&1 >/dev/null")
if rc == 0:
    return True
When in the Docker container, calling Docker will raise this error:
root@4ce53e2efec2:/app# docker images ls
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
OK to put in a hack to unblock, but let's clearly mark it with a TODO(gp): Remove this as per HelpersTask578.
- What is your set up exactly?
- What does it mean "the current code does not check for Linux VM"?
- Adding users to sudoers is correct. Do you know if we have those instructions in the setup? If not, we should add them.
- There are two ways to run a Docker container when inside Docker: sibling and child containers. I don't understand why /var/run/docker.sock is not mounted, which is required for sibling containers.
@samarth9008 and @heanhsok do you have more insights since you are Linux users?
- Current Env Setup:
- Windows 11 OS
- VMWare Workstation Pro
- Ubuntu x64 (22.04)
Or in codebase variables:
WARN hserver.py _raise_invalid_host:790 Don't recognize host: host_os_name=Linux, am_host_os_name=None
- From what I understand, lib_tasks_docker.py creates the tmp.docker-compose.yml file that configures the Docker container. However, under _generate_docker_compose_file():
# Taken from helpers/lib_tasks_docker.py
def _generate_docker_compose_file()
    ...
    if use_sibling_container:
        # Use sibling-container approach.
        base_app_spec["volumes"].append(
            "/var/run/docker.sock:/var/run/docker.sock"
        )
    ...
use_sibling_container is a bool variable from hserver.use_docker_sibling_containers():
# Taken from helpers/hserver.py
def use_docker_sibling_containers() -> bool:
    """
    Return whether to use Docker sibling containers.

    Using sibling containers requires that all Docker containers in the
    same network so that they can communicate with each other.
    """
    val = is_dev4() or _is_mac_version_with_sibling_containers()
    return val
Here is_dev4() checks for internal devs, and _is_mac_version_with_sibling_containers() checks for Mac. Neither covers Linux or external devs, so /var/run/docker.sock does not get mounted in those cases.
- Current setup does not ask users to add themselves to sudoers. I could update the document.
- There are 2 approaches to this:
- This enables running basic invoke functions but does not support DinD:
def docker_needs_sudo() -> bool:
    """
    Return whether Docker commands need to be run with sudo.
    """
    if os.path.exists("/.dockerenv"):
        # We're inside a Docker container, so skip the check.
        return False
- The other approach is to maybe place hserver.is_external_dev() into hserver.use_docker_sibling_containers() as an additional condition to get /var/run/docker.sock mounted if DinD is absolutely necessary for interns.
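The second approach could be sketched as follows. This is a simplified stand-in rather than the real helpers code: the host checks are passed in as plain booleans, and `use_sibling_containers` and `add_socket_mount` are hypothetical names used only for illustration.

```python
from typing import Dict, List

DOCKER_SOCKET_MOUNT = "/var/run/docker.sock:/var/run/docker.sock"


def use_sibling_containers(
    *, is_dev4: bool, is_mac_with_sibling: bool, is_external_dev: bool
) -> bool:
    """Hypothetical decision: the external-dev condition is the proposed addition."""
    return is_dev4 or is_mac_with_sibling or is_external_dev


def add_socket_mount(
    base_app_spec: Dict[str, List[str]], use_sibling_container: bool
) -> Dict[str, List[str]]:
    """Return a copy of the spec with the Docker socket mounted when requested."""
    volumes = list(base_app_spec.get("volumes", []))
    if use_sibling_container and DOCKER_SOCKET_MOUNT not in volumes:
        volumes.append(DOCKER_SOCKET_MOUNT)
    return {**base_app_spec, "volumes": volumes}
```

With the extra condition, an external Linux host would get the socket mounted in the generated compose file, which is what sibling containers require.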
Current setup does not ask users to add themselves to sudoers. I could update the document.
Yes pls
The new approach I've been working on is about checking what functionalities are actually available in the system, rather than checking what computer is "interns" vs "mac" vs "dev" and then have a table that says "for this type of set-up, this is what we have". I'm ok with adding some hacks to keep working but they need to be clearly documented, so that we can remove them.
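As an illustration of checking what is actually available rather than which kind of machine this is, a capability probe for the Docker socket might look like the sketch below. `has_docker_socket` is a hypothetical name, not a function from helpers, and the default path is the one quoted in this thread.

```python
import os
import stat


def has_docker_socket(path: str = "/var/run/docker.sock") -> bool:
    """Return True only if `path` exists and is a Unix socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to stat it.
        return False
    return stat.S_ISSOCK(mode)
```

This only says that a socket file is present; whether the current user may talk to the daemon through it is a separate check.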
Understood, I will add some hacks and mark them with TODO(gp): Remove this as per HelpersTask578 and update the setup document.
Current hack is:
- Run Docker container as root user for external linux users.
- Skip DinD for external linux users.
Problem:
- For Ubuntu, the user inside the container is "ubuntu" and does not have read or write permissions, which prevents linter from working.
- The "ubuntu" user is not in the sudoers file, and adding it requires modifying the Docker image.
You can file an issue for this. It's a bit weird that there is a problem, since we use ubuntu on the server and everything is fine.
Maybe the ubuntu user on your machine has a different id than the one on the server.
In that case, the fix is adding the id of the ubuntu user to our Docker containers.
@heanhsok any idea about this?
12:02:00 - INFO hdbg.py init_logger:1018 > cmd='/venv/bin/invoke print_env'
12:02:00 - WARN hserver.py _raise_invalid_host:777 Don't recognize host: host_os_name=Linux, am_host_os_name=None
[sudo] password for ubuntu:
sudo: a password is required
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Let's rewind a bit. It looks like the errors start from here.
Is this error coming from running on the host or inside the container? Could you try both and share the output? Please make sure to clear all the hacks first and use only the code from master.
- To run on host,
i print_env
- To run in container,
i docker_bash
...
/venv/bin/invoke print_env
If neither succeeds, can you try running the following commands and share the output?
> heanhs@dev1:~/src/cmamp2$ echo $USER
heanhs
> heanhs@dev1:~/src/cmamp2$ echo $UID
1042
> heanhs@dev1:~/src/cmamp2$ groups $USER
heanhs : docker
> heanhs@dev1:~/src/cmamp2$ which docker
/usr/bin/docker
> heanhs@dev1:~/src/cmamp2$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 Apr 6 02:06 /var/run/docker.sock
@aangelo9
- Output for i print_env:
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ i print_env
12:46:00 - INFO hdbg.py init_logger:1018 > cmd='/home/alvinoangelo/src/venv/client_venv.helpers/bin/invoke print_env'
12:46:00 - WARN hserver.py _raise_invalid_host:782 Don't recognize host: host_os_name=Linux, am_host_os_name=None
12:46:00 - WARN henv.py _get_psutil_info:372 psutil is not installed: No module named 'psutil'
# Repo config
get_host_name='github.com'
get_html_dir_to_url_mapping='{'s3://cryptokaizen-html': 'http://172.30.2.44', 's3://cryptokaizen-html/v2': 'http://172.30.2.44/v2'}'
get_invalid_words='[]'
get_docker_base_image_name='helpers'
# Server config
enable_privileged_mode='False'
get_docker_shared_group=''
get_docker_user=''
get_host_user_name='alvinoangelo'
get_shared_data_dirs='None'
has_dind_support='False'
has_docker_sudo='True'
is_AM_S3_available='True'
is_CK_S3_available='True'
is_dev4='False'
is_dev_csfy='False'
is_external_linux='True'
is_host_mac='False'
is_ig_prod='False'
is_inside_ci='False'
is_inside_docker='False'
is_inside_ecs_container='False'
is_inside_unit_test='False'
is_prod_csfy='False'
run_docker_as_root='False'
skip_submodules_test='False'
use_docker_db_container_name_to_connect='False'
use_docker_network_mode_host='False'
use_docker_sibling_containers='False'
use_main_network='False'
# System signature
# Container version
container_version='None'
changelog_version='1.2.0'
# Git info
branch_name='master'
hash='32f2268'
# Last commits:
* 32f2268 Sonya Nikiforova HelpersTask393: Rename doc (#610) ( 8 hours ago) Thu Apr 24 05:11:39 2025 (HEAD -> master, origin/master, origin/HEAD)
* 7d8baea aangelo9 HelpersTask393_Review_systems_to_automate_code_review (#604) ( 9 hours ago) Thu Apr 24 03:53:46 2025
* fe23e50 Sandeep Thalapanane HelpersTask596_Links_are_incorrectly_converted_inside_fenced_blocks (#608) ( 21 hours ago) Wed Apr 23 15:44:53 2025
# Platform info
system=Linux
node name=alvinoangelo
release=6.8.0-57-generic
version=#59~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Mar 19 17:07:41 UTC 2
machine=x86_64
processor=x86_64
# psutils info
psutil is not installed
# Docker info
has_docker=True
docker_version='28.1.1'
docker_needs_sudo=False
has_privileged_mode=True
is_inside_docker=False
has_sibling_containers_support=*undef*
has_docker_dind_support=*undef*
# Packages
python: 3.10.12
cvxopt: ?
cvxpy: ?
gluonnlp: ?
gluonts: ?
joblib: ?
mxnet: ?
numpy: 2.2.4
pandas: 2.2.3
pyarrow: ?
scipy: ?
seaborn: ?
sklearn: ?
statsmodels: ?
# Env vars
CSFY_AWS_ACCESS_KEY_ID=undef
CSFY_AWS_DEFAULT_REGION=undef
CSFY_AWS_S3_BUCKET='cryptokaizen-data'
CSFY_AWS_SECRET_ACCESS_KEY=undef
CSFY_AWS_SESSION_TOKEN=undef
CSFY_CI=undef
CSFY_ECR_BASE_PATH='causify'
CSFY_ENABLE_DIND=undef
CSFY_FORCE_TEST_FAIL=undef
CSFY_HOST_NAME='alvinoangelo'
CSFY_HOST_OS_NAME='Linux'
CSFY_HOST_USER_NAME='alvinoangelo'
CSFY_HOST_VERSION=undef
CSFY_REPO_CONFIG_CHECK=undef
CSFY_REPO_CONFIG_PATH=undef
GH_ACTION_ACCESS_TOKEN=undef
- Output for i docker_bash. I was not able to enter the container.
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ i docker_bash
12:46:43 - INFO hdbg.py init_logger:1018 > cmd='/home/alvinoangelo/src/venv/client_venv.helpers/bin/invoke docker_bash'
# docker_bash: base_image='', stage='dev', version='', use_entrypoint=True, as_user=True, generate_docker_compose_file=True, container_dir_name='.', skip_pull=False, skip_docker_image_compatibility_check=False
12:46:43 - WARN hserver.py _raise_invalid_host:782 Don't recognize host: host_os_name=Linux, am_host_os_name=None
# docker_pull: stage='dev', version=None, skip_pull=False
# docker_login: target_registry='aws_ecr.ck'
12:46:44 - WARN lib_tasks_docker.py docker_login:405 Skipping Docker login process for Helpers or Tutorials
12:46:44 - INFO lib_tasks_docker.py _docker_pull:230 image='causify/helpers:dev'
docker pull causify/helpers:dev
dev: Pulling from causify/helpers
Digest: sha256:43ac049013f992d7efc4a8196bfa15dc0b3f7559e52848adf825c3c7b5c84ca3
Status: Image is up to date for causify/helpers:dev
docker.io/causify/helpers:dev
IMAGE=causify/helpers:dev \
docker compose \
--file /home/alvinoangelo/src/helpers1/devops/compose/tmp.docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name alvinoangelo.helpers.app.helpers1.20250424_124644 \
--user $(id -u):$(id -g) \
app \
bash
WARN[0000] The "CSFY_FORCE_TEST_FAIL" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_ACCESS_KEY_ID" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_DEFAULT_REGION" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_SECRET_ACCESS_KEY" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_SESSION_TOKEN" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_TELEGRAM_TOKEN" variable is not set. Defaulting to a blank string.
WARN[0000] The "OPENAI_API_KEY" variable is not set. Defaulting to a blank string.
WARN[0000] /home/alvinoangelo/src/helpers1/devops/compose/tmp.docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion
##> devops/docker_run/entrypoint.sh
UID=1000
GID=1000
CSFY_USE_HELPERS_AS_NESTED_MODULE=0
CSFY_HOST_GIT_ROOT_PATH=/home/alvinoangelo/src/helpers1
CSFY_GIT_ROOT_PATH=/app
CSFY_HELPERS_ROOT_PATH=/app
> source /app/dev_scripts_helpers/thin_client/thin_client_utils.sh ...
AM_CONTAINER_VERSION='1.2.0'
CSFY_USE_HELPERS_AS_NESTED_MODULE=0
##> devops/docker_run/docker_setenv.sh
> source /app/dev_scripts_helpers/thin_client/thin_client_utils.sh ...
# activate_docker_venv()
# set_path()
PATH=.:./.github:./devops:./helpers:./.vscode:./.git:./papers:./dev_scripts_helpers:./.mypy_cache:./config_root:./docs:./import_check:./linters::/app:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# set_up_docker_git()
git --version: git version 2.43.0
/app
# set_pythonpath()
Adding /app to PYTHONPATH
PYTHONPATH=/app:
# Configure env
WARNING: /var/run/docker.sock doesn't exist
# set_up_docker_git()
git --version: git version 2.43.0
/app
# invoke print_env
12:46:46 - INFO hdbg.py init_logger:1018 > cmd='/venv/bin/invoke print_env'
12:46:46 - WARN hserver.py _raise_invalid_host:782 Don't recognize host: host_os_name=Linux, am_host_os_name=None
[sudo] password for ubuntu:
sudo: a password is required
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Run 'docker run --help' for more information
[sudo] password for ubuntu:
sudo: a password is required
Traceback (most recent call last):
File "/venv/bin/invoke", line 8, in <module>
sys.exit(program.run())
^^^^^^^^^^^^^
File "/venv/lib/python3.12/site-packages/invoke/program.py", line 398, in run
self.execute()
File "/venv/lib/python3.12/site-packages/invoke/program.py", line 583, in execute
executor.execute(*self.tasks)
File "/venv/lib/python3.12/site-packages/invoke/executor.py", line 140, in execute
result = call.task(*args, **call.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/venv/lib/python3.12/site-packages/invoke/tasks.py", line 138, in __call__
result = self.body(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/helpers/lib_tasks_print.py", line 92, in print_env
henv.env_to_str(
File "/app/helpers/henv.py", line 543, in env_to_str
msg += get_system_signature()[0] + "\n"
^^^^^^^^^^^^^^^^^^^^^^
File "/app/helpers/henv.py", line 505, in get_system_signature
txt_tmp = hserver.get_docker_info()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/helpers/hserver.py", line 688, in get_docker_info
docker_needs_sudo_ = docker_needs_sudo()
^^^^^^^^^^^^^^^^^^^
File "/app/helpers/hserver.py", line 636, in docker_needs_sudo
assert False, "Failed to run docker"
^^^^^
AssertionError: Failed to run docker
- Other outputs.
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ echo $USER
alvinoangelo
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ echo $UID
1000
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ groups $USER
alvinoangelo : alvinoangelo adm cdrom sudo dip plugdev lpadmin lxd sambashare docker
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ which docker
/home/alvinoangelo/bin/docker
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 Apr 24 10:32 /var/run/docker.sock
Can u share your devops/compose/tmp.docker-compose.yml as well? I suspect this var CSFY_ENABLE_DIND is set to 0
CSFY_ENABLE_DIND is set to 0.
version: '3'
services:
base_app:
cap_add:
- SYS_ADMIN
environment:
- CSFY_ENABLE_DIND=0
- CSFY_FORCE_TEST_FAIL=$CSFY_FORCE_TEST_FAIL
- CSFY_HOST_NAME=alvinoangelo
- CSFY_HOST_OS_NAME=Linux
- CSFY_HOST_USER_NAME=alvinoangelo
- CSFY_HOST_VERSION=6.8.0-57-generic
- CSFY_REPO_CONFIG_CHECK=True
- CSFY_REPO_CONFIG_PATH=
- CSFY_AWS_ACCESS_KEY_ID=$CSFY_AWS_ACCESS_KEY_ID
- CSFY_AWS_DEFAULT_REGION=$CSFY_AWS_DEFAULT_REGION
- CSFY_AWS_PROFILE=$CSFY_AWS_PROFILE
- CSFY_AWS_S3_BUCKET=$CSFY_AWS_S3_BUCKET
- CSFY_AWS_SECRET_ACCESS_KEY=$CSFY_AWS_SECRET_ACCESS_KEY
- CSFY_AWS_SESSION_TOKEN=$CSFY_AWS_SESSION_TOKEN
- CSFY_ECR_BASE_PATH=$CSFY_ECR_BASE_PATH
- CSFY_HOST_GIT_ROOT_PATH=/home/alvinoangelo/src/helpers1
- CSFY_GIT_ROOT_PATH=/app
- CSFY_HELPERS_ROOT_PATH=/app
- CSFY_USE_HELPERS_AS_NESTED_MODULE=0
- CSFY_TELEGRAM_TOKEN=$CSFY_TELEGRAM_TOKEN
- CSFY_CI=$CSFY_CI
- OPENAI_API_KEY=$OPENAI_API_KEY
- GH_ACTION_ACCESS_TOKEN=$GH_ACTION_ACCESS_TOKEN
- GH_TOKEN=$GH_ACTION_ACCESS_TOKEN
image: ${IMAGE}
restart: 'no'
volumes:
- ~/.aws:/home/.aws
- ~/.config/gspread_pandas/:/home/.config/gspread_pandas/
- ~/.config/gh:/home/.config/gh
- ~/.ssh:/home/.ssh
app:
extends: base_app
volumes:
- /home/alvinoangelo/src/helpers1:/app
working_dir: /app
linter:
extends: base_app
volumes:
- /home/alvinoangelo/src/helpers1:/src
- ../../:/app
working_dir: /src
environment:
- MYPYPATH
jupyter_server:
command: devops/docker_run/run_jupyter_server.sh
environment:
- PORT=${PORT}
extends: app
network_mode: ${NETWORK_MODE:-bridge}
ports:
- ${PORT}:${PORT}
jupyter_server_test:
command: jupyter notebook -h 2>&1 >/dev/null
environment:
- PORT=${PORT}
extends: app
network_mode: ${NETWORK_MODE:-bridge}
ports:
- ${PORT}:${PORT}
CSFY_ENABLE_DIND is set to 0.
It makes sense. I think this is where the problem is.
Could you try adding another clause here to allow external devs to use privileged mode, and then rerun i docker_bash?
elif is_external_linux():
ret = True
https://github.com/causify-ai/helpers/blob/32f2268843359438bc9adcdd1124f4e05ab019b1/helpers/hserver.py#L784-L814
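To confirm which value ended up in the generated compose file without reading it by eye, the environment entries can be scanned line by line. This is a small sketch with a hypothetical helper name (`find_compose_env_value`); it avoids a YAML dependency by matching only `- NAME=value` list entries, which is the shape shown in the file above.

```python
import re
from typing import Optional


def find_compose_env_value(compose_text: str, name: str) -> Optional[str]:
    """Return the value of the first `- NAME=value` entry, or None if absent."""
    pattern = re.compile(r"^\s*-\s*" + re.escape(name) + r"=(.*)$")
    for line in compose_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1).strip()
    return None


if __name__ == "__main__":
    # Path taken from the thread; adjust to the local checkout.
    with open("devops/compose/tmp.docker-compose.yml") as f:
        print(find_compose_env_value(f.read(), "CSFY_ENABLE_DIND"))
```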
I've tried that approach, but it still asks for the sudo password and stays stuck when I press Ctrl+C.
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ i docker_bash
13:44:10 - INFO hdbg.py init_logger:1018 > cmd='/home/alvinoangelo/src/venv/client_venv.helpers/bin/invoke docker_bash'
# docker_bash: base_image='', stage='dev', version='', use_entrypoint=True, as_user=True, generate_docker_compose_file=True, container_dir_name='.', skip_pull=False, skip_docker_image_compatibility_check=False
# docker_pull: stage='dev', version=None, skip_pull=False
# docker_login: target_registry='aws_ecr.ck'
13:44:10 - WARN lib_tasks_docker.py docker_login:405 Skipping Docker login process for Helpers or Tutorials
13:44:10 - INFO lib_tasks_docker.py _docker_pull:230 image='causify/helpers:dev'
docker pull causify/helpers:dev
dev: Pulling from causify/helpers
Digest: sha256:43ac049013f992d7efc4a8196bfa15dc0b3f7559e52848adf825c3c7b5c84ca3
Status: Image is up to date for causify/helpers:dev
docker.io/causify/helpers:dev
IMAGE=causify/helpers:dev \
docker compose \
--file /home/alvinoangelo/src/helpers1/devops/compose/tmp.docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name alvinoangelo.helpers.app.helpers1.20250424_134410 \
--user $(id -u):$(id -g) \
app \
bash
WARN[0000] The "CSFY_FORCE_TEST_FAIL" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_ACCESS_KEY_ID" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_DEFAULT_REGION" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_SECRET_ACCESS_KEY" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_SESSION_TOKEN" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_TELEGRAM_TOKEN" variable is not set. Defaulting to a blank string.
WARN[0000] The "OPENAI_API_KEY" variable is not set. Defaulting to a blank string.
WARN[0000] /home/alvinoangelo/src/helpers1/devops/compose/tmp.docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion
##> devops/docker_run/entrypoint.sh
UID=1000
GID=1000
CSFY_USE_HELPERS_AS_NESTED_MODULE=0
CSFY_HOST_GIT_ROOT_PATH=/home/alvinoangelo/src/helpers1
CSFY_GIT_ROOT_PATH=/app
CSFY_HELPERS_ROOT_PATH=/app
> source /app/dev_scripts_helpers/thin_client/thin_client_utils.sh ...
AM_CONTAINER_VERSION='1.2.0'
CSFY_USE_HELPERS_AS_NESTED_MODULE=0
##> devops/docker_run/docker_setenv.sh
> source /app/dev_scripts_helpers/thin_client/thin_client_utils.sh ...
# activate_docker_venv()
# set_path()
PATH=.:./.github:./devops:./helpers:./.vscode:./.git:./papers:./dev_scripts_helpers:./.mypy_cache:./config_root:./docs:./import_check:./linters::/app:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# set_up_docker_git()
git --version: git version 2.43.0
/app
# set_pythonpath()
Adding /app to PYTHONPATH
PYTHONPATH=/app:
# Configure env
# set_up_docker_in_docker()
[sudo] password for ubuntu:
[sudo] password for ubuntu: sudo: a password is required
Could it be because it uses the Docker image causify/helpers:dev instead of causify/helpers:prod, since I'm working on the helpers repo?
causify/helpers:dev is correct because you're not running linter. Weird thing is that it should not ask for pw.
Also not sure why this is the same?
##> devops/docker_run/entrypoint.sh
UID=1000
GID=1000
can u try?
getent group 1000
getent group 1000 output:
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ getent group 1000
alvinoangelo:x:1000:
Oh wait i think the dind is still not set up from your above log. There should be this set_up_docker_in_docker step in the log. Can you trace back to variable that caused this part to skip? Is CSFY_ENABLE_DIND set to 1 now?
Example log when it is setup.
# set_up_docker_git()
git --version: git version 2.43.0
/app
# set_pythonpath()
Adding /app/helpers_root to PYTHONPATH
Adding /app to PYTHONPATH
PYTHONPATH=/app:/app/helpers_root:
# Configure env
# set_up_docker_in_docker()
{ "storage-driver": "vfs" }
* Starting Docker: docker [ OK ]
* Docker is running
Waiting for /var/run/docker.sock to be created.
Permissions for /var/run/docker.sock have been changed.
Setting sudo docker permissions
srw-rw-rw- 1 root docker 0 Apr 24 18:16 /var/run/docker.sock
srw-rw-rw- 1 root docker 0 Apr 24 18:16 /var/run/docker.sock
# set_up_docker_git()
git --version: git version 2.43.0
/app
# invoke print_env
Yes, CSFY_ENABLE_DIND = 1 now.
I figured out that this line was the one making the issue.
sudo echo '{ "storage-driver": "vfs" }' | sudo tee -a /etc/docker/daemon.json
https://github.com/causify-ai/helpers/blob/32f2268843359438bc9adcdd1124f4e05ab019b1/dev_scripts_helpers/thin_client/thin_client_utils.sh#L325-L367
What happens when you run that command on your host? Does it ask for pw?
sudo echo '{ "storage-driver": "vfs" }' | sudo tee -a /etc/docker/daemon.json
No it does not ask for password when run on host.
It also mounted successfully.
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ cat /etc/docker/daemon.json
{ "storage-driver": "vfs" }
I see. ATM I'm not sure what's causing it yet. I have been using it on my Mac and on Linux on the server with no issue. I'll try to run it on Ubuntu in a VM and see if I can reproduce it.
Problem
Weird thing is that it should not ask for pw.
I have run it on Ubuntu in a VM and had the same issue.
- The problem is that the first user from the VM uses the same user id of 1000 as the ubuntu user in the docker container
- We have added user_1000 (with user id 1000) to etc_sudoers, but the ubuntu user is not there
- As checked, user_1000 with id 1000 was not even created (probably because the id is already occupied by the ubuntu user)
root@6934d3ed9bda:/app# getent passwd
ubuntu:x:1000:1000:Ubuntu:/home/ubuntu:/bin/bash
...
user_501:x:501:1001::/home:/bin/sh
user_1001:x:1001:1002::/home:/bin/sh
user_1002:x:1002:1003::/home:/bin/sh
...
- So when the container starts, user id 1000, which corresponds to the ubuntu user (which is not in the etc_sudoers file), is used
IMAGE=causify/helpers:dev \
docker compose \
--file /home/alvinoangelo/src/helpers1/devops/compose/tmp.docker-compose.yml \
...
--user $(id -u):$(id -g)
- That's why it keeps asking for pw when it shouldn't
- I guess we haven't had this issue on our dev server before because our users have user ids > 1000
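The collision described in the list above can be checked mechanically from `getent passwd` output. The sketch below uses hypothetical helper names (`user_for_uid`, `needs_password`), and the set of password-less users passed in is an assumption about what the image's sudoers file contains, not something read from it.

```python
from typing import Iterable, Optional


def user_for_uid(passwd_text: str, uid: int) -> Optional[str]:
    """Return the first user name mapped to `uid` in passwd-format text."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # passwd format: name:password:uid:gid:gecos:home:shell
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        if int(fields[2]) == uid:
            return fields[0]
    return None


def needs_password(
    passwd_text: str, uid: int, passwordless_users: Iterable[str]
) -> bool:
    """True if the user behind `uid` is unknown or not in the password-less set."""
    user = user_for_uid(passwd_text, uid)
    return user is None or user not in set(passwordless_users)
```

For the passwd entries shown above, uid 1000 resolves to ubuntu, so a sudoers list that only names user_NNNN accounts would leave that uid prompting for a password.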
Testing
@aangelo9
Just to test, you can try hard coding this value to 1001, and the invoke bash should work. (Please keep the is_external_linux condition in the enable_privileged_mode() that u did above so that the CSFY_ENABLE_DIND is set to 1)
if as_user:
docker_cmd_.append(
r"""
--user 1001:$(id -g)"""
)
- https://github.com/causify-ai/helpers/blob/5ff3dc086faa689a2635494f906a7f86e1f100e4/helpers/lib_tasks_docker.py#L1261C1-L1265C10
Solution
- We can add the ubuntu user to the no-password list in the etc_sudoers file and rebuild the image (since this is probably not going to change as we're using the ubuntu base image)
# Linux users.
ubuntu ALL=(ALL) NOPASSWD:ALL
user_1000 ALL=(ALL) NOPASSWD:ALL
- I created another user and got the user id of 1001 and things work. (A bit inconvenient if everyone has to do it, but I did it just to test it out)
- We can also change the user id to a different number (i.e. 1001) and transfer ownership of all the files to the new user id (although it can be a bit destructive if not done carefully)
WDYT? @gpsaggese @sonniki
I ran into a ulimit error when testing:
(client_venv.helpers) alvinoangelo@alvinoangelo:~/src/helpers1$ i docker_bash
21:24:03 - INFO hdbg.py init_logger:1018 > cmd='/home/alvinoangelo/src/venv/client_venv.helpers/bin/invoke docker_bash'
# docker_bash: base_image='', stage='dev', version='', use_entrypoint=True, as_user=True, generate_docker_compose_file=True, container_dir_name='.', skip_pull=False, skip_docker_image_compatibility_check=False
# docker_pull: stage='dev', version=None, skip_pull=False
# docker_login: target_registry='aws_ecr.ck'
21:24:03 - WARN lib_tasks_docker.py docker_login:405 Skipping Docker login process for Helpers or Tutorials
21:24:03 - INFO lib_tasks_docker.py _docker_pull:230 image='causify/helpers:dev'
docker pull causify/helpers:dev
dev: Pulling from causify/helpers
Digest: sha256:43ac049013f992d7efc4a8196bfa15dc0b3f7559e52848adf825c3c7b5c84ca3
Status: Image is up to date for causify/helpers:dev
docker.io/causify/helpers:dev
IMAGE=causify/helpers:dev \
docker compose \
--file /home/alvinoangelo/src/helpers1/devops/compose/tmp.docker-compose.yml \
--env-file devops/env/default.env \
run \
--rm \
--name alvinoangelo.helpers.app.helpers1.20250424_212403 \
--user 1001:$1001 \
app \
bash
WARN[0000] The "CSFY_FORCE_TEST_FAIL" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_ACCESS_KEY_ID" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_DEFAULT_REGION" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_SECRET_ACCESS_KEY" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_AWS_SESSION_TOKEN" variable is not set. Defaulting to a blank string.
WARN[0000] The "CSFY_TELEGRAM_TOKEN" variable is not set. Defaulting to a blank string.
WARN[0000] The "OPENAI_API_KEY" variable is not set. Defaulting to a blank string.
WARN[0000] /home/alvinoangelo/src/helpers1/devops/compose/tmp.docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion
##> devops/docker_run/entrypoint.sh
UID=1001
GID=1
CSFY_USE_HELPERS_AS_NESTED_MODULE=0
CSFY_HOST_GIT_ROOT_PATH=/home/alvinoangelo/src/helpers1
CSFY_GIT_ROOT_PATH=/app
CSFY_HELPERS_ROOT_PATH=/app
> source /app/dev_scripts_helpers/thin_client/thin_client_utils.sh ...
AM_CONTAINER_VERSION='1.2.0'
CSFY_USE_HELPERS_AS_NESTED_MODULE=0
##> devops/docker_run/docker_setenv.sh
> source /app/dev_scripts_helpers/thin_client/thin_client_utils.sh ...
# activate_docker_venv()
# set_path()
PATH=.:./.github:./devops:./helpers:./.vscode:./.git:./papers:./dev_scripts_helpers:./.mypy_cache:./config_root:./docs:./import_check:./linters::/app:/venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# set_up_docker_git()
git --version: git version 2.43.0
/app
# set_pythonpath()
Adding /app to PYTHONPATH
PYTHONPATH=/app:
# Configure env
# set_up_docker_in_docker()
{ "storage-driver": "vfs" }
/etc/init.d/docker: 69: ulimit: error setting limit (Operation not permitted)
It also gives the same error when I do:
if as_user:
docker_cmd_.append(
r"""
--user 1001:1001"""
)
Hmm that's weird but at least it doesn't ask for pwd this time. You can try to delete the image on your local and rerun maybe.
Also it looks like our colleague had a similar issue and added a fix here https://github.com/causify-ai/helpers/blob/5ff3dc086faa689a2635494f906a7f86e1f100e4/dev_scripts_helpers/thin_client/thin_client_utils.sh#L334-L338
/etc/init.d/docker: 69: ulimit: error setting limit (Operation not permitted)
You can bash into the container also and find what that line is. See if you can change the ulimit and start the docker manually from inside the container.
docker run -it --user 1001:1001 --entrypoint bash causify/helpers:dev
Feel free to do some debugging on your setup (in case there's an edge case that we don't know about). I did a quick search and this ulimit error is common in DinD setups.
My fix was to comment out all ulimit calls, including the if block that contains them, in /etc/init.d/docker.
https://github.com/causify-ai/helpers/blob/5ff3dc086faa689a2635494f906a7f86e1f100e4/dev_scripts_helpers/thin_client/thin_client_utils.sh#L334-L338
# Comments out ulimit -Hn 524288.
sudo sed -i 's/ulimit -Hn/# ulimit -Hn/g' /etc/init.d/docker
# Comments out `if` block.
sudo sed -i '/if \[ "\$BASH" \]; then/,/fi/ s/^/#/' /etc/init.d/docker
# Only set the hard limit (soft limit should remain as the system default of 1024):
# ulimit -Hn 524288
# Having non-zero limits causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
# if [ "$BASH" ]; then
# ulimit -u unlimited
# else
# ulimit -p unlimited
# fi
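Before applying the two sed edits to the real /etc/init.d/docker, the same transformation can be dry-run on a copy of the script text. The function below is a sketch with a hypothetical name (`comment_out_ulimit`). It is stricter than the sed commands: it matches the `if [ "$BASH" ]; then` and `fi` lines exactly and skips lines that are already commented out.

```python
from typing import List

IF_BASH_LINE = 'if [ "$BASH" ]; then'


def comment_out_ulimit(script_text: str) -> str:
    """Comment out `ulimit -Hn` lines and the `if [ "$BASH" ]` ... `fi` block."""
    out: List[str] = []
    in_if_block = False
    for line in script_text.splitlines():
        stripped = line.strip()
        if not in_if_block and stripped == IF_BASH_LINE:
            in_if_block = True
        if in_if_block:
            # Prefix every line of the block, including `else` and `fi`.
            out.append("#" + line)
            if stripped == "fi":
                in_if_block = False
        elif "ulimit -Hn" in line and not stripped.startswith("#"):
            out.append(line.replace("ulimit -Hn", "# ulimit -Hn"))
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Comparing the result against the original text shows exactly which lines would change, which makes it easier to judge side effects on the Docker service script.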
I was able to get inside the container with DinD working. However, I ran into a read-write permission error when running invoke lint.
PermissionError: [Errno 13] Permission denied: 'tmp.amp_normalize_import.txt'
This is probably due to --user 1001:1001 having no permissions on host and would be fixed when the original ubuntu user is used.
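Whether a given uid can write to the mounted directory can be estimated from the classic permission bits. This is a rough sketch with a hypothetical name (`can_uid_write`); it ignores ACLs, read-only mounts, and other mechanisms, so it is a first diagnostic rather than a definitive answer.

```python
import os
import stat
from typing import Iterable


def can_uid_write(path: str, uid: int, gids: Iterable[int] = ()) -> bool:
    """Estimate write access for `uid` (with groups `gids`) from mode bits."""
    if uid == 0:
        # Root bypasses ordinary permission bits.
        return True
    st = os.stat(path)
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IWUSR)
    if st.st_gid in set(gids):
        return bool(st.st_mode & stat.S_IWGRP)
    return bool(st.st_mode & stat.S_IWOTH)
```

On the host, calling it with the repo directory and uid 1001 would indicate whether the container user from the test above can create files such as tmp.amp_normalize_import.txt there.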
- @aangelo9 If you created files with the "wrong" user, then only your root can delete them
- Adding the user to etc_sudoers is the right approach
# Linux users.
ubuntu ALL=(ALL) NOPASSWD:ALL
user_1000 ALL=(ALL) NOPASSWD:ALL
- @heanhsok any "easy" way to repro this on one of our systems? Having a way to reproduce it ourselves and fix it is best (at least on your laptop if you can see the error). Just thinking about the nightmare of debugging on other systems. Unless you think this was one-and-done
any "easy" way to repro this on one of our systems?
I don't think we can have this exact setup on our dev server because the VM is running on a Windows host.
Although, in theory, I think we should try to make things work the same way whether it's Linux running on a VM with a Windows host or Mac host, Linux on CI, or Linux on the dev server.
- I was able to reproduce the first error (docker not started) by running a Linux VM on Mac and fixed it by adding the user to etc_sudoers
- I am still unable to reproduce the second issue (ulimit error) though
# Comments out `if` block.
sudo sed -i '/if \[ "\$BASH" \]; then/,/fi/ s/^/#/' /etc/init.d/docker
@aangelo9 The concern I have with this fix is that we're not sure if it will cause side effects to other parts of our dev systems as we're modifying the Docker service management script. It would be good if we could find a reference to an online thread discussing this issue and the fixes similar to "TODO(Vlad): Fix ulimit error: https://github.com/docker/cli/issues/4807."
Also it would be good if we could test this on another machine with the same setup (another intern's machine maybe?) so we can be sure that it's not a machine-specific issue (e.g. VM software version, Linux version, machine type, etc.). WDYT? @sonniki