nimbus-eth2

Error writing to data dir volume

Open cbermudez97 opened this issue 3 years ago • 4 comments

Describe the bug I'm running a beacon node with docker compose while persisting the node data to a volume. When started, the node fails after some errors about the permissions of the data dir I used. At first my data folder had 755 as permissions. Starting the node then fails with this:

cl-beacon-nimbus  | [Chronicles] Log message not delivered: [Chronicles] A writer was not configured for a dynamic log output device. Log message not delivered: WRN 2022-06-03 15:33:16.602+00:00 Data directory has insecure permissions. Correcting them. data_dir=/data current_permissions=0755 required_permissions=0700

Following the instructions there, I changed my data folder permissions to 700. Running the node again then shows this:

cl-beacon-nimbus  | /home/user/nimbus-eth2/vendor/nim-libp2p/libp2p/stream/bufferstream.nim(438) NimMain
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(2069) main
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1937) handleStartUpCmd
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1690) doRunBeaconNode
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1503) createPidFile
cl-beacon-nimbus  | /home/user/nimbus-eth2/vendor/nimbus-build-system/vendor/Nim/lib/system/io.nim(706) writeFile
cl-beacon-nimbus  | Error: unhandled exception: cannot open: /data/beacon_node.pid [IOError]
cl-beacon-nimbus exited with code 1

To Reproduce

  1. Create docker-compose.yml with:
version: '3.9'
services:
  beacon:
    stop_grace_period: 30s
    container_name: cl-beacon-nimbus
    restart: unless-stopped
    image: statusim/nimbus-eth2:amd64-latest
    volumes:
      - ./data:/data
      - ./jwtsecret:/tmp/jwt/jwtsecret
    ports:
      - 9000:9000/tcp
      - 9000:9000/udp
      - 5054:5054/tcp
    expose:
      - 5051
    command:
      - --network=merge-testnets/kiln
      - --data-dir=/data
      - --tcp-port=9000
      - --udp-port=9000
      - --web3-url=ws://127.0.0.1:8151
      - --max-peers=50
      - --rest
      - --rest-address=0.0.0.0
      - --rest-port=5051
      - --rest-allow-origin=*
      - --metrics
      - --metrics-address=0.0.0.0
      - --metrics-port=5054
      - --jwt-secret="/tmp/jwt/jwtsecret"
      - --terminal-total-difficulty-override=100000000000000000000000
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "10"
  2. Run openssl rand -hex 32 > jwtsecret
  3. Create ./data and change its permissions to 755
  4. Run docker compose up beacon
  5. Change ./data permissions to 700
  6. Run docker compose up beacon

Additional All of the above commands were run as root.

cbermudez97 avatar Jun 03 '22 15:06 cbermudez97

I can reproduce this issue. More docs are needed about running in docker.

mugiwara-pirate avatar Jul 04 '23 09:07 mugiwara-pirate

Besides having the right permissions, the data dir should be owned by the same user ID that is used by the docker process. The correct usage of user IDs within docker is a somewhat complicated topic, which is explored in the following article:

https://medium.com/@mccode/understanding-how-uid-and-gid-work-in-docker-containers-c37a01d01cf
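
The two conditions mentioned here (owner UID and permission bits) can be checked on the host before starting the container. The following is only a minimal sketch: check_data_dir is a hypothetical helper, it assumes GNU coreutils stat, and the expected UID is a parameter because this comment does not say which UID the image uses. The 0700 requirement is taken from the warning in the original report.

```shell
#!/usr/bin/env bash
# Sketch: report whether a host directory is owned by the expected UID
# and has the 0700 mode the node asks for in its warning.
# Assumes GNU coreutils `stat` (the -c option is not available on BSD/macOS).

check_data_dir() {
  local dir="$1" want_uid="$2"
  local owner mode
  owner="$(stat -c '%u' "$dir")" || return 2   # numeric owner UID
  mode="$(stat -c '%a' "$dir")" || return 2    # octal permission bits, e.g. 700
  echo "dir=$dir owner_uid=$owner mode=$mode expected_uid=$want_uid"
  if [ "$owner" != "$want_uid" ]; then
    echo "owner mismatch"
    return 1
  fi
  if [ "$mode" != "700" ]; then
    echo "mode is not 700"
    return 1
  fi
  echo "ok"
}
```

A call such as check_data_dir ./data 1000 would then print the current owner and mode; the UID argument is a placeholder. If Docker is available, docker image inspect --format '{{.Config.User}}' <image> prints the user configured in the image, which may be a name rather than a numeric UID.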

zah avatar Jul 04 '23 10:07 zah

@zah Thanks for the link. Indeed, executing the following command makes the error go away in my case:

sudo chown 1000:1000 -R <MY_HOST_DATA_DIR>
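
For a fresh setup, the same fix can be applied when the directory is created. This is a sketch under assumptions, not an official procedure: prepare_data_dir is a hypothetical helper, it assumes GNU coreutils stat, and the 1000:1000 default is simply the value that worked in this comment, so it may differ for other images or tags.

```shell
#!/usr/bin/env bash
# Sketch: create the data directory with mode 0700 and give it to the
# UID:GID that the container process runs as.
# The 1000:1000 default is an assumption taken from the chown command above.

prepare_data_dir() {
  local dir="$1" uid="${2:-1000}" gid="${3:-1000}"
  mkdir -p "$dir" || return 1
  chmod 700 "$dir" || return 1
  # Only call chown when ownership actually differs; giving a directory
  # to another user normally requires root (hence sudo in the command above).
  if [ "$(stat -c '%u:%g' "$dir")" != "$uid:$gid" ]; then
    chown -R "$uid:$gid" "$dir" || return 1
  fi
}
```

Run as root, prepare_data_dir ./data would leave ./data owned by 1000:1000 with mode 700 before docker compose up is started.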

But perhaps some updates could be made so that such manual configuration is not necessary for users.

mugiwara-pirate avatar Jul 05 '23 05:07 mugiwara-pirate

Hit by this as well. But I'm trying to run Nimbus rootless in Podman (https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md)

I have a suspicion that having the binary built as the user named user, as is done there, is the cause of these woes: https://github.com/status-im/nimbus-eth2/blob/7c731a2bfb4820ef5c08e5e35df635b986ed4857/docker/dist/Dockerfile.amd64#L6C1-L19

Instead of putting the binaries in /home/user, they could likely be put in /usr/local/bin, and we could remove the dependency on this user named user.

Some references:

  • https://faun.pub/set-current-host-user-for-docker-container-4e521cef9ffc
  • https://stackoverflow.com/questions/64857370/using-current-user-when-running-container-in-docker-compose
  • https://jtreminio.com/blog/running-docker-containers-as-current-host-user/
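
The approach these references describe is to run the container process as the invoking host user, so that files in the bind mount are owned by that user. Below is a rough sketch of what that looks like with docker run: docker_run_as_host_user is a hypothetical helper that only prints the command, and whether the current Nimbus image starts correctly under an arbitrary UID is not confirmed anywhere in this thread.

```shell
#!/usr/bin/env bash
# Sketch: build a `docker run` command line that runs the container as the
# current host user. The command is echoed, not executed, so it can be reviewed.
# ASSUMPTION: the image tolerates an arbitrary UID; this thread does not confirm it.

docker_run_as_host_user() {
  local host_dir="$1" image="$2"
  shift 2
  # --user and --volume are standard `docker run` options;
  # remaining arguments are passed through to the container command.
  echo docker run --rm \
    --user "$(id -u):$(id -g)" \
    --volume "$host_dir:/data" \
    "$image" "$@"
}
```

For example, docker_run_as_host_user "$PWD/data" statusim/nimbus-eth2:amd64-latest --data-dir=/data prints the full command; dropping the echo would execute it.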

Podman CLI commands

Requires mapping a volume to /home/node:

podman pod create \
    --name taiko-a6-katla \
    --volume $HOME/pod-data/taiko-a6-katla:/home/node

Lighthouse (working)

podman run -dt \
  --pod taiko-a6-katla \
  --name tko-a6-l1-cl-lighthouse \
    docker.io/sigp/lighthouse:latest-modern \
      lighthouse bn \
        --datadir /home/node/l1-cl/lighthouse \
        --network holesky \
        --execution-endpoint http://localhost:8551 \
        --execution-jwt /home/node/jwtsecret \
        --http \
        --http-address 0.0.0.0 \
        --metrics \
        --metrics-address 0.0.0.0 \
        --checkpoint-sync-url https://checkpoint-sync.holesky.ethpandaops.io

Nimbus

podman run -dt \
  --pod taiko-a6-katla \
  --name tko-a6-l1-cl-nimbus-checkpoint-sync \
    docker.io/statusim/nimbus-eth2:amd64-latest \
        trustedNodeSync \
        --data-dir=/home/node/l1-cl/nimbus/beacon_node \
        --network=holesky \
        --non-interactive \
        --web3-url=http://localhost:8551 \
        --with-deposit-snapshot \
        --backfill=false \
        --trusted-node-url=http://testing.holesky.beacon-api.nimbus.team

mratsim avatar Jan 25 '24 11:01 mratsim