Set owner and group for redis_chassis to "redis" in database-chassis

Open VenkatCisco opened this issue 1 year ago • 2 comments

Why I did it

With the 202311 (and master) branch, when the database-chassis docker container is brought up by SONiC at boot time, the 'redis' server running within the container is spawned as user 'root'. It accordingly opens the following two paths with root user privileges:

    /var/lib/redis_chassis
    /var/run/redis-chassis

    root@Leaf1:/var/lib# ls -lrt | grep redis
    drwxr-x--- 2 redis redis 4096 Jan 16 06:12 redis
    drwxr-xr-x 2 root root 4096 Feb 21 02:05 redis_chassis

The supervisor configuration, however, tries to spawn the server instance as user 'redis'.

    [program:redis_chassis]
    command=/bin/bash -c "{ [[ -s /var/lib/redis_chassis/dump.rdb ]] || rm -f /var/lib/redis_chassis/dump.rdb; } && mkdir -p /var/lib/redis_chassis && exec /usr/bin/redis-server /etc/redis/redis.conf --bind redis_chassis.server --port 6380 --unixsocket /var/run/redis-chassis/redis_chassis.sock --pidfile /var/run/redis/redis_chassis.pid --dir /var/lib/redis_chassis"
    priority=2
    user=redis
    autostart=true
    autorestart=false
    stdout_logfile=syslog
    stderr_logfile=syslog
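To confirm which account supervisor will use for a given program, the `user=` value can be read straight out of the program section. The helper below is only a sketch: the function name `supervisor_program_user` is hypothetical, and the configuration file path in the usage note is an assumption, not something stated in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the user= value of one [program:...] section.
# Usage: supervisor_program_user <conf-file> <program-name>
supervisor_program_user() {
    local conf="$1" program="$2"
    awk -v section="[program:${program}]" '
        /^\[/ { in_section = ($0 == section) }    # entering a new section
        in_section && /^user[ \t]*=/ {
            sub(/^user[ \t]*=[ \t]*/, "")         # strip the key, keep the value
            print
            exit
        }
    ' "$conf"
}
```

For example, `supervisor_program_user /etc/supervisor/conf.d/supervisord.conf redis_chassis` (path assumed) would print the account that must be able to write to the directories listed earlier.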

At system bring-up time, when database-chassis is instantiated, redis_chassis.server bails out citing permission issues. The fix is therefore required to ensure that redis-chassis and its related databases are initialized.

Work item tracking

  • Microsoft ADO (number only):

How I did it

Changed docker-database-init.sh so that, in the database-chassis container, the redis_chassis directories have owner and group set to "redis".
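The diff itself is not reproduced in this description, so the following is only a sketch of the kind of step such an init script could take, not the actual change. The function name `ensure_owned_dir` is hypothetical; the two paths in the comment are the ones named in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a directory if it is missing, then hand it
# to the given owner and group. chown is idempotent, so repeated calls
# on every container start are harmless.
ensure_owned_dir() {
    local dir="$1" owner="$2" group="$3"
    mkdir -p "$dir" && chown "${owner}:${group}" "$dir"
}

# Intended use inside the database-chassis container (requires root):
#   ensure_owned_dir /var/lib/redis_chassis redis redis
#   ensure_owned_dir /var/run/redis-chassis redis redis
```

Running this before supervisor starts the `redis_chassis` program would mean the directories already belong to the account named in `user=redis`.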

How to verify it

    root@Leaf1:/var/run# ls -lrt
    total 8
    -rw-rw-r-- 1 root utmp 0 Jan 10 00:00 utmp
    drwxrwxrwt 2 root root 4096 Jan 10 00:00 lock
    srwx------ 1 root root 0 Feb 21 02:05 supervisor.sock
    -rw-r--r-- 1 root root 2 Feb 21 02:05 supervisord.pid
    drwxrwxrwx 3 redis redis 100 Feb 21 02:05 redis-chassis
    drwxrwxrwx 3 redis redis 100 Feb 21 02:05 redis

    root@Leaf1:/var/run# ls -lrt /var/lib | grep redis
    drwxr-x--- 2 redis redis 4096 Jan 16 06:12 redis
    drwxr-xr-x 2 redis redis 4096 Feb 21 02:05 redis_chassis

    root@Leaf1:/home/cisco# docker ps | grep database
    11f441c45bbd  docker-database:latest  "/usr/local/bin/dock…"  17 hours ago  Up 17 hours  database
    0d0445eba5e5  docker-database:latest  "/usr/local/bin/dock…"  18 hours ago  Up 17 hours  database-chassis

    root@Leaf1:/home/cisco# docker ps
    CONTAINER ID  IMAGE                               COMMAND                 CREATED       STATUS       PORTS  NAMES
    11f441c45bbd  docker-database:latest              "/usr/local/bin/dock…"  17 hours ago  Up 17 hours         database
    0d0445eba5e5  docker-database:latest              "/usr/local/bin/dock…"  18 hours ago  Up 17 hours         database-chassis
    6a55972db4ef  docker-orchagent:latest             "/usr/bin/docker-ini…"  21 hours ago  Up 17 hours         swss
    1b655c70b1a0  docker-snmp:latest                  "/usr/local/bin/supe…"  5 weeks ago   Up 17 hours         snmp
    d8dd80563ec4  docker-platform-monitor:latest      "/usr/bin/docker_ini…"  5 weeks ago   Up 17 hours         pmon
    8c5a30744ddb  docker-sonic-mgmt-framework:latest  "/usr/local/bin/supe…"  5 weeks ago   Up 17 hours         mgmt-framework
    c54ef5f687f7  docker-lldp:latest                  "/usr/bin/docker-lld…"  5 weeks ago   Up 17 hours         lldp
    4dcb2b637eab  docker-sonic-gnmi:latest            "/usr/local/bin/supe…"  5 weeks ago   Up 17 hours         gnmi
    4bf983355fcb  41594494637b                        "/usr/bin/docker_ini…"  5 weeks ago   Up 17 hours         dhcp_relay
    bc19ff1747d7  docker-router-advertiser:latest     "/usr/bin/docker-ini…"  5 weeks ago   Up 17 hours         radv
    9a847da60ae8  docker-syncd-cisco:latest           "/usr/local/bin/supe…"  5 weeks ago   Up 17 hours         syncd
    9b69a4e9f379  docker-fpm-frr:latest               "/usr/bin/docker_ini…"  5 weeks ago   Up 17 hours         bgp
    58093aca5d1a  docker-teamd:latest                 "/usr/local/bin/supe…"  5 weeks ago   Up 17 hours         teamd
    933250dd3518  docker-eventd:latest                "/usr/local/bin/supe…"  5 weeks ago   Up 17 hours         eventd

    root@Leaf1:/home/cisco# show chassis system-neighbors
    System Port Interface      Neighbor  MAC                Encap Index
    Leaf1|Asic0|Ethernet-IB0   4.4.4.4   f8:e5:7e:7f:6c:00  1104

    root@Leaf1:/home/cisco# docker exec -it database-chassis bash
    root@Leaf1:/# ps ax
    PID  TTY    STAT  TIME  COMMAND
    1    pts/0  Ss+   0:14  /usr/bin/python3 /usr/local/bin/supervisord
    43   pts/0  Sl    0:03  python3 /usr/bin/supervisor-proc-exit-listener --container-name database
    44   pts/0  Sl    1:20  /usr/bin/redis-server redis_chassis.server:6380
    65   pts/0  Sl    0:00  /usr/sbin/rsyslogd -n -iNONE
    136  pts/1  Ss+   0:00  bash
    144  pts/2  Ss    0:00  bash
    152  pts/2  S+    0:00  more /usr/local/bin/docker-database-init.sh
    178  pts/3  Ss    0:00  bash
    184  pts/3  R+    0:00  ps ax
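The ownership part of this manual check could also be scripted. The function below is a minimal sketch under stated assumptions: `check_owner` is a hypothetical name, and it relies on GNU `stat -c`, which is assumed to be available in the container.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every listed path is owned by the
# expected "user:group"; report each mismatch (or missing path) on stderr.
check_owner() {
    local expected="$1"; shift
    local path actual rc=0
    for path in "$@"; do
        actual="$(stat -c '%U:%G' "$path" 2>/dev/null)" || actual="missing"
        if [ "$actual" != "$expected" ]; then
            echo "MISMATCH ${path}: ${actual} (expected ${expected})" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Inside the database-chassis container this would be invoked as `check_owner redis:redis /var/lib/redis_chassis /var/run/redis-chassis`; a non-zero exit status indicates that at least one directory does not belong to redis.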

Which release branch to backport (provide reason below if selected)

  • [ ] 201811
  • [ ] 201911
  • [ ] 202006
  • [ ] 202012
  • [ ] 202106
  • [ ] 202111
  • [ ] 202205
  • [ ] 202211
  • [ ] 202305

Tested branch (Please provide the tested image version)

  • [x] 202311
  • [ ]

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

VenkatCisco, Mar 22 '24 00:03

I assume the normal redis db also needs to create these two directories; why is it not subject to this owner/group issue?

lguohan, Mar 30 '24 07:03

I assume the normal redis db also needs to create these two directories; why is it not subject to this owner/group issue?

This is required to address the inconsistency in privileges that we run into when SONiC is spawned in a disaggregated environment (the PR for the feature is at https://github.com/VenkatCisco/SONiC/blob/venkatg/voq_hld_rev2/doc/voq/architecture.md).

To summarize: SONiC on DSF will leverage database-chassis and run this docker on each leaf device (instead of having each router point to a central database-chassis). Each router will have the redis-chassis IP address pointing to its own loopback interface (127.0.0.1).

Without this fix:

  1. In database-chassis:

         root@r2-rp:/var/run# ls -lrt
         drwxr-xr-x 3 root root 100 Apr 2 09:29 redis-chassis
         drwxr-xr-x 3 root root 100 Apr 2 09:29 redis
         root@r2-rp:/var/run# cd ../lib
         root@r2-rp:/var/lib# ls -lrt
         drwxr-x--- 2 redis redis 4096 Sep 7 2023 redis
         drwxr-xr-x 2 root root 4096 Dec 2 00:16 redis_chassis

  2. In database:

         root@r2-rp:/var/run# ls -lrt
         drwxr-xr-x 3 root root 100 Apr 2 09:29 redis-chassis
         drwxr-xr-x 3 root root 100 Apr 2 09:29 Redis
         root@r2-rp:/var/run# cd ../lib
         root@r2-rp:/var/lib# ls -lrt
         drwxr-xr-x 2 root root 4096 Dec 2 00:16 redis_chassis
         drwxr-x--- 1 redis redis 4096 Apr 2 09:29 Redis

With this fix:

  1. In database-chassis container:

         root@Leaf0:/var/run# ls -lrt
         drwxr-xr-x 3 redis redis 100 Apr 12 00:03 redis-chassis
         drwxr-xr-x 3 redis redis 100 Apr 12 00:03 redis
         root@Leaf0:/var/run# cd ../lib
         root@Leaf0:/var/lib# ls -lrt
         drwxr-x--- 2 redis redis 4096 Apr 7 15:38 redis
         drwxr-xr-x 2 redis redis 4096 Apr 11 16:53 redis_chassis

  2. In database container:

         root@Leaf0:/var/run# ls -lrt redis
         srwxrw---- 1 redis 1001 0 Apr 12 00:03 redis.sock
         -rw-r--r-- 1 redis redis 3 Apr 12 00:03 redis.pid
         root@Leaf0:/var/lib# cd ../lib
         root@Leaf0:/var/lib# ls -lrt
         drwxr-xr-x 2 root root 4096 Apr 11 16:54 redis_chassis
         drwxr-x--- 1 redis redis 4096 Apr 12 00:03 redis

VenkatCisco, Apr 15 '24 23:04

@lguohan, when you get a chance, could you review, comment on, or approve this PR?

VenkatCisco, Aug 03 '24 17:08