Command [vgremove schains -y] failed, error: Logical volume schains/shared-space in use.

Open kucharskim opened this issue 1 year ago • 6 comments

I am upgrading to CONTAINER_CONFIGS_STREAM=2.1.16 with skale node-cli 2.3.0, and the upgrade consistently fails on all my nodes with the following error:

Cleaning failed with error: Command [vgremove schains -y] failed, error:   Logical volume schains/shared-space in use.

kucharskim avatar May 10 '23 19:05 kucharskim

# /root/.env
ENV_TYPE=mainnet
DISK_MOUNTPOINT=/dev/dm-0
ENDPOINT=http://XXX:8545
IMA_ENDPOINT=http://XXX:8545
MONITORING_CONTAINERS=True
SGX_SERVER_URL=https://XXX:1026
SGX_URL=$SGX_SERVER_URL
SKALE_NODE_CLI_VERSION=1.2.1

DOCKER_LVMPY_STREAM=1.0.2-stable.0
CONTAINER_CONFIGS_STREAM=2.1.16
FILEBEAT_HOST=filebeat.mainnet.skalenodes.com:5000
DISABLE_IMA=False
IMA_CONTRACTS_ABI_URL=https://raw.githubusercontent.com/skalenetwork/skale-network/master/releases/mainnet/IMA/1.3.2/mainnet/abi.json
MANAGER_CONTRACTS_ABI_URL=https://raw.githubusercontent.com/skalenetwork/skale-network/master/releases/mainnet/skale-manager/1.9.2/skale-manager-1.9.2-mainnet-abi.json

kucharskim avatar May 10 '23 19:05 kucharskim

root@a04:~# pvs
  PV               VG      Fmt  Attr PSize PFree
  /dev/md5         skale   lvm2 a--  2.48t    0 
  /dev/skale/skale schains lvm2 a--  2.48t 1.52t

root@a04:~# vgs
  VG      #PV #LV #SN Attr   VSize VFree
  schains   1   5   0 wz--n- 2.48t 1.52t
  skale     1   1   0 wz--n- 2.48t    0 

root@a04:~# lvs
  LV                        VG      Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  adorable-quaint-bellatrix schains -wi-ao---- <196.42g                                                    
  curly-red-alterf          schains -wi-ao---- <196.42g                                                    
  haunting-devoted-deneb    schains -wi-ao---- <196.42g                                                    
  portly-passionate-sirius  schains -wi-ao---- <196.42g                                                    
  shared-space              schains -wi-ao---- <198.19g                                                    
  skale                     skale   -wi-ao----    2.48t

kucharskim avatar May 10 '23 19:05 kucharskim

These commands are from various machines, but the problem is exactly the same on all of them.

kucharskim avatar May 10 '23 19:05 kucharskim

root@c03:~# vgremove schains -y
  Logical volume schains/shared-space in use.

root@c03:~# lvs
  LV                       VG      Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  elated-tan-skat          schains -wi-a----- <196.42g                                                    
  gargantuan-wealthy-zosma schains -wi-a----- <196.42g                                                    
  light-vast-diphda        schains -wi-a----- <196.42g                                                    
  plain-rotanev            schains -wi-a----- <196.42g                                                    
  shared-space             schains -wi-ao---- <198.19g                                                    
  skale                    skale   -wi-ao----    2.48t
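
In the lvs output above, the sixth character of the Attr column is "o" for shared-space and skale but "-" for the other volumes; in LVM that position marks a device that is open (for example, mounted). A minimal sketch for listing such volumes from lvs output on stdin follows. The function name open_lvs is hypothetical and not part of node-cli or LVM.

```shell
# Hypothetical helper: read `lvs` output on stdin and print VG/LV for every
# logical volume whose Attr field has "o" (device open) in position 6.
open_lvs() {
    # NR > 1 skips the header row; $1 = LV, $2 = VG, $3 = Attr.
    awk 'NR > 1 && substr($3, 6, 1) == "o" { print $2 "/" $1 }'
}
```

It could be used as `lvs | open_lvs` to see which volumes would block vgremove.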

root@c03:~# grep -F shared-space /proc/mounts 
/dev/mapper/schains-shared--space /mnt/schains-shared-space btrfs rw,relatime,ssd,space_cache,subvolid=5,subvol=/ 0 0

root@c03:~# fuser -m /mnt/schains-shared-space 2>/dev/null | wc -l
0

root@c03:~# find /mnt/schains-shared-space -ls
      256     16 drwxr-xr-x   1 root     root            8 Apr 13 08:46 /mnt/schains-shared-space
     1756      0 drwxr-xr-x   1 root     root            0 Apr 13 08:46 /mnt/schains-shared-space/data

The solution is to unmount the problematic LVM device by hand:

root@c03:~# umount /mnt/schains-shared-space
root@c03:~# echo $?
0

kucharskim avatar May 10 '23 19:05 kucharskim

root@c03:~# vgremove schains -y
  Logical volume "shared-space" successfully removed
  Logical volume "plain-rotanev" successfully removed
  Logical volume "elated-tan-skat" successfully removed
  Logical volume "gargantuan-wealthy-zosma" successfully removed
  Logical volume "light-vast-diphda" successfully removed
  Volume group "schains" successfully removed
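
The manual workaround above (find where the logical volume is mounted, unmount it, then run vgremove) could be scripted roughly as below. This is only a sketch: the helper names and the DRY_RUN switch are hypothetical, it is not what node-cli does internally, and it assumes mount points without spaces. The doubled hyphen in the device-mapper name matches the /proc/mounts entry shown earlier.

```shell
# Sketch only: helper names and DRY_RUN are hypothetical, not part of node-cli.

# Build the device-mapper path for a VG/LV pair. LVM doubles any hyphen
# inside a name, so schains + shared-space -> /dev/mapper/schains-shared--space.
lv_mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# Print every mount point of a device listed in a mounts table
# (second argument, defaults to /proc/mounts).
mounts_for_device() {
    awk -v dev="$1" '$1 == dev { print $2 }' "${2:-/proc/mounts}"
}

# Unmount an LV wherever it is mounted. With DRY_RUN=1 (the default here)
# the umount commands are only printed, not executed.
unmount_lv() {
    dev=$(lv_mapper_path "$1" "$2")
    mounts_for_device "$dev" "${3:-/proc/mounts}" | while read -r mnt; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "umount $mnt"
        else
            umount "$mnt"
        fi
    done
}
```

Running `unmount_lv schains shared-space` prints the umount command it would issue; with `DRY_RUN=0` set in the environment it performs the unmount, after which `vgremove schains -y` can be retried.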

kucharskim avatar May 10 '23 19:05 kucharskim

Side comment: node-cli exits with status 0 even when the command fails:

# skale node update --yes /root/.env
...
  File "socket.py", line 706, in readinto
  File "ssl.py", line 1278, in recv_into
  File "ssl.py", line 1134, in read
TimeoutError: The read operation timed out

# echo $?
0
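
Until the exit code can be trusted, a caller could guard against this by inspecting the output as well. The wrapper below is a hedged sketch: run_checked is a hypothetical name, and the pattern it greps for (a line starting with "Traceback" or with a Python-style "SomethingError:") is an assumption based on the traceback shown above, not a documented node-cli output format.

```shell
# Hypothetical wrapper: run a command, echo its combined output, and return
# non-zero if the command fails OR its output contains a Python exception line.
run_checked() {
    out=$("$@" 2>&1)
    status=$?
    printf '%s\n' "$out"
    # Propagate a real non-zero exit status unchanged.
    if [ "$status" -ne 0 ]; then
        return "$status"
    fi
    # Assumed failure markers: "Traceback ..." or "<Name>Error: ..." at line start.
    if printf '%s\n' "$out" | grep -Eq '^(Traceback|[A-Za-z]+Error:)'; then
        return 1
    fi
    return 0
}
```

For example, `run_checked skale node update --yes /root/.env || echo "update failed"` would report a failure for the timeout shown above even though the CLI itself returned 0.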

kucharskim avatar May 10 '23 19:05 kucharskim