Failed installation on top of previous soft RAID Linux install
Hello !
I regularly reinstall machines that previously ran Linux (mostly Debian 10) with soft RAID 1 as XCP-ng hosts. Since 8.2 (and I think the problem is older than that), when I recreate the soft RAID 1 from the installer, the installer finishes correctly, but I end up in grub rescue on reboot. My guess is that the XCP-ng installer doesn't delete the old soft RAID correctly, and grub gets confused. I tried to boot via grub rescue, but with no success. My workaround is to boot a live Debian, open a shell, and run this for each disk I want to use in my soft RAID:
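DISK=sdx  # replace with the real disk name, e.g. sda
# Disk size in 512-byte sectors
LBAS=$(cat /sys/block/$DISK/size)
# Zero the first 1024 sectors (partition table and start-of-disk metadata)
dd if=/dev/zero of=/dev/$DISK bs=512 count=1024
# Zero the last 1024 sectors (backup GPT and end-of-disk RAID metadata)
dd if=/dev/zero of=/dev/$DISK bs=512 seek=$(($LBAS-1024)) count=1024
# Clear any md superblock that is still detected on the disk
mdadm --zero-superblock /dev/$DISK
sync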
Then I can relaunch the installer, and XCP-ng installs successfully!
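For anyone who wants to try the wipe logic safely first, here is a minimal sketch of the same two dd steps wrapped in a function. The name wipe_ends and its three arguments are made up for illustration, and the total sector count is passed in explicitly so the function can be exercised on a plain file before pointing it at a real disk.

```shell
#!/bin/sh
# wipe_ends is a hypothetical helper, not part of any existing tool.
# It zeroes the first and last COUNT 512-byte sectors of TARGET.
wipe_ends() {
    target=$1    # e.g. /dev/sdx, or a plain file when testing
    sectors=$2   # total size of the target in 512-byte sectors
    count=$3     # number of sectors to zero at each end

    # Refuse targets too small to hold two non-overlapping wiped regions.
    if [ "$sectors" -lt $((count * 2)) ]; then
        echo "target too small: $sectors sectors" >&2
        return 1
    fi

    # conv=notrunc keeps a regular file at its original size.
    dd if=/dev/zero of="$target" bs=512 count="$count" conv=notrunc 2>/dev/null
    dd if=/dev/zero of="$target" bs=512 seek=$((sectors - count)) \
        count="$count" conv=notrunc 2>/dev/null
}

# Possible use on a real disk (destructive, shown commented out on purpose):
# wipe_ends /dev/sdx "$(cat /sys/block/sdx/size)" 1024
```

Taking the sector count as an argument rather than reading /sys inside the function is only a choice made here to keep the sketch testable.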
Let me know if I can help !
Cécile
Thanks for the report. It was known that creating a soft RAID may fail on previously used disks due to stale metadata, but not that it may succeed and then fail only at grub install stage.
Do you see what the error is in the installer logs (/tmp/install-log from the installer before rebooting, or /var/log/installer/install-log from the installed system that doesn't boot)?
Related to https://github.com/xcp-ng/xcp/issues/107
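To narrow down the relevant part of the installer log, something like the following could help. It is only a sketch: find_boot_errors is a hypothetical helper, and the keyword lists are guesses, since the exact wording of installer log messages is not known here. Only the two log paths come from the message above.

```shell
#!/bin/sh
# find_boot_errors is a hypothetical helper. It prints numbered log lines
# that mention grub or md RAID and also contain a failure-like keyword.
# Both keyword lists are assumptions, not the installer's real log format.
find_boot_errors() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    grep -n -i -E 'grub|mdadm|md[0-9]+' "$log" | grep -i -E 'error|fail|warn'
}

# Possible use, from the installer shell before rebooting:
# find_boot_errors /tmp/install-log
# or from the installed system's disk:
# find_boot_errors /var/log/installer/install-log
```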
Hello !
Unfortunately, I didn't keep the logs, since I needed the machine urgently. I will try to set up a test machine to reproduce this bug ASAP :)
Hello !
I found the time to test installing XCP-ng on top of a Debian (11.3) soft RAID 1. I tried the process on one of my lab machines (a Dell R610 with two 146G HDDs) and on a VM with two 80G disks. I see the same behavior on both machines. I attach the log from the VM: installer.log
Here is what I did to test soft RAID:
- Install a Debian soft raid1, with 1 / ext4 partition
- Boot the Debian and make sure grub and raid work well
- Then install XCP-ng (8.2.1)
- On the disk selector, I have this: xvda xvdb md0. I tried to recreate the RAID with the software RAID panel, but the md0 RAID was present again and I didn't get an md127 as usual. So I selected md0 and continued my install (nothing special, I just selected ext instead of lvm).
As expected, once the installation finished and the server rebooted, I ended up in grub rescue.
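To see which md arrays the kernel has already assembled at this point (for example md0 versus md127), the contents of /proc/mdstat can be summarised. The sketch below assumes awk is available in the shell being used; list_md_arrays is a hypothetical helper name, and the optional argument exists only so it can be run against a saved copy of mdstat.

```shell
#!/bin/sh
# list_md_arrays is a hypothetical helper. It prints one line per md array
# found in an mdstat-format file, followed by its member devices.
list_md_arrays() {
    mdstat=${1:-/proc/mdstat}
    awk '$1 ~ /^md/ && $2 == ":" {
        printf "%s:", $1
        # Members look like "xvda1[0]"; strip the bracketed slot number.
        for (i = 3; i <= NF; i++) {
            if ($i ~ /\[[0-9]+\]/) {
                sub(/\[.*/, "", $i)
                printf " %s", $i
            }
        }
        printf "\n"
    }' "$mdstat"
}

# Possible use: list_md_arrays
```

An array listed here that was not created in the current installer session would be consistent with stale metadata from the previous install, though this check alone does not prove it.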
I think the best fix would be to let the installer delete the old soft RAID, using the commands I provided in my first message.
I can provide you access to my lab machine and/or the VM I use to test, if you want to dig directly.
Thanks again for looking into that, and sorry for the delay !