Missing packages - Debian 12 (bookworm)

Open Moal68 opened this issue 1 year ago • 10 comments

Cannot upgrade to Debian 12 because packages are missing.

Moal68 avatar Sep 06 '23 09:09 Moal68

I would love to see Bookworm added.


I was able to upgrade from Debian 11 to Debian 12 just fine. (You do this at your own risk.)

grep '^Package:' /var/lib/apt/lists/hwraid.le-vert.net_debian_dists_bullseye_main_binary-amd64_Packages
Package: megacli
Package: megaclisas-status

This tells me which packages from this repo I currently have. So I can find details for each of those packages...

dpkg -s megacli
Package: megacli
Status: install ok installed
Priority: optional
Section: admin
Installed-Size: 6695
Maintainer: Adam Cécile (Le_Vert) <[email protected]>
Architecture: amd64
Source: megacli (8.07.14-3)
Version: 8.07.14-3+Debian.11.bullseye
Replaces: megaclisas
Provides: megaclisas
Depends: libc6 (>= 2.3), libgcc-s1 (>= 3.0), libncurses5 (>= 6), libstdc++6 (>= 4.1.1)
Conflicts: megaclisas
Description: LSI Logic MegaRAID SAS MegaCLI
 Tool to read and setup LSI Logic MegaRAID SAS HW RAID HBAs.
Homepage: http://www.lsi.com/support/Pages/Download-Results.aspx?keyword=megacli

Here, there are only a few dependencies, and I can check that each of them is available within the Bookworm repository. One thing to watch out for is whether the newer release (Bookworm) deprecates a commonly used GCC/libc version, which doesn't seem to be the case yet.
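The dependency check described above could be scripted roughly like this. It is a minimal sketch, not a tested procedure: `dep_names` and `check_deps` are hypothetical helper names, and the parsing only handles the simple `Depends:` format shown in the `dpkg -s` output (version constraints in parentheses, optional `|` alternatives).

```shell
# Turn a dpkg "Depends:" value into bare package names, one per line.
# Version constraints in parentheses and "|" alternatives are dropped.
dep_names() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | sed -e 's/([^)]*)//g' -e 's/|.*//' -e 's/[[:space:]]//g' -e '/^$/d'
}

# Hypothetical helper: show what the configured repositories offer for each
# dependency of an installed package. Meant to be run on a Bookworm system
# (or chroot) after "apt update".
check_deps() {
    for pkg in $(dep_names "$(dpkg-query -W -f='${Depends}' "$1")"); do
        apt-cache policy "$pkg"
    done
}

# Usage (not run automatically):
#   check_deps megacli
```

A dependency that `apt-cache policy` reports with no candidate version is one that would block the install on the newer release.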

Again, I am not recommending you take this route.

daryltucker avatar Sep 09 '23 00:09 daryltucker

I'm not quite sure what's going on with Bookworm support, but if I have my raid card (megasas 2008 falcon) set to JBOD, would Linux recognize the drives directly without having to have megaraid management software installed?

After doing some research, it seems it may be possible to flash my raid card with an IT firmware (pass-through) so I can just use it as a plain drive controller for software raid and the like, something I wasn't aware was an option. It is apparently a little faster than plain JBOD mode, but risky: if you flash the card wrong you can brick it. I might do that if Bookworm isn't eventually supported.

Betonhaus avatar Nov 23 '23 08:11 Betonhaus

Hi, you may use the bullseye repository on a bookworm machine with (almost) no problems.

These tools are not required to access the disks, but if you let the controller manage a logical disk, then these tools are the only way to get notified if a disk fails.

eppesuig avatar Jan 13 '24 14:01 eppesuig

I'm not sure how to make the bullseye repository work under bookworm, but I'll look into it.

Would it make sense to just flash the raid card to IT mode so that it presents itself to the computer as a basic drive controller, then no software at all is needed to manage it?

Betonhaus avatar Jan 15 '24 22:01 Betonhaus

Hello, if the reports of megacli segfaulting on certain commands are correct (I have no reason to doubt them), then it might be up to Broadcom to deliver an updated version, as the original binary is basically the original rpm package repackaged as a .deb using alien. It might also be time to use the storcli command on newer kernels and (painfully) accept Broadcom's decision that this is the CLI they are backing. A super easy storage monitor like megaclisas-statusd will need a rewrite, though, to get a monitoring daemon again…

On the bright side, Broadcom offers an Ubuntu package of storcli.

regards, hk

hknet avatar Jan 16 '24 02:01 hknet

In order to use the bullseye repository, just change the release name in the apt sources file where you wrote the address of the hwraid.le-vert.net repository.
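As a rough sketch of that change: if an upgrade rewrote every release name to bookworm, only the hwraid line needs to go back to bullseye. The function name and the file path in the usage comment are assumptions, and the sketch presumes the classic one-line `deb ...` format.

```shell
# Filter: rewrite the release name only on lines that mention the hwraid repo.
# Lines for other repositories pass through unchanged.
use_bullseye_for_hwraid() {
    sed '/hwraid\.le-vert\.net/ s/bookworm/bullseye/'
}

# Usage (not run automatically; the file path is an assumption, use whichever
# sources file actually holds the hwraid.le-vert.net entry):
#   f=/etc/apt/sources.list.d/hwraid.list
#   use_bullseye_for_hwraid < "$f" > "$f.new" && mv "$f.new" "$f"
#   apt update
```

Restricting the substitution to the hwraid line keeps the main Debian entries on bookworm.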

eppesuig avatar Jan 16 '24 07:01 eppesuig

If you have your raid card set to JBOD and have no intention of changing it, could you upgrade to Debian 12 and still use megaclisas-status to look up whether the drives are connected and good?

Betonhaus avatar Mar 07 '24 00:03 Betonhaus

@Betonhaus in my case the controller is not in JBOD. I think it has the IR firmware. As you can see, on Debian 12 there is a segmentation fault in the controller information:

root@mantide:~# megaclisas-status
-- Controller information --
Segmentation fault
-- ID | H/W Model                | RAM    | Temp | BBU    | Firmware     
c0    | LSI MegaRAID SAS 9285-8e | 1024MB | 255C | Good   | FW: 23.1.1-0017 

-- Array information --
-- ID | Type   |    Size |  Strpsz | Flags | DskCache |   Status |  OS Path | CacheCade |InProgress   
c0u0  | RAID-5 |  18190G |   64 KB | RA,WB | Disabled |  Optimal | /dev/sdb | None      |None         

-- Disk information --
-- ID   | Type | Drive Model                              | Size     | Status          | Speed    | Temp | Slot ID  | LSI ID  
c0u0p0  | HDD  | V1GK5YUG WDC WD4003FRYZ-01F0DB0 01.01H01 | 3.637 TB | Online, Spun Up | 6.0Gb/s  | 41C  | [252:0]  | 24      
c0u0p1  | HDD  | V1GJG4KB WDC WD4003FRYZ-01F0DB0 01.01H01 | 3.637 TB | Online, Spun Up | 6.0Gb/s  | 47C  | [252:1]  | 22      
c0u0p2  | HDD  | V1GJEZKB WDC WD4003FRYZ-01F0DB0 01.01H01 | 3.637 TB | Online, Spun Up | 6.0Gb/s  | 50C  | [252:2]  | 21      
c0u0p3  | HDD  | V1GJG3ZB WDC WD4003FRYZ-01F0DB0 01.01H01 | 3.637 TB | Online, Spun Up | 6.0Gb/s  | 51C  | [252:4]  | 26      
c0u0p4  | HDD  | V1GJG4PB WDC WD4003FRYZ-01F0DB0 01.01H01 | 3.637 TB | Online, Spun Up | 6.0Gb/s  | 52C  | [252:5]  | 25      
c0u0p5  | HDD  | V1GJG02B WDC WD4003FRYZ-01F0DB0 01.01H01 | 3.637 TB | Online, Spun Up | 6.0Gb/s  | 50C  | [252:6]  | 23      
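Since the array and disk tables are still printed despite the segfault, a monitoring check could in principle key off the Status column of the array table. This is only a sketch based on the column layout in the output above: `non_optimal_arrays` is a hypothetical name, and the parsing assumes Status stays the seventh `|`-separated field.

```shell
# Read megaclisas-status output on stdin and print "ID STATUS" for every
# array whose Status column is anything other than "Optimal".
non_optimal_arrays() {
    awk -F'|' '
        /^-- Array information --/ { in_arrays = 1; next }
        /^-- Disk information --/  { in_arrays = 0 }
        in_arrays && /^c[0-9]+u[0-9]+ / {
            id = $1; status = $7
            gsub(/^[ \t]+|[ \t]+$/, "", id)       # trim padding around the ID
            gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim padding around the status
            if (status != "Optimal") print id, status
        }'
}

# Usage (not run automatically):
#   megaclisas-status 2>/dev/null | non_optimal_arrays
```

Empty output means every listed array reported Optimal; it says nothing about the battery check that crashes.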

Moreover, even the tools for updating the firmware are not available on Debian 12.

eppesuig avatar Mar 07 '24 09:03 eppesuig

@eppesuig does it give any further details about the segmentation fault? Is it possible the fault is caused by raid corruption unrelated to the upgrade to Debian 12?

Betonhaus avatar Mar 07 '24 16:03 Betonhaus

@Betonhaus every time I issue that command, I find two new lines in /var/log/syslog. They are:

2024-03-08T11:45:48.849263+01:00 mantide kernel: [2684412.088106] megacli.real[2446368]: segfault at 0 ip 000000000051bf72 sp 00007ffc46be0950 error 4 in megacli.real[400000+28c000] likely on CPU 0 (core 0, socket 0)
2024-03-08T11:45:48.849278+01:00 mantide kernel: [2684412.088116] Code: e9 3f 02 00 00 48 8d 55 e0 48 8b 75 f0 48 8b 7d f8 e8 64 fd ff ff 0f b6 c0 89 45 ec 80 7d ca 00 0f 84 15 02 00 00 48 8b 45 e0 <83> 38 00 0f 84 08 02 00 00 83 7d ec 00 0f 85 fe 01 00 00 48 8d 7d
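For reading the first of those lines: "segfault at 0" is the faulting address (a NULL pointer), "error 4" is the x86 page-fault code for a user-mode read of an unmapped page, and the bracketed pair after the binary name is the mapping's base address and size. The offset of the crashing instruction inside that mapping can be worked out as below; `segfault_offset` is a hypothetical helper name and the parsing assumes exactly the message layout shown above.

```shell
# Given a kernel "segfault at ..." log line, print the offset of the faulting
# instruction pointer relative to the base of the mapping it occurred in.
segfault_offset() {
    ip=$(printf '%s\n' "$1" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    base=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    printf '0x%x\n' $(( 0x$ip - 0x$base ))
}

# Usage (not run automatically):
#   segfault_offset "$(grep 'megacli.real.*segfault' /var/log/syslog | tail -n 1)"
```

For the line above this gives 0x11bf72 (0x51bf72 minus 0x400000). If the offset is the same on every run, that points at a fixed spot in the binary rather than at random corruption.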

Running it in debug mode I get some more details:

root@mantide:~# megaclisas-status --debug
# DEBUG (107) : Looking for MegaCli64 in PATH...
# DEBUG (96) : Looking in PATH /usr/local/sbin
# DEBUG (96) : Looking in PATH /usr/local/bin
# DEBUG (96) : Looking in PATH /usr/sbin
# DEBUG (96) : Looking in PATH /usr/bin
# DEBUG (96) : Looking in PATH /sbin
# DEBUG (96) : Looking in PATH /bin
# DEBUG (96) : Looking in PATH /opt/MegaRAID/MegaCli
# DEBUG (96) : Looking in PATH /ms/dist/hwmgmt/bin
# DEBUG (96) : Looking in PATH /opt/MegaRAID/perccli
# DEBUG (96) : Looking in PATH /opt/MegaRAID/storcli
# DEBUG (96) : Looking in PATH /opt/lsi/storcli
# DEBUG (96) : Looking in PATH /usr/sbin
# DEBUG (107) : Looking for MegaCli in PATH...
# DEBUG (96) : Looking in PATH /usr/local/sbin
# DEBUG (96) : Looking in PATH /usr/local/bin
# DEBUG (96) : Looking in PATH /usr/sbin
# DEBUG (96) : Looking in PATH /usr/bin
# DEBUG (96) : Looking in PATH /sbin
# DEBUG (96) : Looking in PATH /bin
# DEBUG (96) : Looking in PATH /opt/MegaRAID/MegaCli
# DEBUG (96) : Looking in PATH /ms/dist/hwmgmt/bin
# DEBUG (96) : Looking in PATH /opt/MegaRAID/perccli
# DEBUG (96) : Looking in PATH /opt/MegaRAID/storcli
# DEBUG (96) : Looking in PATH /opt/lsi/storcli
# DEBUG (96) : Looking in PATH /usr/sbin
# DEBUG (96) : Looking in PATH /opt/MegaRAID/MegaCli
# DEBUG (96) : Looking in PATH /ms/dist/hwmgmt/bin
# DEBUG (96) : Looking in PATH /opt/MegaRAID/perccli
# DEBUG (96) : Looking in PATH /opt/MegaRAID/storcli
# DEBUG (96) : Looking in PATH /opt/lsi/storcli
# DEBUG (96) : Looking in PATH /usr/sbin
# DEBUG (107) : Looking for megacli in PATH...
# DEBUG (96) : Looking in PATH /usr/local/sbin
# DEBUG (96) : Looking in PATH /usr/local/bin
# DEBUG (96) : Looking in PATH /usr/sbin
# DEBUG (100) : Found "megacli" at /usr/sbin/megacli
# DEBUG (110) : Will use this executable: /usr/sbin/megacli
# DEBUG (146) : Not a Cached value: /usr/sbin/megacli -adpCount -NoLog
-- Controller information --
# DEBUG (146) : Not a Cached value: /usr/sbin/megacli -AdpAllInfo -a0 -NoLog
# DEBUG (146) : Not a Cached value: /usr/sbin/megacli -AdpBbuCmd -GetBbuStatus -a0 -NoLog
Segmentation fault
-- ID | H/W Model                | RAM    | Temp | BBU    | Firmware     
c0    | LSI MegaRAID SAS 9285-8e | 1024MB | 255C | Good   | FW: 23.1.1-0017 
[...]

And, if I execute the last command at the prompt, I see the very same error:

# /usr/sbin/megacli -AdpBbuCmd -GetBbuStatus -a0 -NoLog
                                     
BBU status for Adapter: 0

BatteryType: iBBU-09
Voltage: 4076 mV
Current: 0 mA
Temperature: 23 C
Battery State: Optimal
Segmentation fault

So, probably it is related to the battery check. Running the same command in strace shows some more details:

[...]
[pid 2447557] ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x4d, 0x1, 0x194), 0x175e140) = 0
[pid 2447557] ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x4d, 0x1, 0x194), 0x175e140) = 0
[pid 2447557] write(1, "BBU status for Adapter: 0\n\n", 27BBU status for Adapter: 0

) = 27
[pid 2447557] write(1, "BatteryType: iBBU-09\n", 21BatteryType: iBBU-09
) = 21
[pid 2447557] write(1, "Voltage: 4077 mV\nCurrent: 0 mA\nT"..., 49Voltage: 4077 mV
Current: 0 mA
Temperature: 23 C
) = 49
[pid 2447557] write(1, "Battery State: Optimal\n", 23Battery State: Optimal
) = 23
[pid 2447557] ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x4d, 0x1, 0x194), 0x175e140) = 0
[pid 2447557] ioctl(3, _IOC(_IOC_READ|_IOC_WRITE, 0x4d, 0x1, 0x194), 0x175e140) = 0
[pid 2447557] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
[pid 2447557] +++ killed by SIGSEGV +++
<... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV}], 0, NULL) = 2447557
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=2447557, si_uid=0, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 2447557
write(2, "Segmentation fault\n", 19Segmentation fault
)    = 19
wait4(-1, 0x7ffc4cf9ce2c, WNOHANG, NULL) = -1 ECHILD (Nessun processo figlio)
read(10, "", 8192)                      = 0
exit_group(139)                         = ?
+++ exited with 139 +++
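The final exit status of 139 is how the shell reports a child killed by a signal: 128 plus the signal number, and SIGSEGV is signal 11 on Linux. A small sketch for decoding such a status in a wrapper script (`describe_status` is a hypothetical name):

```shell
# Describe a shell exit status: values above 128 conventionally mean the
# command was terminated by signal (status - 128).
describe_status() {
    if [ "$1" -gt 128 ]; then
        echo "killed by signal $(( $1 - 128 ))"
    else
        echo "exited with status $1"
    fi
}

# Usage (not run automatically):
#   /usr/sbin/megacli -AdpBbuCmd -GetBbuStatus -a0 -NoLog >/dev/null 2>&1
#   describe_status "$?"
```

A wrapper could use this to tell a crash of the megacli binary apart from a normal non-zero exit.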

I am unable to get any more details or a fix.

Bye, Giuseppe

eppesuig avatar Mar 08 '24 10:03 eppesuig