mhvtl
mhvtl copied to clipboard
mhvtl volumes losing their serial number?
On our test and development systems, we use udev rules to identify the mhvtl volumes with a udev rule file like this:
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000100", SYMLINK+="abcd/mhvtl_lto-6_01"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000200", SYMLINK+="abcd/mhvtl_lto-6_02"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000300", SYMLINK+="abcd/mhvtl_lto-6_03"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000400", SYMLINK+="abcd/mhvtl_lto-6_04"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000500", SYMLINK+="abcd/mhvtl_lto-6_05"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000600", SYMLINK+="abcd/mhvtl_lto-6_06"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000700", SYMLINK+="abcd/mhvtl_lto-6_07"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000800", SYMLINK+="abcd/mhvtl_lto-6_08"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab000900", SYMLINK+="abcd/mhvtl_lto-6_09"
KERNEL=="st*", SUBSYSTEM=="scsi_tape", ENV{ID_SERIAL}=="350223344ab001000", SYMLINK+="abcd/mhvtl_lto-6_10"
where abcd = a short name for our project. After we run udevadm trigger
, we can list the tape volumes like this:
# ls -la /dev/abcd | grep mhvtl
However, I am finding that occasionally the mhvtl files listed under /dev/abcd are missing some entries, for example:
# ls -la /dev/abcd | grep mhvtl
lrwxrwxrwx 1 root root 6 Jun 25 03:23 mhvtl_lto-6_02 -> ../st8
lrwxrwxrwx 1 root root 6 Jun 25 03:22 mhvtl_lto-6_03 -> ../st4
lrwxrwxrwx 1 root root 6 Jun 25 03:23 mhvtl_lto-6_04 -> ../st6
lrwxrwxrwx 1 root root 6 Jun 25 03:22 mhvtl_lto-6_06 -> ../st5
lrwxrwxrwx 1 root root 6 Jun 25 03:22 mhvtl_lto-6_07 -> ../st2
lrwxrwxrwx 1 root root 6 Jun 25 03:22 mhvtl_lto-6_08 -> ../st3
lrwxrwxrwx 1 root root 6 Jun 25 03:23 mhvtl_lto-6_09 -> ../st1
lrwxrwxrwx 1 root root 6 Jun 25 03:23 mhvtl_lto-6_10 -> ../st0
Where are the links for mhvtl_lto-6_01 and mhvtl_lto-6_05?
# lsscsi |grep TD6
[2:0:1:0] tape IBM ULT3580-TD6 0107 /dev/st9
[2:0:2:0] tape IBM ULT3580-TD6 0107 /dev/st8
[2:0:3:0] tape IBM ULT3580-TD6 0107 /dev/st4
[2:0:4:0] tape IBM ULT3580-TD6 0107 /dev/st6
[2:0:5:0] tape IBM ULT3580-TD6 0107 /dev/st7
[2:0:6:0] tape IBM ULT3580-TD6 0107 /dev/st5
[2:0:7:0] tape IBM ULT3580-TD6 0107 /dev/st2
[2:0:8:0] tape IBM ULT3580-TD6 0107 /dev/st3
[2:0:9:0] tape IBM ULT3580-TD6 0107 /dev/st1
[2:0:10:0] tape IBM ULT3580-TD6 0107 /dev/st0
# for f in $(lsscsi |grep TD6 | awk '{print $NF}'); do
echo "Processing $f"; udevadm info --name=$f --query=all |grep "ID_SERIAL="
done
Processing /dev/st9
Processing /dev/st8
E: ID_SERIAL=350223344ab000200
Processing /dev/st4
E: ID_SERIAL=350223344ab000300
Processing /dev/st6
E: ID_SERIAL=350223344ab000400
Processing /dev/st7
Processing /dev/st5
E: ID_SERIAL=350223344ab000600
Processing /dev/st2
E: ID_SERIAL=350223344ab000700
Processing /dev/st3
E: ID_SERIAL=350223344ab000800
Processing /dev/st1
E: ID_SERIAL=350223344ab000900
Processing /dev/st0
E: ID_SERIAL=350223344ab001000
Here is the output from running udevadm info
on /dev/st9 which was missing the ID_SERIAL value:
udevadm info ---name=/dev/st9 --query=all
P: /devices/mhvtl_pseudo/adapter0/host2/target2:0:1/2:0:1:0/scsi_tape/st9
N: st9
E: DEVNAME=/dev/st9
E: DEVPATH=/devices/mhvtl_pseudo/adapter0/host2/target2:0:1/2:0:1:0/scsi_tape/st9
E: ID_BUS=scsi
E: ID_SCSI=1
E: MAJOR=9
E: MINOR=9
E: SUBSYSTEM=scsi_tape
E: USEC_INITIALIZED=1615028020641
One without the missing value for ID_SERIAL is shown:
# udevadm info --name=/dev/st8 --query=all
P: /devices/mhvtl_pseudo/adapter0/host2/target2:0:2/2:0:2:0/scsi_tape/st8
N: st8
S: hpss/mhvtl_lto-6_02
S: tape/by-id/scsi-350223344ab000200
E: DEVLINKS=/dev/hpss/mhvtl_lto-6_02 /dev/tape/by-id/scsi-350223344ab000200
E: DEVNAME=/dev/st8
E: DEVPATH=/devices/mhvtl_pseudo/adapter0/host2/target2:0:2/2:0:2:0/scsi_tape/st8
E: ID_BUS=scsi
E: ID_MODEL=ULT3580-TD6
E: ID_MODEL_ENC=ULT3580-TD6\x20\x20\x20\x20\x20
E: ID_REVISION=0107
E: ID_SCSI=1
E: ID_SCSI_SERIAL=XYZZY_A2
E: ID_SERIAL=350223344ab000200
E: ID_SERIAL_SHORT=50223344ab000200
E: ID_TYPE=tape
E: ID_VENDOR=IBM
E: ID_VENDOR_ENC=IBM\x20\x20\x20\x20\x20
E: ID_WWN=0x50223344ab000200
E: ID_WWN_WITH_EXTENSION=0x50223344ab000200
E: MAJOR=9
E: MINOR=8
E: SUBSYSTEM=scsi_tape
E: USEC_INITIALIZED=1615027810565
Our application tries to access the tape volume with one of the udev rules (mhvtl_lto-6_01 or mhvtl_lto-6_05 in the above example.) When the udev rule is lost, our application has failures.
What causes the mhvtl scsi device to lose its information such as the serial number and the model number? Is this a bug in the mhvtl code or an issue somewhere else?
I looked at the /var/log/messages file and the last entries with st9 were:
Jun 25 03:23:02 lews kernel: st 2:0:1:0: Attached scsi tape st9
Jun 25 03:23:02 lews kernel: st 2:0:1:0: st9: try direct i/o: yes (alignment 4 B)
And here are the entries just before our application reported an issue:
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: ssc_tur(): Test Unit Ready (2922802) ** : No, No tape loaded
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: completeSCSICommand(): s/n: (2922802), sz: 0, sam_status: 2 [02 3a 00]
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: CDB (2922803) (delay 805): 12 00 00 00 60 00
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: spc_inquiry(): INQUIRY ** (2922803)
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: completeSCSICommand(): OP s/n: (2922803), sz: 64, sam_status: 0
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: CDB (2922804) (delay 405): 00 00 00 00 00 00
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: return_sense(): [Key/ASC/ASCQ] [02 3a 00]
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: ssc_tur(): Test Unit Ready (2922804) ** : No, No tape loaded
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: completeSCSICommand(): s/n: (2922804), sz: 0, sam_status: 2 [02 3a 00]
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: CDB (2922805) (delay 405): 4d 00 51 00 00 00 00 00 0c 00
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: ssc_log_sense(): LOG SENSE (2922805) ** : Unknown code: 0x11
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: return_sense(): [Key/ASC/ASCQ] [05 24 00] 0xc0 0002
Jun 25 03:28:57 lews /usr/bin/vtltape[4034858]: completeSCSICommand(): s/n: (2922805), sz: 0, sam_status: 2 [05 24 00]
Thank you, Peter
I got the mhvtl volumes for mhvtl_lto-6_01 and mhvtl_lto-6_05 to come back by running the udevadm trigger
command, but why do I need to do that?
I found the following in the dmesg
output shortly before out application reported an issue:
[2022-06-25 03:22:58] scsi 2:0:10:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:22:58] st 2:0:10:0: Attached scsi tape st0
[2022-06-25 03:22:58] st 2:0:10:0: st0: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:22:58] st 2:0:10:0: Attached scsi generic sg13 type 1
[2022-06-25 03:22:58] scsi 2:0:9:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:22:58] st 2:0:9:0: Attached scsi tape st1
[2022-06-25 03:22:58] st 2:0:9:0: st1: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:22:58] st 2:0:9:0: Attached scsi generic sg14 type 1
[2022-06-25 03:22:58] scsi 2:0:7:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:22:58] st 2:0:7:0: Attached scsi tape st2
[2022-06-25 03:22:58] st 2:0:7:0: st2: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:22:58] st 2:0:7:0: Attached scsi generic sg15 type 1
[2022-06-25 03:22:58] scsi 2:0:8:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:22:58] st 2:0:8:0: Attached scsi tape st3
[2022-06-25 03:22:58] st 2:0:8:0: st3: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:22:58] st 2:0:8:0: Attached scsi generic sg16 type 1
[2022-06-25 03:22:58] scsi 2:0:3:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:22:59] st 2:0:3:0: Attached scsi tape st4
[2022-06-25 03:22:59] st 2:0:3:0: st4: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:22:59] st 2:0:3:0: Attached scsi generic sg17 type 1
[2022-06-25 03:22:59] scsi 2:0:6:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:22:59] st 2:0:6:0: Attached scsi tape st5
[2022-06-25 03:22:59] st 2:0:6:0: st5: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:22:59] st 2:0:6:0: Attached scsi generic sg18 type 1
[2022-06-25 03:22:59] scsi 2:0:4:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:22:59] synth uevent: /devices/vio: failed to send uevent
[2022-06-25 03:22:59] st 2:0:4:0: Attached scsi tape st6
[2022-06-25 03:22:59] st 2:0:4:0: st6: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:22:59] synth uevent: /devices/vio/4000: failed to send uevent
[2022-06-25 03:22:59] vio 4000: uevent: failed to send synthetic uevent
[2022-06-25 03:22:59] synth uevent: /devices/vio/4001: failed to send uevent
[2022-06-25 03:22:59] vio 4001: uevent: failed to send synthetic uevent
[2022-06-25 03:22:59] synth uevent: /devices/vio/4002: failed to send uevent
[2022-06-25 03:22:59] vio 4002: uevent: failed to send synthetic uevent
[2022-06-25 03:22:59] synth uevent: /devices/vio/4004: failed to send uevent
[2022-06-25 03:22:59] vio 4004: uevent: failed to send synthetic uevent
[2022-06-25 03:22:59] st 2:0:4:0: Attached scsi generic sg19 type 1
[2022-06-25 03:23:00] scsi 2:0:5:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:23:01] st 2:0:5:0: Attached scsi tape st7
[2022-06-25 03:23:01] st 2:0:5:0: st7: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:23:01] st 2:0:5:0: Attached scsi generic sg20 type 1
[2022-06-25 03:23:01] scsi 2:0:2:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:23:01] st 2:0:2:0: Attached scsi tape st8
[2022-06-25 03:23:01] st 2:0:2:0: st8: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:23:01] st 2:0:2:0: Attached scsi generic sg21 type 1
[2022-06-25 03:23:01] scsi 2:0:1:0: Sequential-Access IBM ULT3580-TD6 0107 PQ: 0 ANSI: 5
[2022-06-25 03:23:02] st 2:0:1:0: Attached scsi tape st9
[2022-06-25 03:23:02] st 2:0:1:0: st9: try direct i/o: yes (alignment 4 B)
[2022-06-25 03:23:02] st 2:0:1:0: Attached scsi generic sg22 type 1
[2022-06-25 03:23:02] scsi 2:0:0:0: Medium Changer IBM 3584 0107 PQ: 0 ANSI: 5
[2022-06-25 03:23:02] ch 2:0:0:0: Power-on or device reset occurred
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] type #1 (mt): 0x2f0+1 [medium transport]
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] type #2 (st): 0x400+25 [storage]
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] type #3 (ie): 0x300+4 [import/export]
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] type #4 (dt): 0x1+10 [data transfer]
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x1:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x2:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x3:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x4:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x5:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x6:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x7:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x8:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0x9:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] dt 0xa:
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] ID/LUN unknown
[2022-06-25 03:23:02] ch 2:0:0:0: [ch0] INITIALIZE ELEMENT STATUS, may take some time ...
[2022-06-25 03:23:03] ch 2:0:0:0: [ch0] ... finished
[2022-06-25 03:23:03] ch 2:0:0:0: Attached scsi changer ch0
[2022-06-25 03:23:03] ch 2:0:0:0: Attached scsi generic sg23 type 8
[2022-06-25 03:28:55] st 2:0:10:0: Power-on or device reset occurred
[2022-06-25 03:28:55] st 2:0:9:0: Power-on or device reset occurred
[2022-06-25 03:28:56] st 2:0:8:0: Power-on or device reset occurred
[2022-06-25 03:28:56] st 2:0:7:0: Power-on or device reset occurred
[2022-06-25 03:28:56] st 2:0:6:0: Power-on or device reset occurred
Thanks for the bug report.
I'm pretty sure the issue is with mhVTL - the udev team would have fixed any discrepancies like this a long time ago.
I'm looking at it now and suspect it is a race condition where the logical unit is being created before the vtltape daemon is ready to actually process any SCSI commands - hence if the kernel issues 'inquiry' to the logical unit before it is fully setup, the responses are incomplete.
Hello Peter,
Just to confirm this is a suspect race condition at startup, are you able to run the following command to reload udev rule set if you see the above condition occur
udevadm control -R
udevadm trigger
Hi, Mark.
Yes, I ran the above and then saw the complete udev rules. I had these mhvtl volumes before running the above:
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_01 -> ../st0
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_03 -> ../st9
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_04 -> ../st5
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_05 -> ../st3
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_06 -> ../st8
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_07 -> ../st1
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_08 -> ../st7
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_09 -> ../st2
lrwxrwxrwx 1 root root 6 Jul 14 22:34 mhvtl_lto-6_10 -> ../st4
and then these afterward:
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_01 -> ../st0
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_02 -> ../st6
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_03 -> ../st9
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_04 -> ../st5
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_05 -> ../st3
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_06 -> ../st8
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_07 -> ../st1
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_08 -> ../st7
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_09 -> ../st2
lrwxrwxrwx 1 root root 6 Jul 14 23:53 mhvtl_lto-6_10 -> ../st4
So after doing these commands:
udevadm control -R
udevadm trigger
the link for mhvtl_lto-6_02
to st6
appeared. I looked in /var/log/messages and found this shortly after the reboot when the udev rules should have been run:
Jul 14 22:34:25 lews kernel: st 2:0:2:0: Attached scsi tape st6
Jul 14 22:34:25 lews kernel: st 2:0:2:0: st6: try direct i/o: yes (alignment 4 B)
Jul 14 22:34:25 lews kernel: st 2:0:2:0: Attached scsi generic sg19 type 1
Jul 14 22:34:25 lews /usr/bin/vtltape[2342]: add_lu(): Wrote 'add 12 0 2 0#012' (13 bytes) to /sys/bus/mhvtl/drivers/mhvtl/add_lu
Jul 14 22:34:25 lews /usr/bin/vtltape[2342]: add_lu(): Child anounces 'lu [0:2:0] created'.
Jul 14 22:34:25 lews /usr/bin/vtltape[2240]: main(): Cleaning up after add_lu child pid: 2342
I have been noticing this issue for several years but just got around to reporting it with this issue.
Thank you, Peter
Hi Peter, I must admit to having seen this issue from time-to-time (but not a user of custom udev rules as such - hasn't bothered me).
However, a whole heap of theories have been thrown out over the last couple of weeks and I have finally identified the cause. (systemd / udev in debug mode without any rate limiting)
Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(err) 'scsi_id: cannot open /dev/bsg/5:0:2:0: No such file or directory' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(out) 'ID_SCSI=1' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(out) 'ID_VENDOR=' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(out) 'ID_VENDOR_ENC=' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(out) 'ID_MODEL=' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(out) 'ID_MODEL_ENC=' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(out) 'ID_REVISION=' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0'(out) 'ID_TYPE=' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: Process 'scsi_id --whitelisted --export --device=/dev/bsg/5:0:2:0' succeeded. Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: IMPORT builtin 'path_id' /usr/lib/udev/rules.d/60-persistent-storage-tape.rules:25 Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: IMPORT builtin 'path_id' returned non-zero Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: handling device node '/dev/nst7', devnum=c9:135, mode=0660, uid=0, gid=33 Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: set permissions /dev/nst7, 020660, uid=0, gid=33 Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: creating symlink '/dev/char/9:135' to '../nst7' Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: seq 2848 processed Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19428]: seq 2855 running Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19428]: IMPORT builtin 'hwdb' /usr/lib/udev/rules.d/50-udev-default.rules:14 Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19428]: IMPORT builtin 'hwdb' returned non-zero Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19428]: seq 2855 processed Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19428]: seq 2856 running Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19428]: seq 2856 processed Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: seq 2858 running Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: handling device node '/dev/bsg/5:0:2:0', devnum=c247:16, mode=0600, uid=0, gid=0 Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: preserve permissions /dev/bsg/5:0:2:0, 020600, uid=0, gid=0 Jul 21 11:26:21 mhvtldev-localdomain systemd-udevd[19429]: creating symlink '/dev/char/247:16' to '../bsg/5:0:2:0'
i.e. the udev SCSI tape rule is attempting to access the passthrough path before udev has created the device.
Now to figure out a way to encourage udev to play nice and setup /dev/bsg/ paths before the persistent tape rules :)
Ideas and suggestions most welcome.
FWIW: 'udevadm trigger' once the /dev/bsg/* devices are created will re-create any missing paths by your udev rules.
Yeah, we use udevadm trigger
to recreate the udev paths and I think that nearly always puts back the missing paths.. However, I think some time later additional paths might be missing.
Looking into udev/systemd logs show 'scsi_id' being unable to open /dev/bsg/xxx ( hence unable to retrieve scsi parameters) because the persistent tape rules are run before the rules to create /dev/bsg/* and /dev/sg* get executed
FWIW: I recently 'yum update -y' and the udev issues I was seeing has gone away..
I do see that 'systemd-udev' package was updated (I didn't capture the version before hand).
systemd-udev-239-60.el8.x86_64
To debug udev rules:
Update /etc/udev/udev.conf udev_log="debug"
Don't start mhvtl on boot:
systemctl disable mhvtl.target
Restart the system:
In one terminal - monitor systemd via journalctl -f
In another terminal - Start mhVTL systemctl start mhvtl.target
Flip back to the 'journalctl -f' session to view the behaviour of udev.
If you are still encountering this issue, please capture all the 'journalctl -f' output and make it available.
Cheers
Mark
Hi, Mark.
I updated four of our test systems to have udev_log="debug"
in the /etc/udev/udev.conf
file. However, I don't understand why I have to disable mhvtl.target. The next time I notice that we don't have all the entries expected from the udev rules, cannot I look in the /var/log/messages
file to see what might have gone wrong and share any relevant output with you at that time?
Thank you, Peter
Even though udev is in debug, but it does not appear to make it to /var/log/messages
The journalctl -f
seems to be the only way to capture the info (not that I've gone looking too hard).
The disable start of mhvtl.target is just so a terminal journalctl -f
can be run before the start of mhvtl.target
BTW: What distro and version are you running ? I get no problems with 'openSUSE Leap' I'm just re-checking CentOS7 now...
We are using RHEL 8 but this problem also happens on RHEL 7 machines.
Even though udev is in debug, but it does not appear to make it to /var/log/messages The
journalctl -f
seems to be the only way to capture the info (not that I've gone looking too hard). The disable start of mhvtl.target is just so a terminaljournalctl -f
can be run before the start of mhvtl.target
Oh, I did not know that it would not make it to /var/log/messages. I was hoping to let mhvtl.target run automatically and then when the problem occurs take a look at the journalctl
command to see what happened. Won't that work also?
Most likely I try it out a little later