Add parser for mdadm
Linux software RAID is usually implemented with the md kernel module and managed with the mdadm userspace tool. Unfortunately, mdadm doesn't output its data in a format convenient for further automated processing, such as JSON.
For example, when using mdadm --query --detail /dev/md0, the output looks like this:
/dev/md0:
Version : 1.1
Creation Time : Tue Apr 13 23:22:16 2010
Raid Level : raid1
Array Size : 5860520828 (5.46 TiB 6.00 TB)
Used Dev Size : 5860520828 (5.46 TiB 6.00 TB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Jul 26 20:16:31 2022
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Consistency Policy : bitmap
Name : virttest:0
UUID : 85c5b164:d58a5ada:14f5fe07:d642e843
Events : 2193679
Number  Major  Minor  RaidDevice  State
     3      8     17           0  active sync   /dev/sdb1
     2      8     33           1  active sync   /dev/sdc1
mdadm also has the "examine" command, which gives information about an md RAID member device:
# mdadm --examine -E /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 1.1
Feature Map : 0x1
Array UUID : 85c5b164:d58a5ada:14f5fe07:d642e843
Name : virttest:0
Creation Time : Tue Apr 13 23:22:16 2010
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 11721041656 sectors (5.46 TiB 6.00 TB)
Array Size : 5860520828 KiB (5.46 TiB 6.00 TB)
Data Offset : 264 sectors
Super Offset : 0 sectors
Unused Space : before=80 sectors, after=0 sectors
State : clean
Device UUID : 813162e5:2e865efe:02ba5570:7003165c
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Jul 26 20:16:31 2022
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : f141a577 - correct
Events : 2193679
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
Having a parser for this in jc would be really helpful when dealing with md raid in scripts.
Thank you for the parser suggestion! Yeah, it probably makes sense to have a dedicated parser for this. In the meantime you could try the Key/Value parser (jc --kv). It won't be too fancy, but it might get the job done in the short term.
Edit: The kv parser will hiccup on the indented fields because it looks for indentation to consolidate lines (like a long text string). You'd have to first remove all initial spaces before running the output through jc for a better result.
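For example, something along these lines might work as a stop-gap (untested, and the exact sed expression may need tweaking):

# mdadm --query --detail /dev/md0 | sed 's/^ *//' | jc --kv -p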
Thanks for the quick reaction and for suggesting jc --kv.
Unfortunately, the most important information is often in the tabular section at the end:
Number  Major  Minor  RaidDevice  State
     3      8     17           0  active sync   /dev/sdb1
     2      8     33           1  active sync   /dev/sdc1
This is because it tells you which devices are part of the md array and which state they are in. I don't think kv will help much with this...
Could you take a look at this library to see if the output works for you? If so, I can vendor this library into jc instead of writing a parser from scratch:
https://github.com/truveris/py-mdstat
Ah, nevermind - that's mdstat, not mdadm. :)
Yeah, I have seen this library before opening this issue. It parses /proc/mdstat. mdstat gives some of the necessary data, but not all; for example, the UUIDs are missing. It also only gives information about currently active RAID devices, so it does not cover the --examine mode that can query devices that are offline.
If you seriously consider developing an mdadm parser, I could lend some help creating example output for different RAID scenarios and states.
If you could add more samples, that would be great. I will probably release a new version of jc in the next couple of weeks and I'd like to get this one in there. The samples are sometimes the hardest part to get, so that would be very helpful. Thanks!
This sounds good. I will create a variety of sample output files for different states and RAID levels in the next couple of days and upload them here.
Doing some more research on this command. I don't have a raid setup so it's a little difficult for me to check out all of the output possibilities. I noticed this in the man page:
-Y, --export
When used with --detail, --detail-platform, --examine, or
--incremental output will be formatted as key=value pairs
for easy import into the environment.
With --incremental The value MD_STARTED indicates whether
an array was started (yes) or not, which may include a
reason (unsafe, nothing, no). Also the value MD_FOREIGN
indicates if the array is expected on this host (no), or
seems to be from elsewhere (yes).
Can you show me what the --export output looks like? I'm thinking it could be used with the K/V or env parser. I'm not sure if it still includes the table output, though.
If there is still valuable data that is not exported in a machine-friendly format, it doesn't look like it would be too difficult to write this parser. I just need more samples so I can figure out the schema. For example, for most commands jc will output an array of objects. It looks like this command will only output a single item, so maybe single-object output will be better. I don't have a system to test and see if there are scenarios where multiple devices can be examined/queried at the same time with this command.
I was just getting started collecting example output for you, beginning with raid0.
Unfortunately, the --export output is quite a letdown; it is missing most of the information. Here is an example of the same array:
# mdadm -Q --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sun Aug 7 20:56:52 2022
Raid Level : raid0
Array Size : 200704 (196.00 MiB 205.52 MB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Update Time : Sun Aug 7 20:56:52 2022
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : -unknown-
Chunk Size : 512K
Consistency Policy : none
Name : sysrescue:0 (local to host sysrescue)
UUID : 7e81a856:abb9c1c2:4b71237a:9778cc66
Events : 0
Number  Major  Minor  RaidDevice  State
     0    254      1           0  active sync   /dev/vda1
     1    254      2           1  active sync   /dev/vda2
and this is what --export gives you:
# mdadm -Q --detail --export /dev/md0
MD_LEVEL=raid0
MD_DEVICES=2
MD_METADATA=1.2
MD_UUID=7e81a856:abb9c1c2:4b71237a:9778cc66
MD_NAME=sysrescue:0
MD_DEVICE_dev_vda2_ROLE=1
MD_DEVICE_dev_vda2_DEV=/dev/vda2
MD_DEVICE_dev_vda1_ROLE=0
MD_DEVICE_dev_vda1_DEV=/dev/vda1
So I think we can forget using the --export mode.
No problem, here is what I have so far in dev:
$ cat mdadm-query-detail.out | jc --mdadm -p
{
  "device": "/dev/md0",
  "version": "1.1",
  "creation_time": "Tue Apr 13 23:22:16 2010",
  "raid_level": "raid1",
  "array_size": "5860520828 (5.46 TiB 6.00 TB)",
  "used_dev_size": "5860520828 (5.46 TiB 6.00 TB)",
  "raid_devices": 2,
  "total_devices": 2,
  "persistence": "Superblock is persistent",
  "intent_bitmap": "Internal",
  "update_time": "Tue Jul 26 20:16:31 2022",
  "state": "clean",
  "active_devices": 2,
  "working_devices": 2,
  "failed_devices": 0,
  "spare_devices": 0,
  "consistency_policy": "bitmap",
  "name": "virttest:0",
  "uuid": "85c5b164:d58a5ada:14f5fe07:d642e843",
  "events": 2193679,
  "device_table": [
    {
      "number": 3,
      "major": 8,
      "minor": 17,
      "state": [
        "active",
        "sync"
      ],
      "device": "/dev/sdb1",
      "raid_device": 0
    },
    {
      "number": 2,
      "major": 8,
      "minor": 33,
      "state": [
        "active",
        "sync"
      ],
      "device": "/dev/sdc1",
      "raid_device": 1
    }
  ],
  "array_size_num": 5860520828,
  "used_dev_size_num": 5860520828
}
$ cat mdadm-examine.out | jc --mdadm -p
{
  "device": "/dev/sdb1",
  "magic": "a92b4efc",
  "version": "1.1",
  "feature_map": "0x1",
  "array_uuid": "85c5b164:d58a5ada:14f5fe07:d642e843",
  "name": "virttest:0",
  "creation_time": "Tue Apr 13 23:22:16 2010",
  "raid_level": "raid1",
  "raid_devices": 2,
  "avail_dev_size": "11721041656 sectors (5.46 TiB 6.00 TB)",
  "array_size": "5860520828 KiB (5.46 TiB 6.00 TB)",
  "data_offset": 264,
  "super_offset": 0,
  "unused_space": "before=80 sectors, after=0 sectors",
  "state": "clean",
  "device_uuid": "813162e5:2e865efe:02ba5570:7003165c",
  "internal_bitmap": "8 sectors from superblock",
  "update_time": "Tue Jul 26 20:16:31 2022",
  "bad_block_log": "512 entries available at offset 72 sectors",
  "checksum": "f141a577 - correct",
  "events": 2193679,
  "device_role": "Active device 0",
  "array_state": "AA ('A' == active, '.' == missing, 'R' == replacing)",
  "array_size_num": 5860520828,
  "avail_dev_size_num": 11721041656,
  "unused_space_before": 80,
  "unused_space_after": 0,
  "array_state_list": [
    "active",
    "active"
  ]
}
These are from the two original samples. Let me know if that looks workable for you.
Thanks!
Here are many different states for raid0 and raid1: raid0-raid1.zip
There are some features of md RAID I'm not familiar with, for example containers and clusters. The different RAID levels like 4, 5, and 6 are also still missing, so there is still a lot to do.
But I think with the attached test cases you can make some progress.
"state": "clean",
As you can see in my examples, there are several states that can be active at once: State : clean, degraded, recovering. So this will probably have to be a list.
When looking at your device table:
"device_table": [
{
"number": 3,
[...]
},
{
"number": 2,
[...]
}
],
It is a list of dicts. Isn't that quite a complicated structure? To me it looks like the devices always have a "number", but having a "raid_device" is optional and depends on if a device is in use or spare. So wouldn't it be better to use a dict with the "number" as index?
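Just to illustrate what I mean, a rough post-processing sketch (hypothetical, not a change to jc itself; the function name is made up):

def index_by_number(device_table):
    """Re-key the device_table list-of-dicts by each entry's 'number' field;
    entries without a number are collected separately."""
    by_number = {}
    unnumbered = []
    for dev in device_table:
        number = dev.get("number")
        if number is None:
            unnumbered.append(dev)
        else:
            by_number[number] = dev
    return by_number, unnumbered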
When looking at a device in the table:
{
  "number": 3,
  "major": 8,
  "minor": 17,
  "state": [
    "active",
    "sync"
  ],
  "device": "/dev/sdb1",
  "raid_device": 0
},
Unfortunately, mdadm mixes states and flags in its output. See my examples with failfast and write-mostly; those are flags and not states. I'm not sure if the states and flags can be properly divided from the output, though.
I found some more examples in this bugzilla report where users are complaining that the output format is inconsistent: https://bugzilla.redhat.com/show_bug.cgi?id=1380034
I don't have a raid setup so it's a little difficult for me to check out all of the output possibilities.
Just an idea if you want to try it for yourself: use a virtual machine and add a second, empty virtual disk to it. Then you can use that to play around with the different setups. This is what I have been doing to create the output examples.
I've been using https://www.system-rescue.org/ as the Linux in the virtual machine, because you don't have to install anything; just add the ISO image and it boots.
Thanks! Working through these samples. I have an initial version in dev that you can play with. I'll take a look at some of your feedback as well.
https://github.com/kellyjonbrazil/jc/blob/dev/jc/parsers/mdadm.py (you can put that in your plugin directory to test)
Ok, I think I have addressed the outstanding issues. Could you test with the dev parser linked above?
as you can see in my examples, there are several states that can be active at once: State : clean, degraded, recovering. So this will probably have to be a list.
Yep, added a state_list field.
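So for your example above, the output should end up looking something like this (illustrative, not verbatim parser output):

"state": "clean, degraded, recovering",
"state_list": [
  "clean",
  "degraded",
  "recovering"
]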
To me it looks like the devices always have a "number", but having a "raid_device" is optional and depends on if a device is in use or spare. So wouldn't it be better to use a dict with the "number" as index?
I don't think so, as I noticed that sometimes even number is null (e.g. query-raid1-faulty).
Unfortunately, mdadm mixes states and flags in its output. See my examples with failfast and write-mostly; those are flags and not states. I'm not sure if the states and flags can be properly divided from the output, though.
Yes, there are a few inconsistencies with the output. I think it makes the most sense to just append the flags to the device state in the table (as I have) since it seems impossible to tell which one is a state vs. flag.
Let me know if you run into any issues. Thanks!
Sorry for taking so long to respond, I've been busy with other stuff.
Thank you very much for your work, I think this is mostly complete.
Here are some small things I found:
I think there is one issue with the "Chunk Size", available in RAID 0 and RAID 5. The size reported is 512K, but the JSON contains 512.
The "Events" field in query mode for 0.9 metadata contains values like "0.13", which is converted to 0 in the JSON. I think the events were encoded as major.minor event counts in the 0.9 metadata format. I suggest detecting these and outputting them as floats, while keeping integers for everything else.
It would be nice to get the hostname the array was created for (homehost). For 0.9 metadata this is reported after the UUID field; for newer metadata it is reported after the "Name" field. Sometimes, when using the --homehost option, the hostname is not shown at all. I love their consistency... Could you try to extract the hostname from either of the fields and output it as a separate field when it is available?
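Something like this might be enough to pull it out when it is present (a rough sketch; extract_homehost is just an illustrative name):

import re

def extract_homehost(field_value):
    """Return the homehost from a value like
    'sysrescue:0 (local to host sysrescue)', or None if it is absent."""
    match = re.search(r'\(local to host ([^)\s]+)\)', field_value)
    return match.group(1) if match else None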
I have done some more testing and created some containers and RAID 5 arrays: raid5-container.zip. The containers show some more inconsistent output, especially in the --examine mode. But I don't think containers play an important role in practice; most users will use the regular formats. So I would not invest too much time into getting everything out of the output.
Thanks for the feedback! I'm working through these, but just a quick update on a couple things:
I think there is one issue with the "Chunk Size", available in RAID 0 and RAID 5. The size reported is 512K, but the JSON contains 512.
No problem - fixing this by keeping the original field a string and adding a chunk_size_num field that only contains the integer.
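Roughly something like this (simplified; the actual parser code may differ):

import re

def chunk_size_num(chunk_size):
    """'512K' -> 512; the original string stays in the chunk_size field."""
    match = re.match(r'(\d+)', chunk_size)
    return int(match.group(1)) if match else None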
The "Events" field in query mode for 0.9 metadata contains values like "0.13", which is converted to 0 in the json. I think the events were encoded into major.minor event counts in the 0.9 metadata format. I suggest to detect these and output them as floats, while keeping integers for everything else.
I think I will need to parse this out a bit more, then. The problem with converting this to a float is that numbers like 0.10 will turn into 0.1. I'll probably detect whether there is a dot and then separate out the integers into their own fields, or something like that.
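Something in this direction (illustrative only; the final field names may differ):

def convert_events(value):
    """Keep the original string and, for 0.9-metadata values like '0.13',
    split the major/minor counters into separate integer fields so that
    '0.10' is not silently turned into 0.1."""
    result = {"events": value}
    if "." in value:
        major, minor = value.split(".", 1)
        result["events_major"] = int(major)
        result["events_minor"] = int(minor)
    else:
        result["events_num"] = int(value)
    return result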
Ok - the latest version in dev is ready to test. I was able to incorporate many of your ideas, pull out a few more fields, and fix some of the tables in the new examples. I didn't bother converting fields for indexed key-names, like unit[0]. If you find any important ones, let me know and I can build a quick function to grab some of those. Thanks for testing!
I tested your latest version. It is looking very good to me now; it gets all the values I care about and some more.
Thank you for implementing this parser.
Released in v1.21.0, available via pip. Binaries coming soon.