ndctl create-namespace error
Problem
ndctl create-namespace does not work on kernel 4.8.12. I ran:
sudo ndctl create-namespace -m devdax -e namespace1.0 -s 4295000000 -f -v
Then, I get an error as follows.
setup_namespace:417: dax1.0: set_align failed: Invalid argument
failed to reconfigure namespace: Invalid argument
Env
ndctl list
[
{
"dev":"namespace1.0",
"mode":"fsdax",
"map":"mem",
"size":4294967296,
"sector_size":512,
"blockdev":"pmem1"
}
]
ndctl --version
67
@glock42
The problem is with the -s 4295000000 argument. Note, ndctl uses base 1024 not base 1000. ndctl also supports using G(igabytes), M(egabytes), and K(ilobytes) for the --size option to make things easier.
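To see why 4295000000 is not the same as 4G, here is a minimal sketch of the base-1024 conversion. The to_bytes function is a hypothetical helper written for this illustration, not part of ndctl, and it only handles the K, M, and G suffixes mentioned above.

```shell
#!/usr/bin/env bash
# Convert a size such as 4G, 512M or 64K to bytes using base 1024.
# to_bytes is a hypothetical helper, not an ndctl command.
to_bytes() {
    local value=$1
    local number=${value%[KMG]}      # strip one trailing unit letter, if any
    local unit=${value#"$number"}    # the stripped letter ("" for plain bytes)
    case "$unit" in
        K)  echo $(( number * 1024 )) ;;
        M)  echo $(( number * 1024 * 1024 )) ;;
        G)  echo $(( number * 1024 * 1024 * 1024 )) ;;
        "") echo "$number" ;;
    esac
}

four_gib=$(to_bytes 4G)
echo "4G = $four_gib bytes"                                   # 4G = 4294967296 bytes
echo "4295000000 exceeds 4G by $(( 4295000000 - four_gib ))"  # ... by 32704
```

The 4294967296 result matches the size reported for namespace1.0 in the ndctl list output above, so --size=4G expresses the intended capacity without hand-typed byte counts.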
I'm using ndctl v67 on Linux kernel 5.3, where the messaging is much better. For example:
# sudo ndctl create-namespace -m devdax -e namespace1.0 -s 4295000000 -f
Error: '--size=' must align to interleave-width: 6 and alignment: 2097152
did you intend --size=4303355904?
failed to reconfigure namespace: Invalid argument
Do you see anything like the above?
I have 6 DCPMMs per CPU socket installed in my system, hence the interleave-width: 6. Device-dax namespaces are 2MB aligned by default. ndctl will perform the correct alignment calculation and provide the correct size. In my case, it's saying the size should be 4303355904. Using the calculated size provides a successful reconfiguration:
# sudo ndctl create-namespace -m devdax -e namespace1.0 -s 4303355904 -f
{
"dev":"namespace1.0",
"mode":"devdax",
"map":"dev",
"size":"3.94 GiB (4.23 GB)",
"uuid":"59f34564-000a-419b-b0ea-3a823d13c2f7",
"daxregion":{
"id":1,
"size":"3.94 GiB (4.23 GB)",
"align":2097152,
"devices":[
{
"chardev":"dax1.0",
"size":"3.94 GiB (4.23 GB)",
"target_node":3,
"mode":"devdax"
}
]
},
"align":2097152
}
Assuming you also have 6 DCPMMs, the above size value should work for you also.
HTH Steve
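The suggested size in the error message can be reproduced by rounding the requested size up to the next multiple of interleave-width times alignment. The sketch below assumes that this is the rule; align_size is a hypothetical helper inferred from the numbers in the message, not code taken from ndctl.

```shell
#!/usr/bin/env bash
# Round a requested size up to the next multiple of
# interleave_width * alignment. align_size is a hypothetical helper
# that mirrors the arithmetic implied by the ndctl message; it is not
# taken from the ndctl source.
align_size() {
    local size=$1 width=$2 align=$3
    local unit=$(( width * align ))
    echo $(( (size + unit - 1) / unit * unit ))
}

# 6 interleaved DIMMs, 2 MiB (2097152-byte) device-dax alignment
align_size 4295000000 6 2097152   # prints 4303355904
```

Because the result depends on the interleave width, a value computed for a 6-DIMM region is not necessarily valid on a region with a different number of DIMMs.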
@sscargal Thanks for your reply.
I tried your size, but it still does not work.
sudo ndctl create-namespace -m devdax -e namespace1.0 -s 4303355904 -f
Error: '--size=' must align to interleave-width: 1 and alignment: 200000
did you intend --size=4303400000?
failed to reconfigure namespace: Invalid argument
When I change the size to 4303400000, it still fails, and /dev/pmem1 disappears.
sudo ndctl create-namespace -m devdax -e namespace1.0 -s 4303400000 -f
failed to reconfigure namespace: Invalid argument
The v4.8.12 kernel has been end-of-lifed for over 3 years. Please move to a kernel that is under active development as listed on kernel.org.
@djbw Thanks for your reply.
I do not think updating the kernel is a good option, because we need to use the same kernel for our evaluation tests. Is there another way to solve this problem? Thanks!
@glock42 You appear to have just one NVDIMM in this region (interleave-width: 1) which is why my --size value did not work for you (I have 6 NVDIMMs).
There are two possible reasons why /dev/pmem1 disappeared:
- You are converting the fsdax namespace (/dev/pmem1) to devdax, which uses the /dev/dax1.0 naming convention. Do you see a /dev/dax1.0 entry?
- The namespace could be disabled since the operation failed. You can check for disabled namespaces using ndctl list -iN, and you should see namespace1.0 in the list.
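The first of the two checks above can be scripted as a small test for the character device. This is only a sketch; report_dax_device is a hypothetical helper name, and the device path is passed in rather than assumed.

```shell
#!/usr/bin/env bash
# report_dax_device is a hypothetical helper: it reports whether the
# given path exists as a character device, which is how a device-dax
# namespace such as /dev/dax1.0 shows up.
report_dax_device() {
    local path=$1
    if [ -c "$path" ]; then
        echo "$path: present"
    else
        echo "$path: missing"
    fi
}
```

Running report_dax_device /dev/dax1.0 followed by ndctl list -iN covers both cases: -N lists namespaces and -i includes idle (disabled) ones, so a namespace that failed to enable still appears with "state":"disabled".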
At this point, I would simply destroy it:
sudo ndctl destroy-namespace -f namespace1.0
Then, I would let ndctl do the heavy lifting for you using:
sudo ndctl create-namespace --region=region1 --mode=devdax --size=4G
While it is possible to calculate the actual size in bytes, things change depending on the kernel version, label space version, and namespace type, so it's best to let ndctl and the kernel handle this.
If the above create-namespace command also fails, try changing the size to --size=5G or --size=6G to accommodate the metadata. With a single NVDIMM, the size calculation is much simpler. You can read more about capacity considerations in the docs.
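The destroy-then-recreate sequence, including the fallback sizes, could be wrapped as follows. This is a sketch under assumptions: recreate_devdax and the NDCTL override variable are hypothetical names introduced here, and only the two ndctl commands quoted above are used. Destroying a namespace discards its contents.

```shell
#!/usr/bin/env bash
# recreate_devdax is a hypothetical wrapper around the two ndctl
# commands quoted above. NDCTL can be overridden (for example with
# "echo") to preview the commands without touching any namespace.
NDCTL=${NDCTL:-ndctl}

recreate_devdax() {
    local namespace=$1 region=$2
    shift 2
    # Destroy the half-converted namespace first; -f forces the
    # operation even if the namespace is still active. Data is lost.
    $NDCTL destroy-namespace -f "$namespace" || return 1
    local size
    # Try each requested size in order and stop at the first success.
    for size in "$@"; do
        if $NDCTL create-namespace --region="$region" --mode=devdax --size="$size"; then
            echo "created devdax namespace with --size=$size"
            return 0
        fi
    done
    echo "no size succeeded: $*" >&2
    return 1
}
```

Run as root, a call such as recreate_devdax namespace1.0 region1 4G 5G 6G tries the sizes in the order suggested above. Setting NDCTL=echo first prints the commands instead of executing them.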
@sscargal
- I didn't see the /dev/dax1.0 entry.
- I ran ndctl list -iN. The mode of namespace1.0 seems to have become devdax, but the state is disabled:
$: ndctl list -iN
[
{
"dev":"namespace1.0",
"mode":"devdax",
"map":"dev",
"uuid":"1e118e37-d966-4408-ac4a-9c64f8f442e9",
"state":"disabled"
},
{
"dev":"namespace0.0",
"mode":"devdax",
"map":"dev",
"uuid":"d081a928-6fe0-4aea-8f4a-8a8f0367b228",
"state":"disabled"
}
]
- After I destroyed namespace1.0 and ran sudo ndctl create-namespace --region=region1 --mode=devdax --size=4G, the machine hung on this command.
You need to move to, or at least test with, an updated kernel to find out which fixes you need to backport to your v4.8.12 baseline. At an absolute minimum, you should move to v4.9, which is still receiving security updates. A kernel that has not seen security updates for multiple years likely exposes you to more than one unpatched security flaw. There was also a major overhaul of namespace management in v4.9 that you are missing. Below are all the updates in v4.9-stable that are not in v4.8. Otherwise, I do not have the bandwidth to help you debug a v4.8.12 kernel.
7839be200a1c libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
ef27496f4165 libnvdimm/namespace: Fix label tracking error
4d8fc7d8d7ed libnvdimm/btt: Fix a kmemdup failure check
93cc83319f9a libnvdimm/namespace: Fix a potential NULL pointer dereference
f45c6c3affb5 libnvdimm: Fix altmap reservation size calculation
9f98f270a5a0 libnvdimm/pmem: Honor force_raw for legacy pmem regions
446553287d17 libnvdimm/label: Clear 'updating' flag after label-set update
3e63a7f25cc8 libnvdimm: Hold reference on parent while scheduling async init
05a085c7c501 libnvdimm: fix ars_status output length calculation
f216d1e9339d linvdimm, pmem: Preserve read-only setting for pmem devices
a1ada79437d7 libnvdimm, namespace: use a safe lookup for dimm device name
6fa877d2aca8 libnvdimm, {btt, blk}: do integrity setup before add_disk()
807e3365895c libnvdimm, dax: fix 1GB-aligned namespaces vs physical misalignment
be38759eb2d6 device-dax: implement ->split() to catch invalid munmap attempts
29c969c3031b libnvdimm: fix integer overflow static analysis warning
c9ca9d9d9b79 libnvdimm, btt: Fix an incompatibility in the log layout
423716cf2815 libnvdimm, pfn: fix start_pad handling for aligned namespaces
194eb4a4fcd2 libnvdimm, namespace: make 'resource' attribute only readable by root
6e83c891b68a libnvdimm, namespace: fix label initialization to use valid seq numbers
2224973f18dc libnvdimm, pfn: make 'resource' attribute only readable by root
a3ff46097a1d device-dax: fix sysfs duplicate warnings
0fa705dc61ee libnvdimm: fix badblock range handling of ARS range
891c31e16cb7 libnvdimm, btt: fix btt_rw_page not returning errors
8eaaf66d41ad pmem: return EIO on read_pmem() failure
fa313fd6673e libnvdimm: fix clear length of nvdimm_forget_poison()
1a1029507258 libnvdimm, pfn: fix 'npfns' vs section alignment
c171b24fe508 libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify
5b6e7f353290 libnvdimm, region: fix flush hint detection crash
8dd114ef78c8 device-dax: fix cdev leak
c36eaa6ca346 device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation
5ac50e714f60 libnvdimm: fix reconfig_mutex, mmap_sem, and jbd2_handle lockdep splat
5f377c4ad271 libnvdimm: fix blk free space accounting
eae72468c45d device-dax: fix pmd/pte fault fallback handling
9ad1571da2c0 nfit, libnvdimm: fix interleave set cookie calculation
cd755677d944 libnvdimm, pfn: fix memmap reservation size versus 4K alignment
ebffa7bc77c8 libnvdimm, namespace: do not delete namespace-id 0
3c4d83a1a41e libnvdimm, namespace: fix pmem namespace leak, delete when size set to zero
630a2ef354bb libnvdimm, pfn: fix align attribute
325896ffdf90 device-dax: fix private mapping restriction, permit read-only
efda1b5d87cb acpi, nfit, libnvdimm: fix / harden ars_status output length handling
4cb19355ea19 device-dax: fail all private mapping attempts
6a84fb4b4e43 device-dax: check devm_nsio_enable() return value
52e73eb2872c device-dax: fix percpu_ref_exit ordering
867dfe342118 nvdimm: make CONFIG_NVDIMM_DAX 'bool'
3115bb02b5c2 pmem: report error on clear poison failure
75d29713b792 libnvdimm, namespace: potential NULL deref on allocation error
4e65e9381c7a /dev/dax: fix Kconfig dependency build breakage
bc0a0fe94f33 dax: use correct dev_t value
d76911ee933a dax: convert devm_create_dax_dev to PTR_ERR
98a29c39dc68 libnvdimm, namespace: allow creation of multiple pmem-namespaces per region
991d9020f3e0 libnvdimm, namespace: lift single pmem limit in scan_labels()
c969e24c1b69 libnvdimm, namespace: filter out of range labels in scan_labels()
762d067dbad5 libnvdimm, namespace: enable allocation of multiple pmem namespaces
16660eaea0cc libnvdimm, namespace: update label implementation for multi-pmem
012207334a26 libnvdimm, namespace: expand pmem device naming scheme for multi-pmem
a1f3e4d6a0c3 libnvdimm, region: update nd_region_available_dpa() for multi-pmem support
6ff3e912d32e libnvdimm, namespace: sort namespaces by dpa at init
0e3b0d123c8f libnvdimm, namespace: allow multiple pmem-namespaces per region at scan time
8a5f50d3b7f2 libnvdimm, namespace: unify blk and pmem label scanning
f95b4bca9e7d libnvdimm, namespace: refactor uuid_show() into a namespace_to_uuid() helper
ae8219f186d8 libnvdimm, label: convert label tracking to a linked list
44c462eb9e19 libnvdimm, region: move region-mapping input-paramters to nd_mapping_desc
db58028ee4e3 nvdimm: reduce duplicated wpq flushes
e046114af5fc libnvdimm: clear the internal poison_list when clearing badblocks
bd697a80c329 pmem: reduce kmap_atomic sections to the memcpys only
a0056afe21fd nvdimm: remove duplicate nd_mapping declaration
4765218db795 libnvdimm, namespace: debug invalid interleave-set-cookie values
aee659874833 libnvdimm: Fix nvdimm_probe error on NVDIMM-N
ae551e9ca289 nvdimm: Spelling s/unacknoweldged/unacknowledged/
ba9c8dd3c222 acpi, nfit: add dimm device notification support
9d2d01a031a9 dax: check resource alignment at dax region/device create
9dc1e4927bfa dax: unmap/truncate on device shutdown
3bc52c45bac2 dax: define a unified inode/address_space for device-dax mappings
ba09c01d2fa8 dax: convert to the cdev api
ebd84d724c85 dax: embed a struct device in dax_dev
af69f51e506f dax: rename fops from dax_dev_ to dax_
043a9255021b dax: reorder dax_fops function definitions
ccdb07f62986 dax: cleanup needlessly global symbol warnings
Btw, the same kernel version works on my local machine, but it does not work on cloudlab. I debugged the ndctl source code on cloudlab and found that ndctl_pfn->module is 0x0 (NULL) when ndctl executes ndctl_dax_enable.
@glock42 What Cloudlab are you referring to?
The hang you encountered is likely caused by an old kernel, old firmware, or both. Hangs are difficult to debug without a crash dump analysis. We did have a few hang issues in older kernels and firmware that have since been resolved. That was the point @djbw was trying to make. I understand there are situations where you cannot upgrade, but knowing the issues are resolved means it's up to you and your distro to identify and backport the fixes. Work is done on the Linux mainline kernel only, and we leave backports to the distros. If you can at least test with a newer kernel to verify the issue is no longer present, that'll help a lot. We currently recommend Linux kernel 4.19 or later, as it contains performance and RAS fixes that PMDK uses.
Since it works locally, I would verify the persistent memory firmware and BIOS releases are current on the cloudlab system. You may find current BIOS and DCPMM Firmware resolves your issue.
Assuming you're using Intel Optane DC Persistent Memory, ipmctl show -dimm will return the firmware of the DIMMs and dmidecode -t bios will return the BIOS version. Each server OEM provides its own BIOS and also provides a validated DCPMM firmware, usually in the same bundle.
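Since the same kernel behaves differently on two machines, it can help to capture the relevant versions on each and compare them. The following is a minimal sketch: collect_env and run_if_present are hypothetical helper names, and the script only calls uname -r plus the ndctl, ipmctl, and dmidecode commands already mentioned in this thread.

```shell
#!/usr/bin/env bash
# collect_env is a hypothetical helper that prints the versions worth
# comparing between a working and a failing machine. Tools that are not
# installed are reported instead of aborting the script.
run_if_present() {
    local label=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $label"
        "$@" 2>&1 || echo "($1 failed; it may need root)"
    else
        echo "== $label: $1 not installed"
    fi
}

collect_env() {
    run_if_present "kernel" uname -r
    run_if_present "ndctl version" ndctl --version
    run_if_present "DIMM firmware" ipmctl show -dimm
    run_if_present "BIOS" dmidecode -t bios
}
```

Running collect_env as root on both systems and diffing the two outputs shows at a glance whether the BIOS or DIMM firmware differs.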