
ndctl keys not removed after ndctl sanitize-dimm nmem0 --overwrite

Open yizhanglinux opened this issue 2 years ago • 18 comments

Hello, I found that after a sanitize-dimm --overwrite operation the ndctl key still exists and is not removed. Is this by design? According to the man page, the key should be removed after a sanitize-dimm operation.

From man ndctl sanitize-dimm:

> Additionally, after completion of this command, the security and passphrase for the given NVDIMM will be disabled, and the passphrase and any key material will also be removed from the keyring and the ndctl keys directory at /etc/ndctl/keys

# ndctl setup-passphrase "$dev" -k user:"$masterkey"
passphrase enabled for 1 nmem.
# ndctl sanitize-dimm nmem0 --overwrite
overwrite issued for 1 nmem.
# ndctl list -Di
[
  {
    "dev":"nmem1",
    "id":"8089-a2-1833-00000510",
    "handle":257,
    "phys_id":32,
    "flag_failed_map":true,
    "security":"disabled"
  },
  {
    "dev":"nmem3",
    "id":"8089-a2-1833-00000497",
    "handle":4353,
    "phys_id":44,
    "security":"disabled"
  },
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"overwrite"
  },
  {
    "dev":"nmem2",
    "id":"8089-a2-1833-000004a9",
    "handle":4097,
    "phys_id":38,
    "security":"disabled"
  }
]
# ndctl wait-overwrite nmem0
# ls /etc/ndctl/keys/
keys.readme
nvdimm_8089-a2-1833-000004a3_intel-purley-04.khw1.lab.eng.bos.redhat.com.blob
nvdimm-master.blob

yizhanglinux avatar May 04 '23 04:05 yizhanglinux
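
A quick way to confirm whether a blob was left behind is to glob the keys directory for the DIMM's id. The sketch below assumes the `nvdimm_<id>_<hostname>.blob` naming visible in the listing above; the function name `leftover_key_blobs` is made up for illustration and is not part of ndctl.

```python
from pathlib import Path


def leftover_key_blobs(dimm_id, keys_dir="/etc/ndctl/keys"):
    """Return the sorted file names of key blobs for dimm_id in keys_dir.

    Assumes blobs are named nvdimm_<id>_<hostname>.blob, as in the listing
    above. The master key (nvdimm-master.blob) does not match this pattern.
    """
    directory = Path(keys_dir)
    if not directory.is_dir():
        # No keys directory means there is nothing left behind.
        return []
    return sorted(p.name for p in directory.glob(f"nvdimm_{dimm_id}_*.blob"))
```

For the DIMM in this report, `leftover_key_blobs("8089-a2-1833-000004a3")` would list the blob that survived the overwrite, and an empty list would mean the key file is gone.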

If I run sanitize-dimm without --overwrite, the key is removed.

# ndctl sanitize-dimm nmem0
sanitized 1 nmem.

# ls /etc/ndctl/keys/
keys.readme  nvdimm-master.blob

yizhanglinux avatar May 04 '23 04:05 yizhanglinux

https://lore.kernel.org/nvdimm/168357518158.2750073.1393407560977941832.stgit@djiang5-mobl3/

Can you please try this fix and see if that does the job? Thanks!

davejiang avatar May 08 '23 19:05 davejiang

> https://lore.kernel.org/nvdimm/168357518158.2750073.1393407560977941832.stgit@djiang5-mobl3/
>
> Can you please try this fix and see if that does the job? Thanks!

Hi Dave,

I tried your patch. After the ndctl sanitize-dimm nmem0 --overwrite operation[1], the overwrite was still issued to nmem0 even though the command reported a failure; I filed a separate issue for that[2]. The key was finally removed, but nmem0 stays in the "unlocked" state[3] and security cannot be disabled on it[4]. It now seems there is nothing I can do to disable the security. :(

[1]

# ./ndctl sanitize-dimm nmem0 --overwrite
libndctl: ndctl_dimm_enable: nmem0: failed to enable
overwrite issued for 0 nmem.

[2] https://github.com/pmem/ndctl/issues/244

[3]

# ndctl  list -Di
[
  {
    "dev":"nmem1",
    "id":"8089-a2-1833-00000510",
    "handle":257,
    "phys_id":32,
    "flag_failed_map":true,
    "security":"disabled"
  },
  {
    "dev":"nmem3",
    "id":"8089-a2-1833-00000497",
    "handle":4353,
    "phys_id":44,
    "security":"disabled"
  },
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem2",
    "id":"8089-a2-1833-000004a9",
    "handle":4097,
    "phys_id":38,
    "security":"disabled"
  }
]

[4]

# ./ndctl remove-passphrase nmem0
failed to open file /etc/ndctl/keys/nvdimm_8089-a2-1833-000004a3_intel-purley-04.khw1.lab.eng.bos.redhat.com.blob: No such file or directory
Unable to load key
passphrase removed for 0 nmem.

# ./ndctl sanitize-dimm nmem0
failed to open file /etc/ndctl/keys/nvdimm_8089-a2-1833-000004a3_intel-purley-04.khw1.lab.eng.bos.redhat.com.blob: No such file or directory
Unable to load key
sanitized 0 nmem.

yizhanglinux avatar May 09 '23 15:05 yizhanglinux
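
Spotting the affected DIMM in a long `ndctl list -Di` listing is easier with a small filter. The sketch below assumes the output is a top-level JSON array with `dev` and `security` fields, as in the listings in this thread; the helper name `dimms_with_security_enabled` is hypothetical.

```python
import json


def dimms_with_security_enabled(ndctl_list_json):
    """Map device name -> security state for DIMMs not reporting "disabled".

    Entries without a "security" field are skipped.
    """
    result = {}
    for dimm in json.loads(ndctl_list_json):
        state = dimm.get("security")
        if state is not None and state != "disabled":
            result[dimm["dev"]] = state
    return result


# Trimmed sample based on the listing above.
sample = """
[
  {"dev":"nmem1","id":"8089-a2-1833-00000510","security":"disabled"},
  {"dev":"nmem0","id":"8089-a2-1833-000004a3","state":"disabled","security":"unlocked"}
]
"""
print(dimms_with_security_enabled(sample))  # prints {'nmem0': 'unlocked'}
```

On a live system the input could come from `subprocess.run(["ndctl", "list", "-Di"], capture_output=True, text=True, check=True).stdout` instead of the sample string.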

Do you have CONFIG_NVDIMM_SECURITY_TEST=y in your kernel config? I talked to Vishal and he said it works for him. The only thing I can think of right now is that you don't have that config on and it doesn't do the extra poll to update the security state when using ndtest and therefore it remains in "locked" state.

davejiang avatar May 09 '23 15:05 davejiang

> Do you have CONFIG_NVDIMM_SECURITY_TEST=y in your kernel config? I talked to Vishal and he said it works for him. The only thing I can think of right now is that you don't have that config on and it doesn't do the extra poll to update the security state when using ndtest and therefore it remains in "locked" state.

Yes, CONFIG_NVDIMM_SECURITY_TEST is enabled. The nmem0 DIMM I used is real NVDIMM hardware. I also tried modprobe nfit_test and ran the same test on nmem4; it behaves the same way.

# cat .config | grep CONFIG_NVDIMM_SECURITY_TEST
CONFIG_NVDIMM_SECURITY_TEST=y

# ./ndctl list -Di
[
 
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem4",
    "id":"cdab-0a-07e0-ffffffff",
    "handle":0,
    "phys_id":0,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem6",
    "id":"cdab-0a-07e0-fffeffff",
    "handle":256,
    "phys_id":2,
    "security":"disabled"
  }
]

yizhanglinux avatar May 09 '23 15:05 yizhanglinux

So issue 239, where the key blob isn't removed after overwrite, is addressed, correct? The remaining issue is 244, where overwrite is issued anyway even though there's an error of some sort?

davejiang avatar May 09 '23 16:05 davejiang

> So issue 239, where the key blob isn't removed after overwrite, is addressed, correct? The remaining issue is 244, where overwrite is issued anyway even though there's an error of some sort?

Yes, the key was finally removed with your patch. But security is still enabled (state "unlocked") and cannot be disabled now. It would be better to fix that first (disable the security), or other users may run into the same situation.

  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },

For https://github.com/pmem/ndctl/issues/244, maybe we just need to fix the output. :)

yizhanglinux avatar May 09 '23 16:05 yizhanglinux

BTW, since security on my DIMM nmem0 is enabled (unlocked) and there is no key now, do you know how to disable the security without the key?

yizhanglinux avatar May 09 '23 16:05 yizhanglinux

Did you call ndctl wait-overwrite nmem0 to wait for overwrite completion first before checking the state?

davejiang avatar May 09 '23 16:05 davejiang

Yes, I already ran that command.

# ndctl wait-overwrite nmem0
# ndctl wait-overwrite nmem4
# ndctl list -Di
[
  {
    "dev":"nmem1",
    "id":"8089-a2-1833-00000510",
    "handle":257,
    "phys_id":32,
    "flag_failed_map":true,
    "security":"disabled"
  },
  {
    "dev":"nmem3",
    "id":"8089-a2-1833-00000497",
    "handle":4353,
    "phys_id":44,
    "security":"disabled"
  },
  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":1,
    "phys_id":26,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem2",
    "id":"8089-a2-1833-000004a9",
    "handle":4097,
    "phys_id":38,
    "security":"disabled"
  },
  {
    "dev":"nmem9",
    "id":"cdab-0a-07e0-fefffeff",
    "handle":65537,
    "phys_id":0,
    "flag_failed_map":true
  },
  {
    "dev":"nmem8",
    "id":"cdab-0a-07e0-fffffeff",
    "handle":65536,
    "phys_id":0,
    "flag_failed_save":true,
    "flag_failed_arm":true,
    "flag_failed_restore":true,
    "flag_failed_flush":true,
    "flag_smart_event":true
  },
  {
    "dev":"nmem5",
    "id":"cdab-0a-07e0-feffffff",
    "handle":1,
    "phys_id":1,
    "security":"disabled"
  },
  {
    "dev":"nmem7",
    "id":"cdab-0a-07e0-fefeffff",
    "handle":257,
    "phys_id":3,
    "security":"disabled"
  },
  {
    "dev":"nmem4",
    "id":"cdab-0a-07e0-ffffffff",
    "handle":0,
    "phys_id":0,
    "state":"disabled",
    "security":"unlocked"
  },
  {
    "dev":"nmem6",
    "id":"cdab-0a-07e0-fffeffff",
    "handle":256,
    "phys_id":2,
    "security":"disabled"
  }
]

yizhanglinux avatar May 09 '23 16:05 yizhanglinux

Looking at the DSM 1.8 spec, I'm starting to get the feeling that overwrite does not change the security state of being enabled. And that when I implemented overwrite, maybe there was a reason that the key blob was not removed. Sorry it's been a few years since I looked at this stuff. What happens if you reboot? Does it come back as locked? There may be a way to recover via BIOS reset of the DIMM. Otherwise the DIMM may be unrecoverable. :( Do you still have the Intel contact that you guys originally got the DIMM from?

https://pmem.io/documents/NVDIMM_DSM_Interface-V1.8.pdf

davejiang avatar May 09 '23 16:05 davejiang

Also, is this a Crow Pass on Sapphire Rapids or some other DIMM on a different platform? Trying to find some help internally....

davejiang avatar May 09 '23 17:05 davejiang

If your BIOS has the feature: Boot to the UEFI menu and enable Secure Erase Unit for the module(s)

  1. UEFI EDKII > Socket Configuration > Memory Configuration > PMem Configuration > PMem Secure Erase Unit
  2. Reset/reboot the system

Otherwise, we may need to investigate other means.

davejiang avatar May 09 '23 18:05 davejiang

> Looking at the DSM 1.8 spec, I'm starting to get the feeling that overwrite does not change the security state of being enabled. And that when I implemented overwrite, maybe there was a reason that the key blob was not removed. Sorry it's been a few years since I looked at this stuff. What happens if you reboot? Does it come back as locked? There may be a way to recover via BIOS reset of the DIMM. Otherwise the DIMM may be unrecoverable. :( Do you still have the Intel contact that you guys originally got the DIMM from?
>
> https://pmem.io/documents/NVDIMM_DSM_Interface-V1.8.pdf

OK, so it is expected that the "overwrite" operation does not remove the key. After a reboot, the DIMM came back as locked.

  {
    "dev":"nmem0",
    "id":"8089-a2-1833-000004a3",
    "handle":"0x1",
    "phys_id":"0x1a",
    "state":"disabled",
    "security":"locked"
  },

I will check with our hw team if they can help me reset the DIMM.

yizhanglinux avatar May 10 '23 02:05 yizhanglinux
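
The outcome above suggests a conservative rule for tooling: delete a key blob only once the DIMM itself reports security as disabled. A minimal sketch of that check over one entry of the `ndctl list -Di` JSON; the function `safe_to_remove_key` is hypothetical and does not describe how ndctl actually decides.

```python
def safe_to_remove_key(dimm):
    """Return True only if this DIMM entry reports security "disabled".

    dimm is one object from the `ndctl list -Di` JSON array. Any other
    state ("unlocked", "locked", "overwrite") or a missing field is treated
    as "keep the key", because the passphrase may still be needed.
    """
    return dimm.get("security") == "disabled"
```

With this rule, the nmem0 entry reporting `"security":"unlocked"` after the overwrite would have kept its blob, so remove-passphrase could still load the key.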

> If your BIOS has the feature: Boot to the UEFI menu and enable Secure Erase Unit for the module(s)
>
>   1. UEFI EDKII > Socket Configuration > Memory Configuration > PMem Configuration > PMem Secure Erase Unit
>   2. Reset/reboot the system
>
> Otherwise, we may need to investigate other means.

It should be Intel Purley, Wolf Pass. I checked the BIOS and there is no such option. :(

yizhanglinux avatar May 10 '23 02:05 yizhanglinux

Can you open up an IPS case so Intel can track it? We can look into how to get that DIMM serviced.

davejiang avatar May 10 '23 04:05 davejiang

> Can you open up an IPS case so Intel can track it? We can look into how to get that DIMM serviced.

I've asked our HW team to do that, thanks for the help.

yizhanglinux avatar May 12 '23 09:05 yizhanglinux

Thanks! Sorry about the troubles.

davejiang avatar May 12 '23 14:05 davejiang