SMSC/knem throws error messages referencing BTL/knem

Open BKitor opened this issue 3 years ago • 3 comments

I tried to build OpenMPI v5 with knem, but it fails at runtime, throwing errors about btl_sm_single_copy_mechanism. In reality the error is in the smsc component, and setting OMPI_MCA_coll_smsc=^knem disables the warning. I'm fairly sure it's just a documentation issue in help-smsc-knem.txt.

--------------------------------------------------------------------------
WARNING: Open MPI failed to open the /dev/knem device due to a local
error. Please check with your system administrator to get the problem
fixed, or set the btl_sm_single_copy_mechanism MCA variable to none
to silence this warning and run without knem support.

The sm shared memory BTL will fall back on another single-copy
mechanism if one is available. This may result in lower performance.

  Local host: beluga4
  Errno:      2 (No such file or directory)
--------------------------------------------------------------------------

BKitor avatar Jan 01 '22 22:01 BKitor

@BKitor I think you're reporting two things:

  1. You ran into an error trying to use Open MPI's knem support
  2. The error message Open MPI emitted was incorrect

Is that right?

  1. Does /dev/knem exist? If so, can you look in the system logs to see what error occurred when MPI processes tried to open it?
  2. I've filed #9805 to fix the message. Thanks for reporting it!
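
For the first question, a quick check outside of MPI can narrow things down. The sketch below is only an illustration: `probe_device` is a hypothetical helper, not part of Open MPI, and it assumes that opening the device node read/write is representative of what the MPI processes attempt. It reports the errno from the failed open, which can be compared against the `Errno: 2 (No such file or directory)` line in the warning above.

```python
import errno
import os


def probe_device(path="/dev/knem"):
    """Try to open a device node read/write and report the outcome.

    Hypothetical helper for illustration only.
    Returns (ok, errno_value, message); errno_value is 0 on success.
    """
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        # Report the same kind of errno/strerror pair the warning shows.
        return False, exc.errno, os.strerror(exc.errno)
    os.close(fd)
    return True, 0, "opened successfully"


if __name__ == "__main__":
    ok, err, msg = probe_device()
    print(f"/dev/knem: ok={ok} errno={err} ({msg})")
    if err == errno.ENOENT:
        # ENOENT means the device node itself is missing.
        print("The device node does not exist; is the knem kernel module loaded?")
    elif err == errno.EACCES:
        # EACCES means the node exists but this user may not open it.
        print("The device node exists but is not accessible to this user.")
```

If this reports errno 2, the device node is simply absent on that host, which would match the warning text.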

jsquyres avatar Jan 01 '22 23:01 jsquyres

I'm not really worried about getting knem working at the moment, but I figured you guys would want to know about the weird error message.

BKitor avatar Jan 02 '22 00:01 BKitor

Thanks for reporting! Let's leave this open until #9805 and a corresponding PR for the upcoming v5.0.x branch are merged. It helps us track / make sure we don't forget stuff.

jsquyres avatar Jan 02 '22 00:01 jsquyres