v5.0.x: mca: update "show_load_errors" behavior
Convert the MCA parameter "opal_mca_base_component_show_load_errors" to be a flexible mechanism to specify when (and when not) to emit warnings about errors when trying to load DSO components.
- Convert the existing MCA parameter opal_mca_base_component_show_load_errors from a boolean to a string. It will still accept all prior valid boolean values, but it will also accept a comma-delimited list of "framework[/component]" tokens. If the MCA base encounters an error when loading a DSO, opal_mca_base_component_show_load_errors is checked to see if a warning should be emitted.
- If the value is boolean true or the string "all", then emit a warning
- If the value is boolean false or the string "none", then do not emit a warning
- If the value is a comma-delimited list of tokens: emit a warning about any dynamic component that fails to open and matches a token in the list. "Match" is defined as:
  - If a token in the list is only a framework name, then any component in that framework will match.
  - If a token in the list specifies both a framework name and a component name (in the form framework/component), then only the specified component in the specified framework will match.
- The value can also be a "^" character followed by a comma-delimited list of "framework[/component]" values: This is similar to the comma-delimited list of tokens, except it will only emit warnings about dynamic components that fail to load and do not match a token in the list.
NOTE: The equivalence of "all" with boolean true values, and "none" with boolean false values is only intended as a backwards compatibility mechanism, since prior to this commit, opal_mca_base_component_show_load_errors was a boolean value. It is not intended as a general mechanism that should be copied to all other include/exclude-type MCA params.
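The decision rules above can be sketched in a few lines of Python. This is only an illustration of the semantics described in this commit message, not the C implementation in the MCA base: the function names `should_warn` and `_token_matches` are hypothetical, the set of accepted boolean spellings is an assumption, and the component names used in the usage lines (cuda, rocm, tcp) are only examples.

```python
def _token_matches(token: str, framework: str, component: str) -> bool:
    """Check one "framework[/component]" token against a failed component."""
    fw, sep, comp = token.partition("/")
    if fw != framework:
        return False
    # A bare framework name matches every component in that framework.
    return not sep or comp == component


def should_warn(value: str, framework: str, component: str) -> bool:
    """Return True if a DSO load failure for framework/component should warn.

    `value` plays the role of opal_mca_base_component_show_load_errors.
    """
    v = value.strip()
    lowered = v.lower()
    # Backwards-compatible boolean spellings. This set is an assumption;
    # the real parser may accept other spellings.
    if lowered in {"all", "true", "yes", "1"}:
        return True
    if lowered in {"none", "false", "no", "0"}:
        return False

    # A leading "^" inverts the list: warn only about components that
    # do NOT match any token.
    negate = v.startswith("^")
    if negate:
        v = v[1:]

    tokens = [t.strip() for t in v.split(",") if t.strip()]
    matched = any(_token_matches(t, framework, component) for t in tokens)
    return matched != negate


# Usage: an include list and an exclude list.
assert should_warn("accelerator/cuda,btl", "btl", "tcp")
assert not should_warn("accelerator/cuda", "accelerator", "rocm")
assert not should_warn("^accelerator", "accelerator", "cuda")
assert should_warn("^accelerator", "btl", "tcp")
```

In this sketch the special values are checked before the list is parsed, which mirrors the note above that "all" and "none" are compatibility aliases rather than ordinary tokens.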
- Remove the configure option --enable-show-load-errors-by-default and replace it with --with-show-load-errors[=value]. The value specified will become the default value of the opal_mca_base_component_show_load_errors MCA variable (it defaults to "all").
The CLI option name change is intentional. The previous MCA parameter only accepted boolean values; the new CLI name reflects that it can accept more than just boolean values.
The rationale for this commit is to allow packagers more granular control over whether to warn about component DSO load failures or not.
The canonical example of where this is useful is accelerator libraries: since accelerators are expensive, they may only be available on a subset of nodes in a given HPC environment. Consequently, the accelerator's support libraries may only be loaded on the nodes that actually have accelerators physically present. In such an environment, an administrator or packager may wish to configure Open MPI:
- With accelerator components built as DSOs.
- Do not warn about accelerator DSO component load failures.
For example:
./configure --enable-mca-dso=accelerator ...
make install
mpirun --mca opal_mca_base_component_show_load_errors '^accelerator' ...
Signed-off-by: Jeff Squyres [email protected]
(cherry picked from commit 20bbf2709af2b64be87add30c12aeb5507e6b24b)
This is the v5.0.x PR corresponding to the main PR #10763