ompi
ompi copied to clipboard
pmix: remove the MCA framework
pmix is now a first class citizen and hence do not need to be part of a MCA framework anymore
Refs. open-mpi/ompi#12282
I am not fully happy with this PR.
Since there is no more pmix
framework, the pmix
MCA parameters do not appear any more in ompi_info
.
@jsquyres @bosilca could you please advise on how to move forward?
Which params are you concerned about? I think there were a couple of params in the pmix/base - are those the ones?
If so, what I did in PRRTE was to add a param registration function in your opal/pmix/pmix.c
and then just add that to the ompi_rte function. You can also add it to ompi_info
so they display.
If you mean the PMIx internal params, then that's been an ongoing discussion. I forget the OMPI issue number, but basically we had agreed on a method but nobody had time to implement it.
The pmix
MCA param would not make sense, and you already moved pmix_base_async_modex
, pmix_base_collect_data
and pmix_base_exchange_timeout
. While leaves a single, somewhat important MCA param out, the pmix_base_verbose
. But as we already have opal_pmix_verbose_output
this shall not an issue.
After this PR, the following bits have disappeared from ompi_info --all
My concern is if we cannot fix that, that could be a deal breaker for v5
.
I will investigate Ralph's suggestion.
MCA pmix base: ---------------------------------------------------
MCA pmix base: parameter "pmix" (current value: "", data source: default, level: 2 user/detail, type: string)
Default selection set of components for the pmix framework (<none> means use all components that can be found)
MCA pmix base: ---------------------------------------------------
MCA pmix base: parameter "pmix_base_verbose" (current value: "error", data source: default, level: 8 dev/detail, type: int)
Verbosity level for the pmix framework (default: 0)
Valid values: -1|none, 0|error, 10|component, 20|warn, 40|info, 60|trace, 80|debug, 100|max, 0-100
MCA pmix base: parameter "pmix_base_async_modex" (current value: "false", data source: default, level: 9 dev/all, type: bool)
Use asynchronous modex mode
Valid values: 0|f|false|disabled|no|n, 1|t|true|enabled|yes|y
MCA pmix base: parameter "pmix_base_collect_data" (current value: "true", data source: default, level: 9 dev/all, type: bool)
Collect all data during modex
Valid values: 0|f|false|disabled|no|n, 1|t|true|enabled|yes|y
MCA pmix base: parameter "pmix_base_exchange_timeout" (current value: "-1", data source: default, level: 3 user/all, type: int)
Time (in seconds) to wait for a data exchange to complete
@ggouaillardet Would you like me to fix the param issue for you? Having done it before, it should only take a few minutes to port it here.
I came up with this (plus a missing trivial fix)
diff --git a/opal/mca/common/ofi/common_ofi.c b/opal/mca/common/ofi/common_ofi.c
index d39412f..ed63c95 100644
--- a/opal/mca/common/ofi/common_ofi.c
+++ b/opal/mca/common/ofi/common_ofi.c
@@ -34,7 +34,6 @@
#include "opal/mca/base/mca_base_var.h"
#include "opal/mca/hwloc/base/base.h"
#include "opal/mca/memory/base/base.h"
-#include "opal/mca/pmix/base/base.h"
#include "opal/util/argv.h"
#include "opal/util/show_help.h"
diff --git a/opal/runtime/opal_info_support.c b/opal/runtime/opal_info_support.c
index e167451..437163f 100644
--- a/opal/runtime/opal_info_support.c
+++ b/opal/runtime/opal_info_support.c
@@ -340,6 +340,7 @@ void opal_info_register_types(opal_pointer_array_t *mca_types)
for (i = 0; NULL != opal_frameworks[i]; i++) {
opal_pointer_array_add(mca_types, opal_frameworks[i]->framework_name);
}
+ opal_pointer_array_add(mca_types, "pmix");
}
int opal_info_register_framework_params(opal_pointer_array_t *component_map)
If there is a better way, feel free to suggest or push in this PR directly.
I still believe we should find a way to keep pmix_base_verbose
param, at least for the v5
series.
Looks like this does the trick
diff --git a/opal/mca/common/ofi/common_ofi.c b/opal/mca/common/ofi/common_ofi.c
index d39412f..ed63c95 100644
--- a/opal/mca/common/ofi/common_ofi.c
+++ b/opal/mca/common/ofi/common_ofi.c
@@ -34,7 +34,6 @@
#include "opal/mca/base/mca_base_var.h"
#include "opal/mca/hwloc/base/base.h"
#include "opal/mca/memory/base/base.h"
-#include "opal/mca/pmix/base/base.h"
#include "opal/util/argv.h"
#include "opal/util/show_help.h"
diff --git a/opal/pmix/pmix-internal.h b/opal/pmix/pmix-internal.h
index 1a8eefe..46d6d18 100644
--- a/opal/pmix/pmix-internal.h
+++ b/opal/pmix/pmix-internal.h
@@ -66,6 +66,7 @@ typedef struct {
OPAL_DECLSPEC extern bool opal_pmix_collect_all_data;
OPAL_DECLSPEC extern bool opal_pmix_base_async_modex;
OPAL_DECLSPEC extern int opal_pmix_verbose_output;
+OPAL_DECLSPEC extern int opal_pmix_verbose;
/* define a caddy for pointing to pmix_info_t that
* are to be included in an answer */
diff --git a/opal/pmix/pmix.c b/opal/pmix/pmix.c
index 34900ca..c29fdc6 100644
--- a/opal/pmix/pmix.c
+++ b/opal/pmix/pmix.c
@@ -714,6 +714,7 @@ OBJ_CLASS_INSTANCE(opal_proclist_t, opal_list_item_t, NULL, NULL);
bool opal_pmix_collect_all_data = true;
int opal_pmix_verbose_output = -1;
+int opal_pmix_verbose = 0;
bool opal_pmix_base_async_modex = false;
opal_pmix_base_t opal_pmix_base = {.evbase = NULL,
.timeout = 0,
diff --git a/opal/runtime/opal_info_support.c b/opal/runtime/opal_info_support.c
index e167451..437163f 100644
--- a/opal/runtime/opal_info_support.c
+++ b/opal/runtime/opal_info_support.c
@@ -340,6 +340,7 @@ void opal_info_register_types(opal_pointer_array_t *mca_types)
for (i = 0; NULL != opal_frameworks[i]; i++) {
opal_pointer_array_add(mca_types, opal_frameworks[i]->framework_name);
}
+ opal_pointer_array_add(mca_types, "pmix");
}
int opal_info_register_framework_params(opal_pointer_array_t *component_map)
diff --git a/opal/runtime/opal_params.c b/opal/runtime/opal_params.c
index 5a9e7e2..ac05809 100644
--- a/opal/runtime/opal_params.c
+++ b/opal/runtime/opal_params.c
@@ -109,6 +109,18 @@ int opal_register_params(void)
MCA_BASE_VAR_TYPE_INT, NULL, 0, 0, OPAL_INFO_LVL_3,
MCA_BASE_VAR_SCOPE_READONLY, &opal_pmix_base.timeout);
+ opal_pmix_verbose = 0;
+ (void) mca_base_var_register("opal", "pmix", "base", "verbose",
+ "Verbosity level for the pmix framework (default: 0)",
+ MCA_BASE_VAR_TYPE_INT,
+ &mca_base_var_enum_verbose, 0,
+ MCA_BASE_VAR_FLAG_SETTABLE, OPAL_INFO_LVL_8,
+ MCA_BASE_VAR_SCOPE_LOCAL,
+ &opal_pmix_verbose);
+ if (0 < opal_pmix_verbose) {
+ opal_pmix_verbose_output = opal_output_open(NULL);
+ }
+
opal_finalize_register_cleanup(opal_deregister_params);
return OPAL_SUCCESS;
Yep, that's all there is to it for the params. However, I don't think this patch is entirely correct. You don't want to declare this still to be a framework by calling MCA_BASE_FRAMEWORK_DECLARE
. I just added a simple pmix_base_init
function and have opal_init
call it to setup the "base" object and register the params.