pisa
pisa copied to clipboard
Let each `Stage` transparently declare expected and new container keys upon initialisation
Currently, it is not exactly obvious to the user which sequence of stage.services can be combined into one working pipeline configuration without digging through a lot of code.
While each Stage
has an expected_params
attribute in order to communicate which params need to be defined, there is no analogous mechanism for the data fields/container keys based on which a given service makes its computations, and, e.g., modifies weights. These are generally not even documented, in contrast to expected_params
. Without all of expected_params
for each stage.service, it won't initialise, and neither will a pipeline containing it.
As an example (just one of many), take osc.prob3, which requires each container to contain the keys "true_coszen"
and "true_energy"
. As of now, this is completely undocumented, and it is up to the service itself to deal with missing keys. However, shouldn't checking for missing ones be generically performed by the base Stage
instead (compare expected params check)?
A simple example is utils.fix_error, which requires the presence of "errors"
during compute
(undocumented, no exception handling beyond that in Container
itself).
Of course, while typically a given data loader, during setup
, populates (most) keys required by any possible later stage.service, other services may also add keys on which some subsequent service depends, and not necessarily during setup
(for instance, utils.hist
adds the previously mentioned "errors"
key upon apply
).
If a given service declared any added/new keys during initialisation, however, on the one hand it would become apparent that utils.fix_error
could follow utils.hist
, for example, and on the other we should be able to detect an inconsistent pipeline config upon pipeline initialisation already.
Through the above, it should also in principle become possible to create a default unit test with no additional work for anyone implementing a service, by producing dummy inputs with the required keys (perhaps combined with some rudimentary detection of the domain of definition for each required key, e.g., energy >= 0 GeV, or -1 <= coszen <= 1).
Status of development in https://github.com/thehrh/pisa-1/tree/stage_declare_expected_keys.
Would also close #811