Usability: Improve `ProcessBuilder` API
Interacting with a `ProcessBuilder` is not straightforward (as brought up, e.g., by @npaulish), which makes it hard for new users to find where to edit what. What is currently possible (see the session output below):
- Printing, but this shows only minimal information.
- Tab completion two levels deep, e.g., `b.pw.structure` or `b.pw.parameters` is possible, but going further inside is not.
- `b._port_namespace.get_description()` (thanks, @mikibonacci) prints a full list and description, which is very overwhelming and not public API (`get_description` is a method of plumpy's `PortNamespace` class).
- There is a `_repr_pretty_` method on the `ProcessBuilder` class, but the builder is created through a dynamic class creation pattern, and the resulting object is not a plain `ProcessBuilder` (but `abc.ProcessBuilder-<uuid>`), so it does not implement `_repr_pretty_` (not sure why not...).
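For reference, here is a minimal sketch of the dynamic class creation pattern mentioned in the last bullet. The names `BaseBuilder` and `make_builder_class` are hypothetical stand-ins, not the actual aiida-core implementation. In this plain-Python version the generated class is a subclass of the base and inherits `_repr_pretty_`, so it may be worth checking whether the real builder loses the method somewhere else.

```python
import uuid


class BaseBuilder:
    """Hypothetical stand-in for a builder base class (not aiida-core code)."""

    def __init__(self):
        self._data = {}

    def _repr_pretty_(self, p, cycle):
        # IPython calls this hook with a pretty-printer `p` to render the object.
        p.text(f"Builder with inputs: {self._data}")


def make_builder_class(base):
    """Create a uniquely named subclass of `base` at runtime using `type`."""
    name = f"{base.__name__}-{uuid.uuid4()}"
    return type(name, (base,), {})


DynamicBuilder = make_builder_class(BaseBuilder)
builder = DynamicBuilder()

print(type(builder).__name__)             # e.g. BaseBuilder-<uuid>
print(isinstance(builder, BaseBuilder))   # True
print(hasattr(builder, "_repr_pretty_"))  # True: inherited from the base class
```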
The idea is to make the information in this dynamically generated but explicit entity (i.e., the `abc.ProcessBuilder-<uuid>` object, which contains the workflow spec) more accessible by adding public methods and easier ways to explore the structure (and possibly to set values).
The main builder-related methods (and possible ways they could be replaced) are:
- `get_builder` -> direct instance creation and assignment
- `get_builder_restart` -> ?
- `get_builder_from_protocol` -> `from_protocol` alternative constructor of the process class
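To illustrate what "direct instance creation" and a `from_protocol` alternative constructor could look like, here is a rough sketch using a toy class. Everything in it (`ToyWorkChain`, the `PROTOCOLS` table and its values) is hypothetical and only shows the classmethod pattern; it is not the aiida-core or aiida-quantumespresso API.

```python
from dataclasses import dataclass

# Hypothetical protocol table; the names and values are made up for illustration.
PROTOCOLS = {
    "fast": {"kpoints_distance": 0.5, "max_iterations": 3},
    "moderate": {"kpoints_distance": 0.15, "max_iterations": 5},
}


@dataclass
class ToyWorkChain:
    """Hypothetical process class with plain attributes as inputs."""

    kpoints_distance: float = 0.15
    max_iterations: int = 5

    @classmethod
    def from_protocol(cls, protocol, **overrides):
        """Alternative constructor: start from protocol defaults, then apply overrides."""
        if protocol not in PROTOCOLS:
            raise ValueError(f"unknown protocol: {protocol!r}")
        values = {**PROTOCOLS[protocol], **overrides}
        return cls(**values)


# Direct instance creation and assignment.
direct = ToyWorkChain()
direct.max_iterations = 10

# Alternative constructor with one overridden value.
from_proto = ToyWorkChain.from_protocol("fast", max_iterations=4)
print(direct)
print(from_proto)
```

How `get_builder_restart` would map onto this pattern is left open here, as in the list above.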
In [9]: b = PwBaseWorkChain.get_builder()
In [32]: type(b)
Out[32]: abc.ProcessBuilder-bc7828aa-6610-4133-aa97-1e90129785e3
In [10]: b
Out[10]:
Process class: PwBaseWorkChain
Inputs:
metadata: {}
pw:
  metadata:
    options:
      stash: {}
  monitors: {}
  pseudos: {}
In [12]: b. # tab completion
clean_workdir() kpoints_distance metadata
handler_overrides kpoints_force_parity pw
kpoints max_iterations()
In [12]: b.pw. # tab completion
code monitors parent_folder settings
hubbard_file parallelization pseudos structure
metadata parameters remote_folder vdw_table
In [38]: pprint(b._port_namespace.get_description().keys())
dict_keys(['_attrs', 'metadata', 'max_iterations', 'clean_workdir', 'handler_overrides', 'pw', 'kpoints', 'kpoints_distance', 'kpoints_force_parity'])
In [39]: pprint(b._port_namespace.get_description()['pw'].keys())
dict_keys(['_attrs', 'metadata', 'code', 'monitors', 'remote_folder', 'structure', 'parameters', 'settings', 'parent_folder', 'vdw_table', 'pseudos', 'parallelization', 'hubbard_file'])
In [11]: pprint(b._port_namespace.get_description())
{'_attrs': {'default': (),
'dynamic': False,
'help': None,
'required': 'True',
'valid_type': "<class 'aiida.orm.nodes.data.data.Data'>"},
'clean_workdir': {'default': '<function '
'BaseRestartWorkChain.define.<locals>.<lambda> '
'at 0x7f9e04986b00>',
'help': 'If `True`, work directories of all called '
'calculation jobs will be cleaned at the end of '
'execution.',
'is_metadata': 'False',
'name': 'clean_workdir',
'non_db': 'False',
'required': 'False',
'valid_type': "<class 'aiida.orm.nodes.data.bool.Bool'>"},
'handler_overrides': {'help': 'Mapping where keys are process handler names '
'and the values are a dictionary, where each '
'dictionary can define the ``enabled`` and '
'``priority`` key, which can be used to toggle '
'the values set on the original process handler '
'declaration.',
'is_metadata': 'False',
'name': 'handler_overrides',
'non_db': 'False',
'required': 'False',
'valid_type': '(<class '
"'aiida.orm.nodes.data.dict.Dict'>, "
"<class 'NoneType'>)"},
'kpoints': {'help': 'An explicit k-points list or mesh. Either this or '
'`kpoints_distance` has to be provided.',
'is_metadata': 'False',
'name': 'kpoints',
'non_db': 'False',
'required': 'False',
'valid_type': '(<class '
"'aiida.orm.nodes.data.array.kpoints.KpointsData'>, "
"<class 'NoneType'>)"},
'kpoints_distance': {'help': 'The minimum desired distance in 1/Å between '
'k-points in reciprocal space. The explicit '
'k-points will be generated automatically by a '
'calculation function based on the input '
'structure.',
'is_metadata': 'False',
'name': 'kpoints_distance',
'non_db': 'False',
'required': 'False',
'valid_type': '(<class '
"'aiida.orm.nodes.data.float.Float'>, "
"<class 'NoneType'>)"},
'kpoints_force_parity': {'help': 'Optional input when constructing the '
'k-points based on a desired '
'`kpoints_distance`. Setting this to `True` '
'will force the k-point mesh to have an even '
'number of points along each lattice vector '
'except for any non-periodic directions.',
'is_metadata': 'False',
'name': 'kpoints_force_parity',
'non_db': 'False',
'required': 'False',
'valid_type': '(<class '
"'aiida.orm.nodes.data.bool.Bool'>, "
"<class 'NoneType'>)"},
'max_iterations': {'default': '<function '
'BaseRestartWorkChain.define.<locals>.<lambda> '
'at 0x7f9e049871c0>',
'help': 'Maximum number of iterations the work chain will '
'restart the process to finish successfully.',
'is_metadata': 'False',
'name': 'max_iterations',
'non_db': 'False',
'required': 'False',
'valid_type': "<class 'aiida.orm.nodes.data.int.Int'>"},
'metadata': {'_attrs': {'default': (),
'dynamic': False,
'help': None,
'required': 'False',
'valid_type': 'None'},
'call_link_label': {'default': 'CALL',
'help': 'The label to use for the `CALL` '
'link if the process is called by '
'another process.',
'is_metadata': 'True',
'name': 'call_link_label',
'non_db': 'False',
'required': 'False',
'valid_type': "<class 'str'>"},
'description': {'help': 'Description to set on the process node.',
'is_metadata': 'True',
'name': 'description',
'non_db': 'False',
'required': 'False',
'valid_type': "(<class 'str'>, <class "
"'NoneType'>)"},
'disable_cache': {'help': 'Do not consider the cache for this '
'process, ignoring all other caching '
'configuration rules.',
'is_metadata': 'True',
'name': 'disable_cache',
'non_db': 'False',
'required': 'False',
'valid_type': "(<class 'bool'>, <class "
"'NoneType'>)"},
'label': {'help': 'Label to set on the process node.',
'is_metadata': 'True',
'name': 'label',
'non_db': 'False',
'required': 'False',
'valid_type': "(<class 'str'>, <class 'NoneType'>)"},
'store_provenance': {'default': 'True',
'help': 'If set to `False` provenance will '
'not be stored in the database.',
'is_metadata': 'True',
'name': 'store_provenance',
'non_db': 'False',
'required': 'False',
'valid_type': "<class 'bool'>"}},
'pw': {'_attrs': {'default': (),
'dynamic': True,
'help': None,
'required': 'True',
'valid_type': "<class 'aiida.orm.nodes.data.data.Data'>"},
'code': {'help': 'The `Code` to use for this job. This input is '
'required, unless the `remote_folder` input is '
'specified, which means an existing job is being '
'imported and no code will actually be run.',
'is_metadata': 'False',
'name': 'code',
'non_db': 'False',
'required': 'False',
'valid_type': '(<class '
"'aiida.orm.nodes.data.code.abstract.AbstractCode'>, "
"<class 'NoneType'>)"},
'hubbard_file': {'help': 'SinglefileData node containing the output '
'Hubbard parameters from a HpCalculation',
'is_metadata': 'False',
'name': 'hubbard_file',
'non_db': 'False',
'required': 'False',
'valid_type': '(<class '
"'aiida.orm.nodes.data.singlefile.SinglefileData'>, "
"<class 'NoneType'>)"},
...
'vdw_table': {'help': 'Optional van der Waals table contained in a '
'`SinglefileData`.',
'is_metadata': 'False',
'name': 'vdw_table',
'non_db': 'False',
'required': 'False',
'valid_type': '(<class '
"'aiida.orm.nodes.data.singlefile.SinglefileData'>, "
"<class 'NoneType'>)"}}}
I played around a bit with this part of the code in a branch of my fork, and it should be doable: https://github.com/GeigerJ2/aiida-core/tree/process-builder-api-improvements
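As one possible direction (not necessarily what the branch does), the nested dictionary returned by `get_description()` could be condensed into one line per port. The sketch below assumes the structure visible in the output above: namespaces carry an `_attrs` key, and ports store `required` as the strings `'True'`/`'False'`. The helper name `summarize_ports` and the trimmed `SAMPLE` dictionary are hypothetical.

```python
# Trimmed, hand-copied excerpt of the description structure shown above.
SAMPLE = {
    "_attrs": {"default": (), "dynamic": False, "help": None, "required": "True"},
    "kpoints": {
        "help": "An explicit k-points list or mesh. Either this or "
                "`kpoints_distance` has to be provided.",
        "name": "kpoints",
        "required": "False",
    },
    "pw": {
        "_attrs": {"default": (), "dynamic": True, "help": None, "required": "True"},
        "code": {
            "help": "The `Code` to use for this job. This input is required, "
                    "unless the `remote_folder` input is specified.",
            "name": "code",
            "required": "False",
        },
    },
}


def summarize_ports(description, prefix=""):
    """Yield (dotted_path, required, first help sentence) for every leaf port."""
    for name, entry in description.items():
        if name == "_attrs":
            continue  # namespace-level attributes, not a port
        path = f"{prefix}{name}"
        if "_attrs" in entry:
            # Assumption: an entry with `_attrs` is a nested namespace, so recurse.
            yield from summarize_ports(entry, prefix=f"{path}.")
        else:
            required = str(entry.get("required")) == "True"
            summary = (entry.get("help") or "").split(". ")[0]
            yield path, required, summary


rows = list(summarize_ports(SAMPLE))
for path, required, summary in rows:
    flag = "required" if required else "optional"
    print(f"{path:<10} {flag:<9} {summary}")
```

A public method on the builder could wrap something like this, so that users get dotted paths they can type directly instead of reading the full dump.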