atomate2
Should `task_label` always match job name?
This line is causing some problems for me: https://github.com/materialsproject/atomate2/blob/a31b86b1e2f4d1a6665be7764c5cc67673904c91/src/atomate2/vasp/jobs/base.py#L149
This seems to indicate that if the user wants to change the name of a job (ex. adding formulas to the names) it will break subsequent querying of the resulting tasks database.
I like seeing the formula names in the FW web GUI as I'm working on new workflows. Not sure how everyone else feels.
It seems like `task_label` has been a little overloaded: it's often used to encode the type of calculation, and also a user-readable name.
Perhaps we just need multiple fields? A label (which is just that, a human-readable label, arbitrary), but also store the maker name as a separate field?
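To make the two-field idea concrete, here is a minimal sketch. `TaskRecord`, `label`, and `maker_name` are hypothetical names chosen for illustration, not the actual atomate2 task document schema; the point is only that queries key on a stable field while the label stays free-form.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Hypothetical record for illustration; not the atomate2 task document."""

    label: str       # free-form, human-readable, safe to rename
    maker_name: str  # stable identifier of whatever produced the task


records = [
    TaskRecord(label="Si relax 1", maker_name="RelaxMaker"),
    TaskRecord(label="my custom static (Si)", maker_name="StaticMaker"),
]

# Query on the stable field, so renaming a label cannot break the lookup.
static_tasks = [r for r in records if r.maker_name == "StaticMaker"]
```

With this split, a user can put formulas or anything else in `label` without affecting downstream builders that filter on `maker_name`.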
Agreed, so I guess this change should happen when the document models move to emmet? I linked this in the other thread.
We should tabulate a list of these things and just do the change once.
I'm not sure I see the problem. This is also how it's done in atomate1. The calculation type is available through `task_type` and `calc_type`.
> This seems to indicate that if the user wants to change the name of a job (ex. adding formulas to the names) it will break subsequent querying of the resulting tasks database.
What sort of querying are you thinking about? If you add formulas you can always query using regex matching.
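As a rough illustration of the regex approach, the sketch below filters a made-up list of labels with Python's `re` module. The labels are invented for the example; the last line shows the shape an equivalent MongoDB filter on `task_label` might take using the `$regex` operator, and it is not executed here.

```python
import re

# Made-up labels, some with formulas added by the user.
task_labels = ["relax 1", "Si relax 1", "static", "Si static", "relax 2 (Fe2O3)"]

# Match "relax" as a whole word anywhere in the label.
pattern = re.compile(r"\brelax\b")
relax_labels = [label for label in task_labels if pattern.search(label)]

# Assumed shape of an equivalent MongoDB filter (not run in this sketch).
mongo_filter = {"task_label": {"$regex": r"\brelax\b"}}
```

This only works as long as the original job name survives somewhere in the customised label.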
On Matt's point. I am also thinking to store the maker/function in the output database. However, it wouldn't be part of the task document itself but one level higher up. E.g., as part of this dict: https://github.com/materialsproject/jobflow/blob/073266cf8a3e9e06abf351a2728a46f159d99f32/src/jobflow/core/job.py#L579-L586
I'd be happy to accept that as a PR :)
> What sort of querying are you thinking about? If you add formulas you can always query using regex matching.
I'm thinking about how builders for smaller research projects typically rely on (not very robust) queries of the tasks database. This gets compounded a little because, as people build more dynamic workflows in atomate2, they basically have to use custom job names to navigate their own workflows. The people building the workflows might not be conscious of the query problems they can cause later, and then end up with a name that is difficult or even impossible to regex.
@utf so I think the problem here is that `self.name` gets modified by things like `append_name`, which basically indicates that you should change it as part of your workflow. But it then gets used by the task document as a stand-in for `calc_type`, so it's serving two somewhat different functions at the same time, which I think is problematic. I think if we assign `calc_types` to the different `InputSetGenerator`s and grab that value, that should sort everything out automatically, right?
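A minimal sketch of that idea, under the assumption that each input set generator carries its own `calc_type`. `SketchInputGenerator`, `SketchMaker`, and `make_doc` are hypothetical stand-ins written for this example, not atomate2 or jobflow classes, and `append_name` here is a simplified local method rather than the real one.

```python
from dataclasses import dataclass


@dataclass
class SketchInputGenerator:
    """Hypothetical generator that owns the calculation type."""

    calc_type: str


@dataclass
class SketchMaker:
    """Hypothetical maker whose name is free to change."""

    name: str
    input_set_generator: SketchInputGenerator

    def append_name(self, suffix: str) -> None:
        # Simplified stand-in: renaming touches only the human-readable name.
        self.name += suffix

    def make_doc(self) -> dict:
        # The label follows the (possibly renamed) job; the calculation type
        # comes from the generator and is unaffected by renaming.
        return {
            "task_label": self.name,
            "calc_type": self.input_set_generator.calc_type,
        }


maker = SketchMaker(name="static", input_set_generator=SketchInputGenerator("static"))
maker.append_name(" Si")
doc = maker.make_doc()
```

Here `doc["task_label"]` picks up the appended formula while `doc["calc_type"]` stays `"static"`, so the two roles no longer share one field.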
> On Matt's point. I am also thinking to store the maker/function in the output database. However, it wouldn't be part of the task document itself but one level higher up. E.g., as part of this dict: https://github.com/materialsproject/jobflow/blob/073266cf8a3e9e06abf351a2728a46f159d99f32/src/jobflow/core/job.py#L579-L586
> I'd be happy to accept that as a PR :)
I am in need of this feature to navigate some of my workflows. I'd be happy to implement it if we still think this is what we want? @utf
I see this as a different feature to @jmmshn's comments on Feb 15.