👌 IMPROVE: Module: Writing workflows - Basics
General note: Make any changes by opening a PR into the tutorial-2021-intro-week branch. I'm preparing a wiki page with more information on MyST Markdown, but this is still a work in progress.
This module is part of the Workflows section, which currently has three modules:
- Running workflows
- Writing workflows - Basics
- Error handling
This module still needs quite a bit of work if we want to make it easier for participants to get started with writing workflows. @flavianojs maybe it's good that we first touch base on this before you start working on it? I'll already leave some notes below to get this started:
Code-agnostic modules
Since the content in this tutorial is already challenging enough as it is, I would stick to easy code-agnostic examples. This has the added benefit that in case we write a good code-agnostic tutorial on how to write workflows, it can also be re-used by other plugin developers.
Work functions
The tutorial now starts with work functions, as well as a short revision of calculation functions. Maybe leaving in the work functions is fine to start simple, but of course the jump to work chains is still massive. I think we can remove the calculation function part since it's already covered in the basics section.
Work chains
The biggest challenge will be to step by step explain the concepts for the work chain. I think this needs to really start with a super basic example, almost minimal to have a working work chain. Step by step the participants can then add more to the work chain they are developing, building towards the MultiplyAddWorkChain without the exit code. As described here, the tutorial should not explain too much, but rather have them learn in little steps by always expanding the work chain a little. I'd have to work on this a bit to be sure what these steps should be, but just off the top of my head:
- How to write a define method, i.e.:
  - basics - using `super()`.
  - Defining an input via the `spec`. The `valid_type` argument.
  - Defining the outline via the `spec`. Writing the outline steps as methods of the class.
  - Defining an output via the `spec`.
  - Accessing an input inside the work chain.
This would be the minimal example (untested):
from aiida.engine import WorkChain
from aiida.orm import Int


class MultiplyAddWorkChain(WorkChain):
    """WorkChain to multiply two numbers and add a third, for testing and demonstration purposes."""

    @classmethod
    def define(cls, spec):
        """Specify inputs and outputs."""
        super().define(spec)
        spec.input("x", valid_type=Int)
        # The outline lists the steps (methods of this class) in execution order.
        spec.outline(
            cls.result,
        )
        spec.output("result", valid_type=Int)

    def result(self):
        """Add the result to the outputs."""
        # For now, simply pass the input through as the output.
        self.out("result", self.inputs.x)
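To make the role of the spec a bit more tangible without needing an AiiDA installation, here is a plain-Python sketch of the idea behind `spec.input(..., valid_type=...)`: the ports are declared once, up front, and values can then be checked against them before anything runs. Note that `ToySpec` and its methods are hypothetical names made up for this illustration. They are not AiiDA's API, and AiiDA's own validation does considerably more.

```python
# Toy illustration only: ToySpec is a hypothetical stand-in, not AiiDA's process spec.


class ToySpec:
    """Record declared input ports and validate values against them."""

    def __init__(self):
        self.inputs = {}  # port name -> expected type

    def input(self, name, valid_type):
        """Declare an input port that only accepts `valid_type` instances."""
        self.inputs[name] = valid_type

    def validate(self, values):
        """Return a list of human-readable problems (empty if all is fine)."""
        problems = []
        for name, expected in self.inputs.items():
            if name not in values:
                problems.append(f"missing input '{name}'")
            elif not isinstance(values[name], expected):
                found = type(values[name]).__name__
                problems.append(f"input '{name}' should be {expected.__name__}, got {found}")
        for name in values:
            if name not in self.inputs:
                problems.append(f"unexpected input '{name}'")
        return problems


spec = ToySpec()
spec.input("x", valid_type=int)

print(spec.validate({"x": 2}))      # an empty list: nothing to complain about
print(spec.validate({"x": "two"}))  # the wrong type is reported before anything runs
```

The point of the sketch is only that a declared spec lets mistakes surface at launch time rather than halfway through a run.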
- How to use the context:
  - Pass variables between steps in the outline using the context.
  - Add another step to the linear outline.
  - Note how the output has to be a node, but this is true because multiplying two `Int` nodes gives an `Int` node.
from aiida.engine import WorkChain
from aiida.orm import Int


class MultiplyAddWorkChain(WorkChain):
    """WorkChain to multiply two numbers and add a third, for testing and demonstration purposes."""

    @classmethod
    def define(cls, spec):
        """Specify inputs and outputs."""
        super().define(spec)
        spec.input("x", valid_type=Int)
        spec.input("y", valid_type=Int)
        spec.outline(
            cls.multiply,
            cls.result,
        )
        spec.output("result", valid_type=Int)

    def multiply(self):
        """Multiply two integers."""
        # Store the product in the context so the next step can use it.
        self.ctx.product = self.inputs.x * self.inputs.y

    def result(self):
        """Add the result to the outputs."""
        self.out("result", self.ctx.product)
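How the outline and the context play together can also be sketched in plain Python, without AiiDA. The toy below runs the step methods named in an outline one after the other and lets them share data through a `ctx` attribute. `ToyChain` and `ToyMultiply` are hypothetical classes invented for this illustration. Real work chains are run by the AiiDA engine, which does much more than this loop.

```python
from types import SimpleNamespace

# Toy illustration only: these classes are hypothetical and not part of AiiDA.


class ToyChain:
    """Run the methods listed in `outline` in order, sharing state via `self.ctx`."""

    outline = ()  # names of the step methods, in execution order

    def __init__(self, **inputs):
        self.inputs = SimpleNamespace(**inputs)
        self.ctx = SimpleNamespace()  # shared scratch space between steps
        self.outputs = {}

    def out(self, name, value):
        """Register an output under the given name."""
        self.outputs[name] = value

    def run(self):
        """Call every step of the outline in order and return the outputs."""
        for step_name in self.outline:
            getattr(self, step_name)()
        return self.outputs


class ToyMultiply(ToyChain):
    outline = ("multiply", "result")

    def multiply(self):
        # Stored on the context so that a later step can pick it up.
        self.ctx.product = self.inputs.x * self.inputs.y

    def result(self):
        self.out("result", self.ctx.product)


print(ToyMultiply(x=3, y=4).run())  # {'result': 12}
```

Here `ctx` is nothing more than a namespace attached to the instance, which may be a helpful mental model for participants who already know Python classes.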
- Use a calcfunction
  - Add the `calcfunction`-decorated `multiply` function to the script.
  - Use it inside the `multiply` step of the outline.
from aiida.engine import WorkChain, calcfunction
from aiida.orm import Int


@calcfunction
def multiply(x, y):
    return x * y


class MultiplyAddWorkChain(WorkChain):
    """WorkChain to multiply two numbers and add a third, for testing and demonstration purposes."""

    @classmethod
    def define(cls, spec):
        """Specify inputs and outputs."""
        super().define(spec)
        spec.input("x", valid_type=Int)
        spec.input("y", valid_type=Int)
        spec.outline(
            cls.multiply,
            cls.result,
        )
        spec.output("result", valid_type=Int)

    def multiply(self):
        """Multiply two integers."""
        # Call the module-level calcfunction, not this method.
        self.ctx.product = multiply(self.inputs.x, self.inputs.y)

    def result(self):
        """Add the result to the outputs."""
        self.out("result", self.ctx.product)
- Using a `CalcJob`
  - Adding a `Code` input.
  - How to submit a `CalcJob` inside a work chain. They should know how to run a `CalcJob` by now.
  - How to use the `ToContext` to pass the `CalcJob` node. (probably shouldn't use `future` here anymore, it's a little weird)
  - Obtain the outputs from the `CalcJob`. Note that the argument we passed to `ToContext` corresponds to the attribute of the `ctx` `AttributeDict`.
from aiida.engine import ToContext, WorkChain, calcfunction
from aiida.orm import Code, Int
from aiida.plugins import CalculationFactory

ArithmeticAddCalculation = CalculationFactory("arithmetic.add")


@calcfunction
def multiply(x, y):
    return x * y


class MultiplyAddWorkChain(WorkChain):
    """WorkChain to multiply two numbers and add a third, for testing and demonstration purposes."""

    @classmethod
    def define(cls, spec):
        """Specify inputs and outputs."""
        super().define(spec)
        spec.input("x", valid_type=Int)
        spec.input("y", valid_type=Int)
        spec.input("z", valid_type=Int)
        spec.input("code", valid_type=Code)
        spec.outline(
            cls.multiply,
            cls.add,
            cls.result,
        )
        spec.output("result", valid_type=Int)

    def multiply(self):
        """Multiply two integers."""
        self.ctx.product = multiply(self.inputs.x, self.inputs.y)

    def add(self):
        """Add two numbers using the `ArithmeticAddCalculation` calculation job plugin."""
        inputs = {"x": self.ctx.product, "y": self.inputs.z, "code": self.inputs.code}
        future = self.submit(ArithmeticAddCalculation, **inputs)
        # The keyword used here becomes the attribute name on `self.ctx`.
        return ToContext(addition=future)

    def result(self):
        """Add the result to the outputs."""
        self.out("result", self.ctx.addition.outputs.sum)
And I think that's it! The error handling will be explained in the next module.
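The hand-over that happens in the `add` step (return something that is not finished yet, and find it under a chosen name on the context in the next step) can be mimicked in plain Python. The sketch below uses a thread pool as a stand-in for submitting work elsewhere. All names (`ToyAddChain`, `run`, `slow_add`) are hypothetical and this is not how the AiiDA engine is implemented. In particular, this toy simply blocks until the work is done, and it stores the plain result rather than a node with outputs.

```python
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace

# Toy illustration only: hypothetical names, not AiiDA's engine or its ToContext.


def slow_add(x, y):
    """Stand-in for a calculation that runs somewhere else."""
    return x + y


class ToyAddChain:
    outline = ("add", "result")

    def __init__(self, executor, x, y):
        self.executor = executor
        self.inputs = SimpleNamespace(x=x, y=y)
        self.ctx = SimpleNamespace()
        self.outputs = {}

    def add(self):
        handle = self.executor.submit(slow_add, self.inputs.x, self.inputs.y)
        # Ask the runner to place the finished result at `self.ctx.addition`.
        return {"addition": handle}

    def result(self):
        self.outputs["result"] = self.ctx.addition


def run(chain):
    """Run each step and resolve whatever a step asked to wait for before moving on."""
    for step_name in chain.outline:
        awaited = getattr(chain, step_name)() or {}
        for key, handle in awaited.items():
            setattr(chain.ctx, key, handle.result())  # blocks until finished
    return chain.outputs


with ThreadPoolExecutor(max_workers=1) as executor:
    outputs = run(ToyAddChain(executor, x=6, y=3))

print(outputs)  # {'result': 9}
```

The step itself never waits. It only tells the runner what to wait for and under which name to store it, and that is the part that carries over to the `ToContext(addition=...)` call.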
Seeing the work chain change.
I think it would be good for the participants to execute the work chain in each step and see how it changes. We can use the REST API/provenance graph or the verdi process status command. I think we should try this and experiment.
Debugging issues.
It would be ideal if the participants make these changes themselves. However, this means they will also have to be able to debug issues they encounter when running the work chain. This includes using the verdi process status and verdi process report commands, and also checking the daemon logs? These are essential skills they need if they ever want to write workflows, and will actually help them learn by making mistakes.
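As a possible starting point for such a debugging exercise, the two commands mentioned above could be bundled in a small helper that prints them and only runs them when `verdi` is actually on the `PATH`. The helper functions and the PK value are placeholders made up for this sketch. Only the `verdi process status` and `verdi process report` subcommands come from AiiDA itself.

```python
import shutil
import subprocess

# Assumption: these helper functions are illustrative and not part of AiiDA.


def debug_commands(pk):
    """Return the verdi commands to inspect the process with the given PK."""
    return [
        ["verdi", "process", "status", str(pk)],
        ["verdi", "process", "report", str(pk)],
    ]


def run_debug_commands(pk):
    """Print each command, and run it if the `verdi` executable is available."""
    for command in debug_commands(pk):
        print("$", " ".join(command))
        if shutil.which("verdi") is not None:
            subprocess.run(command, check=False)


run_debug_commands(1234)  # 1234 is a placeholder PK
```

The daemon logs are left out on purpose: `verdi daemon logshow` keeps following the log until interrupted, so it is better run in a separate terminal.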
verdi daemon restart --reset
This command can hardly appear too often in the tutorial. 🙃 We should also explain that the daemon needs to have access to the workflow they are writing by adjusting the $PYTHONPATH in the activate script of the environment.
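A quick way to show participants why the path matters is to check whether Python can locate the module that holds their work chain. The sketch below does this with the standard library only. The module name `my_workchains` and the temporary folder are assumptions made up for the illustration. Appending to `sys.path` here only affects the current process, whereas the daemon runs as a separate process and picks up `PYTHONPATH` from its own environment when it starts.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Assumption: `my_workchains` is a hypothetical module name used for illustration.
MODULE_NAME = "my_workchains"


def is_importable(module_name):
    """Return True if Python can locate the module on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


with tempfile.TemporaryDirectory() as folder:
    Path(folder, f"{MODULE_NAME}.py").write_text("# work chain definitions would go here\n")

    before = is_importable(MODULE_NAME)
    # Entries in PYTHONPATH end up in sys.path when the interpreter starts;
    # appending here has the same effect for this one process only.
    sys.path.append(folder)
    importlib.invalidate_caches()
    after = is_importable(MODULE_NAME)
    sys.path.remove(folder)

print(before, after)  # False True
```

Running the same check in the environment the daemon is started from would tell participants whether the daemon can see their module.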
I think that'll do to kickstart writing this module. ;) The equation of state should not be in this module. It should really be all about the basics of writing a workflow as described above. Future modules will deal with:
- Error handling
- input validation
- non-linear outlines
- ...
@flavianojs don't hesitate to remove material if you think it doesn't fit in this module. I'll be keeping a close eye on removed content and see if I can find a home for it.
Thanks man! Let's make sure people can write top-notch workflows in just the one day we have available! I consider this to be the most critical part of the tutorial, but perhaps also the most challenging to write.
Some notes from a meeting between @ramirezfranciscof, @flavianojs and myself:
- Move "Running workflows" module to "Running calculations" section. @ramirezfranciscof made a very valid point that it's a little strange to suddenly still do this in the Workflows section, just because I was so focused on splitting `CalcJob`s and `WorkChain`s. Instead the separation between sections will be "Running calculations" and "Writing workflows".
- Move all `calcfunction`/`workfunction` content to separate module, that will come first in the "Writing workflows" section. Again, the focus of this section is writing, so the `calcfunction` has its place here.
- For the step-by-step procedure above, we would still give them the files so there is little chance of issues to debug. This will come later. Here the goal is to have them execute the examples and understand the influence of the changes.
- Showing a more scientific example. Once the participant has completed the basic section based on the `MultiplyAddWorkChain`, we should show an example of a scientific work chain using Quantum ESPRESSO. The full work chains in `aiida-quantumespresso` are pretty extensive, but maybe we can show part of them and explain the concepts.
- Then the users get to work themselves, from scratch. We need to have a simple example of a work chain that they can write themselves. One idea was to have them write a convergence work chain for `pw.x`. @ltalirz I was thinking that this could be the option during the hands-on you will guide. I.e. either they learn how to interface a code via a plugin, or they work on writing their own work chain (for example if they just want to run Quantum ESPRESSO anyways). @espenfl I would also consider here to ask the participants if they maybe want to write the work chain in VASP instead. Since I have very little experience with this, would you be willing to help during this hands-on? It would be "Interfacing external codes and writing plugins" in the tentative schedule. As a side note, I intentionally wanted to make the basic writing workflow section code agnostic so you could reuse it. Does that make sense to you, i.e. do you think it will be useful?
- A smaller comment is to make sure to replace `future` by `calcjob_node` in step 4. Future gives the impression that `self.submit` returns something different (and magical?), but it's just the boring old calcjob node they have also seen when using `submit` in the IPython shell.
> One idea was to have them write a convergence work chain for `pw.x`. @ltalirz I was thinking that this could be the option during the hands-on you will guide. I.e. either they learn how to interface a code via a plugin, or they work on writing their own work chain (for example if they just want to run Quantum ESPRESSO anyways)
Hmm so you are saying that the two workflow sessions on Wednesday are probably not enough? I agree that probably a significant number of attendees may end up not needing to write their own plugins; at the same time it can be useful to know how AiiDA actually interacts with a simulation code.
I'm open to this solution but I would say that in this case we should split into two parallel zoom sessions since one topic has rather little to do with the other. Also you'll need to find a way of explaining to participants which topic they should pick. If I was taking the tutorial as a new AiiDA user, I can imagine it would be quite difficult for me to choose between writing a convergence workflow or learning about how plugins work. Let me know.
Thanks for the comments, @ltalirz! I agree that it might be useful for them to know how AiiDA interacts with a code, but I feel the focus should be on teaching them how to do things with AiiDA that they want to do. We can explain how to write a work chain in two hands-ons. But to really get them started I think some time spent writing their own workflow from scratch with the opportunity to ask questions would be invaluable.
However, so far we don't have that many participants who have submitted a poster, so we might swap that two-hour time slot for a session where they can write their own workflow. (Or maybe continue working on the plugin for option B; or write a data plugin - I think we can be quite flexible here.).
As I'm sure you know, I think it's "Mission critical" (to borrow a phrase ;) ) that users can write workflows after doing the long tutorial. I think adding some time to do this from scratch and running into all sorts of problems will make it much more likely that they will continue writing workflows afterwards.
Finally, we already have quite a few plugins. However, most of these only have a few workflows... With the aiida-quantumespresso plugin, one of the most supported ones, you can only calculate the geometry, band structure and PDOS. So if the users want to do anything else for their research, they need to be able to write workflows. 🙃
Since you already have a dedicated time slot for the mysterious "virtual social event", I think skipping the poster session is very reasonable. Even if you had lots of submissions, I think it is very difficult to make such a poster session interesting to participants. What unites them is not their research interests but their curiosity about AiiDA, which is likely not going to manifest a lot in the posters (or does it?).
So, if I understand correctly, then the plan is to keep the plugin introduction session as is, and then have the poster session slot be taken over by writing your own workflow.
In terms of the flow, I guess it would make sense to start with the workflows in the morning and do the plugin introduction afterwards, but we can also leave the order as it is.
> Even if you had lots of submissions, I think it is very difficult to make such a poster session interesting to participants. What unites them is not their research interests but their curiosity about AiiDA, which is likely not going to manifest a lot in the posters (or does it?).
Indeed, this is also the conclusion I came to, after talking with one of my friends from the UA who is participating. The time is better spent making sure they both understand how AiiDA interacts with codes, and have experience with writing workflows independently. A poster session would make more sense for an AiiDA users meeting, not for an introductory tutorial.
> In terms of the flow, I guess it would make sense to start with the workflows in the morning and do the plugin introduction afterwards, but we can also leave the order as it is.
Indeed, but there is a small difference in the length of the session (1h30 morning, 2h afternoon). I think 1h30m is also fine for writing workflows independently, if you think 2h is fine for the plugin section?
My two cents: I think it's absolutely a good idea to teach people how to write workflows from scratch, since 90% of the random stuff that people want to do will not be already implemented in a nice predefined workchain in a plugin package. From my own experience this is one of the biggest friction hurdles to overcome as a novice, i.e. where to start, what to do when things go wrong, where to find info on why things didn't work as one was hoping etc. Bearing that in mind it may be a good idea to explicitly make some mistakes and show how a veteran aiida dev would go about the debugging process of that.
> Indeed, but there is a small difference in the length of the session (1h30 morning, 2h afternoon). I think 1h30m is also fine for writing workflows independently, if you think 2h is fine for the plugin section?
I can work with both. 1h30min is what we had last time.
Thanks for your input @louisponet!
> Bearing that in mind it may be a good idea to explicitly make some mistakes and show how a veteran aiida dev would go about the debugging process of that.
Exactly! That's also what we were thinking: a work chain debugging module (which apparently I failed to put in the meeting notes, sorry @ramirezfranciscof and @flavianojs! 😅 ). This'll be a great addition to the tutorial.
> I can work with both. 1h30min is what we had last time.
👍 In that case, let's keep the plugin session to 1h30, and use the afternoon session for writing workflows independently. I think understanding how AiiDA interacts with a code might also help with debugging issues.
> Since you already have a dedicated time slot for the mysterious "virtual social event", I think skipping the poster session is very reasonable. Even if you had lots of submissions, I think it is very difficult to make such a poster session interesting to participants. What unites them is not their research interests but their curiosity about AiiDA, which is likely not going to manifest a lot in the posters (or does it?).
I support this fully. Also, it gives the folks presenting the posters the feeling that nobody is interested. They probably are, but more in AiiDA and related concepts at this particular event.
> My two cents: I think it's absolutely a good idea to teach people how to write workflows from scratch, since 90% of the random stuff that people want to do will not be already implemented in a nice predefined workchain in a plugin package. From my own experience this is one of the biggest friction hurdles to overcome as a novice, i.e. where to start, what to do when things go wrong, where to find info on why things didn't work as one was hoping etc. Bearing that in mind it may be a good idea to explicitly make some mistakes and show how a veteran aiida dev would go about the debugging process of that.
Yes, I agree on this. Also, now that aiida_core becomes more and more stable, including many of its plugins, we need to work on the workflow side of things. If the attending users would not only gain knowledge, but also take home something additional, in the best case a working workflow for their problem when they get back to their daily routines, that would be awesome. Not many events can offer such a significant take-home package, so maybe that is something to consider. From the VASP side I think we could facilitate this. Maybe this work could also become part of the plugin package or some other repository such that "no time is wasted".
Closing because IDK why github is not recognizing and autoclosing (maybe some setups in the repo missing?).
Feel free to re-open if I'm mistaken.
@louisponet will still give a pass at the latest version. Feel free to re-open in case you think there are still changes to be made!
I just did a check, especially near the end many things remain. This is a list of comments I had while reading through it:
- [ ] why is it important to have a spec defined, how will it be used
- [ ] there is some asymmetry in self.inputs, but not assigning things straight to self.outputs. Maybe indicate that the reason is to make absolutely sure that the assignment has the correct type.
- [x] Some TODOs remain
- [ ] demonstrate how to see the run and process status but from within the python shell rather than using verdi cli
- [ ] There might be some questions why the calcfunction is necessary, since the workchain has all steps and functions defined, why is the additional calcfunction needed when it is known that anything that is given to self.out() should be stored anyway.
- [ ] Maybe add a line that the `context`/`ctx` is a kind of shared scope/dictionary that is a standard attribute in each workchain. That makes it clear to people that are familiar with classes
- [ ] I think somewhere it should be highlighted that a Workchain is never really instantiated in the current repl when submitting, rather it is a kind of blueprint for the daemon to understand what to do/run, how to lay the pipes and connect everything in a well-defined manner
- [ ] Remote code, just mention remote DFT or FEM code
- [ ] Documentation of CalcJob is pretty lacking to be honest. I don't know if it's introduced before the workchain tutorial but if not people are going to wonder why not just use a calcjob that does many things instead of the workchain. The difference will not be very obvious since the tutorial says "allows to submit many processes independently while the workchain waits for them to finish". It may not be obvious what is meant because the workchain will not continue through its steps until the CalcJob is finished, similar to the CalcFunction way.
- [ ] code input is wrong, it's the y input as written. It is not clear why one would need to be able to specify a different code to run with the same CalcJob, doesn't each code have its own CalcJob? This is only semi clear if the CalcJob were to run external code where one should be able to specify where to find the executable, mention this.
- [ ] "When submitting a calculation job or work chain inside a work chain, it is essential to use the submit method of the work chain via self.submit()."--Why?
- [ ] The following is also not particularly clear. Would be clearer if it is made absolutely concrete that a workchain does not DO anything, it defines HOW to do a set of steps that the daemon will run through.
- [ ] Workchain executes -> daemon executes next part of workchain. I think this language makes things much more clear.
- [ ] Also, I think there's a mistake in which context key is used.
- [ ] This ToContext situation is very specific to how AiiDA internals work, I think it might require a bit more explanation of why etc. Why not just do self.ctx.add_node = , since the tutorial states that the workchain will be waiting for it to finish anyway. That's because it's not true, the daemon will sporadically check whether it was finished, and only continue with the workchain if it's done. This is important. The tutorial feels like it suddenly ramped up in complexity a lot, some gentler/longer intro and explanation of some concepts would be better.
- [ ] "The outputs of the calculation job are stored in the outputs attribute of the calculation job node." Add: In this case `<the actual name of the link>`
- [x] There's a #declaring the output, without actual code.
- [ ] Full code snippet
- [x] TODO fix this path
- [ ] If it was clearly explained that a workchain doesn't do anything, it's just a blueprint, it becomes also clearer why the daemon needs to be restarted etc
- [x] `activate` script?
- [ ] Also double check if add code is set up, missing link
- [ ] Still weird why there's the addworkchain and add code, but I think it's just how it is and no way around, just communicate well that it's intended for cases where the code is not just python code
- [ ] Last exercises don't have results
Thanks, @louisponet! Will re-open so we don't lose track of these. @demiranda-gabriel also still sent me some notes I have to go through.
Will try to still integrate as much as possible tomorrow, but some points will probably be for after the tutorial.