pyiron_base icon indicating copy to clipboard operation
pyiron_base copied to clipboard

Is there a particular reason that `write_input` raises `NotImplementedError`?

Open samwaseda opened this issue 3 years ago • 11 comments

https://github.com/pyiron/pyiron_base/blob/0960e3a3f0b14b942cb1ff2e523ed5b040742742/pyiron_base/job/generic.py#L388

I often come across this problem that I derive a new class from AtomisticGenericJob and I forget to set def write_input(self): pass, which raises an error because write_input is part of the run routine. I know that it's also possible to set self._python_only_job = True, but that's not very intuitive either. So in total, I have the feeling that this only hinders new implementation of codes (especially when new people want to implement their codes). Doesn't it make more sense to check if the executable is set and if not we consider it as a python job?

samwaseda avatar Nov 22 '21 20:11 samwaseda

Does you job write output files? If you need an working directory then setting write input to pass is the right solution, if you do not need a working directory then set the job to by python only. Alternatively we could define template Classes which set these options.

jan-janssen avatar Nov 22 '21 20:11 jan-janssen

I'm actually more saying that if there's no executable it's rather unlikely that there are input and output files, right? Doesn't it make more sense to pass it by default if that's the case?

samwaseda avatar Nov 22 '21 20:11 samwaseda

So you mean changing the default to self._python_only_job = True ?

jan-janssen avatar Nov 22 '21 20:11 jan-janssen

I would like this solution. Then we could define a new base class ExternalExecutableJob and define all the methods related to read/write of input files etc. there.

niklassiemer avatar Nov 22 '21 21:11 niklassiemer

I can't suggest a single solution, I just wanted point out the fact that currently I have to make the same mistake over and over again because it's not intuitively clear that I have to set self. _python_only_job = True. In addition to this, I have the feeling that PythonTemplateJob (which from what I can see exists for the same reason) and TemplateJob are nearly redundant. I might a super simply solution and want GenericJob to do what TemplateJob does (ok I admit that this goes beyond the title of this issue).

samwaseda avatar Nov 22 '21 21:11 samwaseda

I would suggest breaking the class hierarchy down a bit and exploiting mixins as we do elsewhere. Right now AtomisticGenericJob (wrongly) assumes that all its children will want to write input (and do a dozen other things I always implement just to get my IDE to stop complaining). This is just a bad assumption on the part of the parent class. Extracting these features to a new mixin like UsesExecutableJobMixin is my best off-the-cuff suggestion.

liamhuber avatar Nov 23 '21 08:11 liamhuber

I would like this solution. Then we could define a new base class ExternalExecutableJob and define all the methods related to read/write of input files etc. there.

Aha, didn't read far/carefully enough. Yes, this. This to proactively introduce these features instead of always having to set the private python-only job to turn them off.

liamhuber avatar Nov 23 '21 08:11 liamhuber

@jan-janssen Is there a particular reason that there's no input and no output in GenericJob? I don't really like this TemplateJob which only adds input and output to GenericJob. Because pyiron is fairly complex, I think it's important to have this simplicity to be able to derive any job from GenericJob and not from TemplateJob.

samwaseda avatar Nov 24 '21 15:11 samwaseda

This is a deep problem. Basically the entire job class hierarchy needs a rewrite, which needs to resolve these not implemented issues and also streamline io treatment, and maybe make the execution flow a little more user friendly. AFAIK one key issue is that GenericJob still has the old GenericParameters embedded in it, and lots of downstream classes rely on that --eg by referencing direct hdf5 paths instead of just class attributes. So fixing this is not trivial. Since the TenoleteJob didn't have any children depending on this behavior, I used it as a testbed for how I want io to look. So for now I think you're stuck inheriting from there

liamhuber avatar Nov 24 '21 15:11 liamhuber

As Liam said the GenericJob is also the foundation for the GenericMaster and consequently the ParallelMaster and InteractiveWrapper - some of these do not even have input and output.

jan-janssen avatar Nov 24 '21 15:11 jan-janssen

AFAIK one key issue is that GenericJob still has the old GenericParameters embedded in it

Yeah I thought of this problem too, but I think having both GenericParameters and DataContainer has been anyway a super dangerous and user-unfriendly business. I think we can now start thinking about solving this underlying problem.

If for some reason GenericParameters is preferred, it'll anyway be overwritten in their respective classes, right? So it shouldn't pose a problem anyway.

GenericJob is also the foundation for the GenericMaster and consequently the ParallelMaster and InteractiveWrapper - some of these do not even have input and output.

But is there any job that's derived from GenericMaster which doesn't have any input and output? Even if yes, are there any of them which would have a practical/conceptual problem if they have input & output?

samwaseda avatar Nov 24 '21 15:11 samwaseda