aiida-core icon indicating copy to clipboard operation
aiida-core copied to clipboard

Define KpointsData via constructor

Open ltalirz opened this issue 4 years ago • 6 comments

It should be possible to set the properties of a KpointsData directly in the constructor. Instead of having to write

kpoints = KpointsData()
kpoints.set_kpoints_mesh([4,4,4])
builder.kpoints = kpoints

one should be able to write

builder.kpoints = KpointsData(mesh=[4,4,4])

ltalirz avatar Aug 29 '19 09:08 ltalirz

I would be all for this. There are multiple open issues related to the usability of the KpointsData class and these discussion have typically led to the conclusion that it badly needs to be redesigned. Also the fact that it is used both for meshes and explicit k-point lists leads to very unexpected behavior in the API. There were ideas of having KpointsData be sub classed into two specific classes for these respective use cases.

sphuber avatar Aug 29 '19 13:08 sphuber

Hi! I was looking at this, and indeed it would be very nice to be able to define this via the constructor. I agree with @sphuber about having the same datastructure for mesh and kpoints can be a bit confusing. However, I think that having two different sub-classes for this might lead to further complications on the implementation of workflows part.

Thinking about the use cases, I think besides the mesh as suggested by @ltalirz , one should see if it is also possible to define via the constructor using the k-points density, i.e. if we pass

structure = StructureData(ase=ase_structure)
kpoints = KpointsData(structure=structure, density=0.1)

Maybe it is a bit of overkill, but that kind of functionality could make aiida easier to use.

What do you think?

JPchico avatar Mar 09 '23 07:03 JPchico

@JPchico I fully agree this would be very useful functionality to have, but in line with my rant in https://github.com/orgs/aiidateam/discussions/5976 [^1] I would avoid adding arguments to the constructor signature. Rather, the constructor should be complete but minimal.

Instead of adding the structure input arguments, I would add a from_structure method:

structure = StructureData(ase=ase_structure)  # <-- This should also be StructureData.from_ase, but that's a fight for another day
kpoints = KpointsData.from_structure(structure=structure, density=0.1)

A few notes:

  1. One question is whether we want to keep track of the logic that converts a StructureData and float / Float into a corresponding KpointsData. In the aiida-quantumespresso plugin package the decision was made to use a calcfunction for this, see the create_kpoints_from_distance. Note, this may be exactly the kind of provenance-purism that convolutes usage and stops people from using AiiDA, but I can see an argument for the KpointsData being connected to the original structure as well. One idea might be to make the from_structure method a calcfunction? Not sure if that's ever been tried, and probably not desirable because that would immediately store the nodes.

  2. With regard to the minimal but complete constructor: this becomes impossible if we want to maintain a single KpointsData class. A KpointsListData class would never need the mesh to be stored, and hence there will always be superfluous inputs. I'm also not sure if I agree that having different sub-classes leads to complications on the usage of KpointsData-like classes. In fact I have the feeling trying to keep one class does so. It leads to a lot of usage with try-except snippets like this [source]:

         try:
             kpoints_list = kpoints.get_kpoints()
         except AttributeError:
             kpoints_list = kpoints.get_kpoints_mesh(print_list=True)
    

    Which I'm frankly not a huge fan of, and also leads to confusion for users. What if it could just be kpoints_list = kpoints.get_kpoints_list(), and the class has different method implementations for dealing with the list?

Point [2] is definitely beyond the original scope of the issue though, so I won't go into more detail and further derail the conversation. I'll add my further comments on the constructor by getting involved in the review of #5942.

[^1]: which interestingly enough doesn't seem to get linked automatically even though I mentioned this issue; -1 for discussions...

mbercx avatar Apr 22 '23 07:04 mbercx

Not sure if that's ever been tried, and probably not desirable because that would immediately store the nodes.

This is definitely possible. You can generate calcfunctions dynamically and run them. See for example Parser.parse_from_node that does this. Note also that you can always pass metadata.store_provenance = False to not record provenance and so not store the nodes.

Which I'm frankly not a huge fan of, and also leads to confusion for users. What if it could just be kpoints_list = kpoints.get_kpoints_list(), and the class has different method implementations for dealing with the list?

This might solve this problem for this example, but don't think this will be applicable in general. I think that having two subclasses makes perfect sense and should be the preferred solution. An interface can decide to support both by accepting the base class KpointsData, and in the implementation use isinstance check on the more specific subclass to see what methods to call. This is perfectly "normal" behavior in code that uses inheritance.

In the aiida-quantumespresso plugin package the decision was made to use a calcfunction for this, see the create_kpoints_from_distance. Note, this may be exactly the kind of provenance-purism that convolutes usage and stops people from using AiiDA, but I can see an argument for the KpointsData being connected to the original structure as well.

I agree that provenance-purism-and-zealotry is to be avoided. In this case, I think the required functionality required a function (since the implementation is not a trivial one-liner) and at that point we thought we might as well just slap the calcfunction decorator on it. I don't think this case in particular is putting people of AiiDA ^^

sphuber avatar Apr 22 '23 14:04 sphuber

Note also that you can always pass metadata.store_provenance = False to not record provenance and so not store the nodes.

True, but that is something most users will only find out about approx 2 years in. 😉 As an avid AiiDA user, I love the idea of the KpointsData.from_structure() classmethod automatically adding this step to the provenance, connecting the StructureData, Float distance and the resulting KpointsData. I'm just worried that new users might do this a bunch of times in a loop and create a lot of nodes they don't need. Of course caching will help here [^1]. In a way, it could highlight a lot of cool features of AiiDA.

I think that having two subclasses makes perfect sense and should be the preferred solution.

I agree, and I think I was arguing for that? 😅 But let's not derail the constructor discussion further, I'll open an issue elsewhere to discuss the atomistic classes redesign. (Maybe I should finally make the repo, but then I will go into the "must update cookiecutter" rabbithole and there is science to be done.

I agree that provenance-purism-and-zealotry is to be avoided. In this case, I think the required functionality required a function (since the implementation is not a trivial one-liner) and at that point we thought we might as well just slap the calcfunction decorator on it. I don't think this case in particular is putting people of AiiDA ^^

Oh, of course, the create_points_from_distance calculation function is probably the one I've run the most, so it is very dear to my heart. I'm quite the purist myself to be honest, except when there is a very good reason not to (e.g. significant improvement in user-friendliness, e.g. the computers and code conundrum).

[^1]: Or having features for cleaning up the database of "lone" nodes. I have so many lonely nodes in my database.

mbercx avatar Apr 22 '23 14:04 mbercx

structure = StructureData(ase=ase_structure)  # <-- This should also be StructureData.from_ase, but that's a fight for another day
kpoints = KpointsData.from_structure(structure=structure, density=0.1)

I think that the solution that you propose @mbercx is ideal, as having the provenance with the density and structure directly when you are creating the kpoints would be a much nicer solution.

Sure having it in the constructor is not the best, but with a nice method like that I think that would solve a lot of the issues when generating kpoints.

JPchico avatar Apr 26 '23 14:04 JPchico