sisl
sisl copied to clipboard
Change neighborfinder returns to a tuple of indices and supercell indices
Describe the feature I think this would be more natural.
instead of [io, jo, a, b, c]
, it should return [io, jo], [a, b, c]
Hmm but wouldn't it be ambiguous, because the neighbor of ia
is not fully defined by ja
? You need the supercell indices to have the connection fully described.
Unless of course there are no periodic boundary conditions.
But then, probably jo
should just be in a supercell index?
The way I see it the finder should just find. If someone later wants to convert the indices according to a given auxiliary cell and a given ordering convention they can do it. But in my opinion it should not be the responsability of the finder.
Supercell indices are mostly worthless without an auxiliary cell size and an ordering convention, so I don't like returning them as the "raw" return. E.g. if someone wants to just use the finder (not the rest of sisl), they will likely prefer supercell shifts.
In my opinion it would be better to keep the raw return and maybe implement wrappers that convert to supercell indices. E.g. if the finder machinery is used in Geometry.close
.
Agreed, so it should return jo + no * index_of_supercell
, no?
I don't know what you agreed to, but to me it doesn't look like we agree :sweat_smile:
I don't know what you agreed to, but to me it doesn't look like we agree 😅
geom.close(ia)
by default returns ja
which is a supercell index. That was my understanding that you agreed to?
Ah ok, yes on Geometry.close
I think keeping the current behavior is the way to go!
But I also think it should be here as well, why not?
From the supercell coupling it is very easy to convert to orbital couplings, and even getting the isc from a single number would be trivial.
So neighbor lists should return supercell indices.
And what should the finder do, return nsc
as well? If the neighbors are requested only for a given atom, how can the finder know nsc
? nsc
in SIESTA is not based just on orbital couplings, you also have the projectors. I think is not the finder's job to build an auxiliary cell.
If someone uses just the finder because is efficient (but not sisl.Geometry
, sisl.Hamiltonian
, etc ...), they will surely want the supercell shifts, not the atomic supercell indices with sisl's convention, which they probably will not understand.
well, the entirety of sisl
is based on the nsc
concept, and it may be used to gauge the range of couplings.
So I don't necessarily agree, while I can see both arguments. Converting from supercell indices to isc is quite trivial:
isc = geom.a2isc(jo)
done...
I think this is more of how should we encourage sisl to be used. The finder
is used on a Geometry
and this should hold the nsc
information, so it is intrinsically known for the geometry.
Just as close(ret_isc)
, we could have that in the finder
, but I think generally we should work in supercell indices.
So nsc
should be determined before using the finder? And the finder should not return neighbors outside that auxiliary cell?
I think both things can coexist. There can be a "raw" interface and a wrapped interface for Geometry
. But I think that supercell indices can be obscure in some cases, so I like that the finder raw returns are as transparent as possible. Some things that I don't like of being forced to get the supercell indices:
- The natural thing that you get when finding neighbors is the supercell shifts. So if that is what you want you would be converting atoms to supercell indices to then recalculate the shifts, which you had for free in the first place.
- If you hold an array with neighbors and the auxiliary cell changes, suddenly your neighbors are wrong. So the array of neighbors should have an
nsc
associated somehow to check compatibility in some cases where you can't be sure that the geometry's auxiliary cell has not changed. - The auxiliary cell is not a property uniquely defined by the
Geometry
. In SIESTA it depends on the KB projectors, for example. It is hard to keep track sometimes. E.g. if you are using a geometry read by SIESTA and a sisl created one at the same time. - A user that is new to sisl will be confused by the returns of the finder. Maybe a not so new user too, since this is a concept that is not the first thing you learn.
Probably most uses of the finder will be through the already present interfaces. The only thing that people will notice is that it is faster. If somebody wants to use the finder directly, they can access the raw data directly. I would say it would be more like a low level interface. We could also allow the finder to return both things depending on what the user wants.
I can partially see where you are coming from, but then you suggest an underlying change of how sisl shows interactions.
I am not against this, but I think it should be returned consistently then.
-
I would suggest we create an
Interaction
object that holds the information:- the parent it originated from
- the
ia, ja
couplings - the corresponding
isc
arrays (same length asia, ja
couplings) - the ability to turn off the
geometry.nsc
reduction. I would see the manualnsc
specification for ageometry
as a way to quickly turn of the couplings, it is a trick, but I think it is quite valuable.nsc
could be defaulted to be unspecified which would allow the finder to create it, but, it must be defined before a sparse matrix is created.
The other good thing about the object, is that we can always add more data, without it disrupting codes.
-
geometry.close
should be changed to do this, ideally it can behave like a namedtuple of some sorts. Or, it depends. -
We should be careful in how this hurts performance, is this really overkill? As you say you can easily loose track of when a neighbor is consistent with the current Hamiltonian.
-
Since this is a breaking change we should consider whether we would allow using the old interface for some time, it is so ingrained in many tutorials that it might be difficult to force a transitioning.
I like the idea that the finder returns this class!
But do you think that Geometry.close
needs to be changed? This can be something that the finder returns and then internally in Geometry.close
it is converted to keep the API the same as it is now.
But isn't your argument that close
returns also in supercell indices, which is a bit counter-intuitive?
I think it would be nice if we could streamline this, no?
Hmm but changing this in close
would be chaos for everyone, no? :sweat_smile:
Also I think there is fine, because it's something more tied to the Geometry
class. I see the neighbour finder being used in situations where the user is just looking for the most efficient way to get all neighbors . E.g. to build graphs for ML.
You are probably right. Then I think we should split the return values?
It is more clearer that isc[:, 0]
refers to offsets along the first lattice vector.
I liked the custom class thing, because there the user explicitly gets what they want and there is no confusion, while you can still easily get the neighbors in supercell indices if that's what you want. Also we can further extend it with other information like distances, etc.
Do you now prefer the tuple ([ia, ja], [isc_0, isc_1, isc_2])
?
Custom class is fine for me ;)
Cool, I think it's the best approach seeing that there might be different users interested in different things. And also it makes sure that the user knows what they are using.
I guess it could also be an xarray.Dataset
(with variables like ia
, ja
, isc
, ...), but I don't know if the world is ready for that :sweat_smile:
Cool, I think it's the best approach seeing that there might be different users interested in different things. And also it makes sure that the user knows what they are using.
I guess it could also be an
xarray.Dataset
(with variables likeia
,ja
,isc
, ...), but I don't know if the world is ready for that 😅
No, I think not ;) Lets stick with a simple object :)
What do we do with the find_neighbors
method? That one returns a list where each item contains the neighbors of a given atom.
Should each item in the list be a custom object? Or should that distinction (find_neighbors
vs find_pairs
) disappear and the custom object should have a method to iterate over atoms?