netket Gp/continuous geometry

First draft to implement the geometry for continuous space. There is continous_geometry.py as base class for specialized ones: Cell, Free. At the moment only these two are available (so no mix of boundary conditions). I commented out all tests in hilbert that test for mixed boundary conditions as well as the Langevin sampler. Examples run fine on my machine.

Sep 20 '23 15:09 gpescia

We should decide where to put the geometry stuff. I'm not really in favour of .graph let's see what @gcarleo thinks about
We should not break old interfaces. So ContinuousHilbert should still accept the old arguments, and also geometry. If geometry is none it should use the other arguments to build a geomtry
This way of implementing the random states is not ok

Before actually implementing this stuff, I think someone should lay out clearly what we want to do with it. For example, do we need the inverse lattice? Why? What is the expected use case of this?

For example, the messy translation code in the sampling makes me suspect that the abstract class should have a method translate_positions_by_vector or something like that...

Sep 20 '23 15:09 PhilipVinc

We should not break old interfaces. So ContinuousHilbert should still accept the old arguments, and also geometry. If geometry is none it should use the other arguments to build a geomtry

yes true, I‘ll add the other as well.

This way of implementing the random states is not ok

In the end I guess we want to dispatch on cell/free.

For example, do we need the inverse lattice? Why? What is the expected use case of this?

yes to go from lattice basis to standard basis and back.

Sep 20 '23 15:09 gpescia

yes to go from lattice basis to standard basis and back.

In what part of the code is this needed? If this is not used in the code, can you give an example of what it can be used for and how it would be used?

Sep 20 '23 15:09 PhilipVinc

Yes for the position of the module, it's a secondary problem, but I also wouldn't put it in Graph. Anyways, there is time to think about that...

In my view the Geometry object should have a minimal interface to start with allowing to:

Compute distances between any two points R1,R2 that are valid for the given Geometry. This method is just distance and should know nothing about periodicity etc explicitly (i.e. you don't pass pbc=True etc to distance) it is the essence of the Geometry to know how the distance should be computed (e.g. with minimum image etc)
Generate a valid random point R
Add/subtract vectors in this space (say implement R_1 + R_2)? (not sure about this point but maybe it's handy for sampling)

Sep 20 '23 15:09 gcarleo

In the sampler and the hilbert space random state. We sample fractional coordinates, for this we need to go from standard to lattice basis back to standard

Sep 20 '23 15:09 gpescia

In the sampler and the hilbert space random state. We sample fractional coordinates, for this we need to go from standard to lattice basis back to standard

but that's why the random points should be generated in the Geometry, not in Hilbert, see point 2 of the list I made above. I think this allows to simplify most of the code used for sampling

Sep 20 '23 15:09 gcarleo

Well you need to tell the function whether you want periodic in the cell or minimum image convention. For this you need the ‚mode’ I would say. In the free case you don‘t need it.
The only difference is the placement of the random state right? One still goes nack and forth between the two bases.

Sep 20 '23 15:09 gpescia

Yes but you specify periodic or minimum image when you construct Cell, not when you call Distance, in this way you completely separate the sampling and Hilbert from these details of the geometry
It's important to disentangle everything that relates to Geometry from Hilbert and the Sampling, so that you have a general code for the last two that can work for any Geometry

Sep 20 '23 15:09 gcarleo

Well you need to tell the function whether you want periodic in the cell or minimum image convention. For this you need the ‚mode’ I would say. In the free case you don‘t need it.

Are there cases where for the same geometry one wants one of those two different distances? Can you give an example? Can we think a case of a default distance to be used mostly everywhere? In that case Geometry objects could expose different distance functions, and a default one to be used for interaction potentials or I don't know what else.

Sep 20 '23 15:09 PhilipVinc

I think there is no reasonable case in which you want to use two different distances within the same simulation Cell, that's why I think the Cell type should be used to identify uniquely the type of distance used as well

Sep 20 '23 15:09 gcarleo

Well potential energies are computed always with minimum image, and distances for the wf never. So you need both…

Sep 20 '23 15:09 gpescia

What do you mean by distances for the wf ?

Sep 20 '23 15:09 PhilipVinc

Codecov Report

Attention: 60 lines in your changes are missing coverage. Please review.

Comparison is base (ee98fef) 83.53% compared to head (a88a945) 84.22%. Report is 7 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1583      +/-   ##
==========================================
+ Coverage   83.53%   84.22%   +0.69%     
==========================================
  Files         257      261       +4     
  Lines       14519    15155     +636     
  Branches     2203     2232      +29     
==========================================
+ Hits        12129    12765     +636     
- Misses       1841     1852      +11     
+ Partials      549      538      -11

Files	Coverage Δ
netket/experimental/__init__.py	`100.00% <100.00%> (ø)`
netket/experimental/geometry/__init__.py	`100.00% <100.00%> (ø)`
netket/hilbert/continuous_hilbert.py	`100.00% <100.00%> (+22.22%)`	:arrow_up:
netket/hilbert/random/particle.py	`84.21% <100.00%> (-15.79%)`	:arrow_down:
netket/models/deepset.py	`94.44% <100.00%> (+1.11%)`	:arrow_up:
netket/operator/_kinetic.py	`94.82% <100.00%> (ø)`
netket/sampler/rules/continuous_gaussian.py	`89.47% <100.00%> (-1.44%)`	:arrow_down:
...etket/experimental/geometry/continuous_geometry.py	`83.33% <83.33%> (ø)`
netket/hilbert/particle.py	`87.03% <78.78%> (-10.47%)`	:arrow_down:
netket/experimental/geometry/Free.py	`53.65% <53.65%> (ø)`
... and 1 more

... and 76 files with indirect coverage changes

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

Sep 20 '23 15:09 codecov[bot]

It's important to disentangle everything that relates to Geometry from Hilbert and the Sampling, so that you have a general code for the last two that can work for any Geometry

But random state is actually a state in hilbert not in the geometry, right?

Sep 20 '23 15:09 gpescia

distance features used in the wavefunction

Sep 20 '23 15:09 gpescia

sorry Gabriel but is it truly necessary to use two different distances? I would say one should be enough / is much nicer than using two different distances for the same space

Sep 20 '23 16:09 gcarleo

in any case, maybe it's more efficient to talk on zoom tomorrow or so

Sep 20 '23 16:09 gcarleo

Sure we can have a meeting! I‘m just saying that there are two distance measures which both get used. I‘m not sure how to optimally resolve that..

Sep 20 '23 16:09 gpescia

I would love to understand why you are using more than one distance. If I understand it's because in the nn/variational ansatz you want to use the standard distance instead of the minimum image. But physically this, I think, is wrong, because the minimum image is the physical one in a periodic system. So what is the underlying reason of this? Performance? It works? Or maybe I fail to understand?

Sep 20 '23 16:09 PhilipVinc

The reason is smootheness. The minimum image convention reproduces Euclidean distance as you say but it is not smooth and cannot be used as input to the wavefunction as we take derivatives w.r.t. position. The other distance (sin-distance) behaves similar as the minimum image distance but is smooth. So for potential energies (no derivatives involved) we want the "correct" euclidean distance. For the wavefunction we take the less correct but smooth version.

Sep 20 '23 16:09 gpescia

Thanks! This was the explanation I was looking for. I think that to maintain flexibility we could should just implement both as two different functions. Then in the models, one uses what he wants. But to be further discussed..

Sep 20 '23 16:09 PhilipVinc

OK, not ideal, but then this is solved maybe specifying what type of distance you want, it's true that even in euclidean space you might want L1,L2... etc

Sep 20 '23 16:09 gcarleo

sure we can do that too. But that is basically the same as setting a 'mode' flag or do you gain performance with this?

Sep 20 '23 16:09 gpescia

Yes I think it's just a flag to be passed: geo.Distance(R1,R2,type='L2') etc

Sep 20 '23 16:09 gcarleo

It's not about performance. It's about having a clean and sane API. I think geometry.distance should always give the physical distance, aka minimum image. In some cases you might want to cheat / use non physical distances in which case those should have different names. mode might hide them a bit. But I'm not completely convinced. Maybe mode is good. Would like to hear opinions.

Sep 20 '23 16:09 PhilipVinc

if you want the main problem is where the type flag is to be used. I think for the ansatz there is no problem since you will have an ansatz specific to periodic systems that will also know about different types of distance that a Cell can accept. Now we have to understand how we make the sampler agnostic somehow of the type of Geometry

@PhilipVinc the physical distance in a sense is not well defined, it might be that it is L1 that makes sense and not L2, depending on the problem (just to give an example). In the same spirit for the periodic system one type might make more sense than the other type of distance...

Sep 20 '23 16:09 gcarleo

Now we have to understand how we make the sampler agnostic somehow of the type of Geometry

The main difficulty from my view is that for periodic systems you need to fold back positions into the box that went outside, which you don't have to do for free systems. So there is inherently a distinction between the update rules.

Sep 20 '23 16:09 gpescia

But do we need the distance in the sampling?

Sep 20 '23 16:09 PhilipVinc

No in the sampling the distance is irrelevant (right now) except for the model that is sampled.

Sep 20 '23 16:09 gpescia

The main difficulty from my view is that for periodic systems you need to fold back positions into the box that went outside, which you don't have to do for free systems. So there is inherently a distinction between the update rules.

That's why I was suggesting adding a geometry.translate function that takes care of that

Sep 20 '23 16:09 PhilipVinc