QCSchema icon indicating copy to clipboard operation
QCSchema copied to clipboard

QCSchema with PBC?

Open awvwgk opened this issue 4 years ago • 22 comments

I looked at the specifications of the QCSchema at https://molssi-qc-schema.readthedocs.io/en/latest/auto_topology.html and noticed that I cannot pass lattice parameters or the systems periodicity with the topology.

Are you planning to add periodic boundary conditions to QCSchema?

awvwgk avatar Sep 17 '19 09:09 awvwgk

This has always seemed a bit out of scope to me. There was some strong push for it originally, but it seems that Materials Project, AIDIIA, Nomad, ASE, Citrine, etc cover this space fairly well.

I wouldn't rule it out entirely, but I think there would need to be some strong use cases for it.

dgasmith avatar Sep 17 '19 13:09 dgasmith

Yes, there are competing standards out there and one has to support more than one in the end. So it is important to know if I can (potentially) use QCSchema in the programs I am developing, which necessarily requires having PBC.

awvwgk avatar Sep 18 '19 11:09 awvwgk

@loriab @wadejong Any thoughts here?

dgasmith avatar Sep 18 '19 13:09 dgasmith

Relevant for molecular dynamics and materials science. But, it’s becoming also relevant in the self assembly research of mixed organic-inorganic molecular assemblies. I would say it is and should be in scope.

On Sep 18, 2019, at 4:17 AM, Sebastian Ehlert [email protected] wrote:

Yes, there are competing standards out there and one has to support more than one in the end. So it is important to know if I can (potentially) use QCSchema in the programs I am developing, which necessarily requires having PBC.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

wadejong avatar Sep 18 '19 13:09 wadejong

@mkhorton does MP have schema that we can borrow for this?

dgasmith avatar Sep 18 '19 13:09 dgasmith

I think it would be good to have at least PBC / unit vectors.

ghutchis avatar Sep 18 '19 14:09 ghutchis

So do I, we added them in Chemical JSON quite some time ago. We have never mixed orbitals and periodic boundary conditions together.

cryos avatar Sep 18 '19 14:09 cryos

@cryos Could you post that schema here or go ahead and make a PR?

dgasmith avatar Sep 18 '19 16:09 dgasmith

I found chemical JSON is using a representation with cell parameters.

"unit cell": {"a": 9.1, "b": 9.1, "c": 30.5, "alpha": 90, "beta": 90, "gamma": 120}

This format allows to easily specify 1D and 2D periodic systems and possibly spacegroup symmetry.

A band structure might be a bit more complicated and could be out-of-scope, but the stress tensor would be nice to have in the output.

awvwgk avatar Sep 30 '19 08:09 awvwgk

I think there's also a cell vector representation in chemical JSON. In my opinion, that's much better, since the lengths and angles are easy to derive from the latter, but the cell vectors contain more information. (Consider that a 3x3 matrix of cell vectors contains 9 parameters, but the a/b/c alpha/beta/gamma only contains 6.)

ghutchis avatar Sep 30 '19 15:09 ghutchis

Had to go check the code, the syntax should be:

"cellVectors": { 10.5, 0.0, 0.0, 0.0, 10.5, 0.0, 0.0, 0.0, 10.5 }

In other words, the keyword cellVectors and then 9 distances in Angstrom. Avogadro v2 master writes and both, but prefers the cellVectors if present.

I can certainly change the code to handle 1D and 2D cases. I'd probably just hack it at the moment by setting the other dimensions to be 10^6 or something similar.

ghutchis avatar Sep 30 '19 15:09 ghutchis

In the Materials Project JSON we use:

'lattice': {'matrix': [[2.32547694, 0.0, -0.82218026],
   [-1.16273847, 2.01392211, -0.82218026],
   [0.0, 0.0, 2.46654077]],
  'a': 2.4665407716892562,
  'b': 2.4665407749920663,
  'c': 2.46654077,
  'alpha': 109.47122067561966,
  'beta': 109.47122070274486,
  'gamma': 109.47122053911131,
  'volume': 11.55162296798055},

The canonical information is the matrix since this unambiguously includes an orientation also, the additional information (a, b, c, alpha, beta, gamma, volume) are solely for human readability.

mkhorton avatar Sep 30 '19 19:09 mkhorton

One additional factor to consider with PBC is whether you want to specific PBC along some directions only. Mixed boundary conditions present challenges, and we do not support them, but some tools (LAMMPS and ASE) do use them.

mkhorton avatar Sep 30 '19 19:09 mkhorton

Finally, I think I mentioned this previously but it's an important point, if your schema can include information on chemical bonds (or any other kind of connectivity/"edge" information between atomic sites), then you have to think about how this information is stored with periodic boundaries also. For example, if you have a bond from your original unit cell bonded to an atom in the next periodic image cell over.

mkhorton avatar Sep 30 '19 19:09 mkhorton

I'm bumping this issue, since I'm still interested in having PBCs in QCSchema.

At the ASE developer workshop we shortly discussed PBCs and topology and they seems to be no problem with having both. Here is the merge request from their gitlab for reference: https://gitlab.com/ase/ase/merge_requests/1321

awvwgk avatar Jan 24 '20 18:01 awvwgk

There was only an issue for connecting an atom with its own image, of course. But this is a pretty special case for very small unit cells.

awvwgk avatar Jan 24 '20 18:01 awvwgk

But this is a pretty special case for very small unit cells.

Yes, agreed -- and a very important case if you deal with a lot of small unit cells. In pymatgen we've ended up storing the destination lattice vector (e.g. (1, 0, 0) for the next periodic image in the +x direction) along with the atom pair defining the bond in our serialization format. Regardless of implementation though, there's definitely a need for it.

mkhorton avatar Jan 24 '20 18:01 mkhorton

Can anyone make a PR here to put these discussions into practice?

dgasmith avatar Jan 24 '20 18:01 dgasmith

Sorry for reviving this old thread. How would we implement this in QCSchema, I'm a bit unsure about the actual workflow here.

As far as I understood this repo is output from QCElemental rather than a standalone project which is implemented in QCElemental, right? Implementing this in QCElemental would therefore be the first step?

awvwgk avatar Jan 20 '22 22:01 awvwgk

@awvwgk - many of us in the community pushed very strongly that this was the opposite. Discussion should happen here. If an implementation happens in QCElemental, it should be later.

(There are many of us following this repo and most of the discussion threads have been here.)

ghutchis avatar Jan 21 '22 00:01 ghutchis

I think I'd favor something similar to @mkhorton and the Materials Project schema. The lattice matrix is the most important and unambiguous if available, with a, b, c, alpha, beta, gamma being useful for humans or a quick glance.

ghutchis avatar Jan 21 '22 00:01 ghutchis

As far as I understood this repo is output from QCElemental rather than a standalone project which is implemented in QCElemental, right? Implementing this in QCElemental would therefore be the first step?

We are waiting on https://github.com/MolSSI/QCSchema/pull/77 and related things marked there, if the final decision is to generate from QCElemental.

berquist avatar Jan 21 '22 03:01 berquist