Specify molecule names in the input file

Open Luthaf opened this issue 7 years ago • 7 comments

This is a user interface improvement proposed by @g-bauer.

The idea is to specify molecules in a section of the input, and then use the molecules by names while specifying moves.

The current input looks like this:

[[systems]]
file = "initial.pdb"

[[simulations]]
nsteps = 1000000
[simulations.propagator]
type = "MonteCarlo"
temperature = "500 K"
moves = [
    {type = "Translate", delta = "1 A", molecule = "CO2.xyz"},
    {type = "Rotate", delta = "20 deg", molecule = "CO2.xyz"},
    {type = "Translate", delta = "2 A", molecule = "water.xyz"},
    {type = "Rotate", delta = "10 deg", molecule = "water.xyz"},
    {type = "Resize", pressure = "5.00 bar"},
]

The idea is to add a 'molecules' section:

[molecules]
# Specify molecules from a file
water = {file = "water.xyz"}
# Specify molecules by hand inline
h2o = {atoms = ["O", "H", "H"], bonds = [[0, 1], [0, 2]]}

# Specify molecules by hand in a sub-table
[molecules.CO2]
atoms = ["C", "O", "O"]
bonds = [[0, 1], [0, 2]]

[[systems]]
file = "initial.pdb"

[[simulations]]
nsteps = 1000000
[simulations.propagator]
type = "MonteCarlo"
temperature = "500 K"
moves = [
    {type = "Translate", delta = "1 A", molecule = "CO2"},
    {type = "Rotate", delta = "20 deg", molecule = "CO2"},
    {type = "Translate", delta = "2 A", molecule = "water"},
    {type = "Rotate", delta = "10 deg", molecule = "h2o"},
    {type = "Resize", pressure = "5.00 bar"},
]
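
Once moves refer to molecules by name, the input reader has to check that every name is actually defined. Here is a minimal sketch of that check in Python. It is not lumol's implementation: the `config` dict only mirrors what a TOML parser might return for the input above, and `undefined_molecules` is a hypothetical helper.

```python
# Hypothetical sketch: `config` mirrors what a TOML parser might return for
# the proposed input. None of these names are part of lumol's actual API.
config = {
    "molecules": {
        "water": {"file": "water.xyz"},
        "h2o": {"atoms": ["O", "H", "H"], "bonds": [[0, 1], [0, 2]]},
        "CO2": {"atoms": ["C", "O", "O"], "bonds": [[0, 1], [0, 2]]},
    },
    "moves": [
        {"type": "Translate", "delta": "1 A", "molecule": "CO2"},
        {"type": "Rotate", "delta": "10 deg", "molecule": "h2o"},
        {"type": "Resize", "pressure": "5.00 bar"},
    ],
}


def undefined_molecules(config):
    """Return the molecule names used by moves but missing from [molecules]."""
    defined = set(config.get("molecules", {}))
    # Moves such as "Resize" carry no molecule and are skipped.
    used = {m["molecule"] for m in config.get("moves", []) if "molecule" in m}
    return sorted(used - defined)
```

An empty result means every move refers to a declared molecule. Anything else could be reported as an input error before the simulation starts.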

Proposed syntax

We only need the molecule type for most Monte-Carlo moves, which is derived from the atom names and the bonding information. We can get it the same way as today, from a file:

[molecules]
water = {file = "water.xyz"}

Or explicitly specify it:

[molecules.water]
atoms = ["O", "H", "H"]
bonds = [[0, 1], [0, 2]]

For GCMC moves, we also need the atomic positions:

[molecules.co2]
atoms = ["C", "O", "O"]
bonds = [[0, 1], [0, 2]]
positions = [
    [0, 0, 0],
    [1, 0, 0],
    [-1, 0, 0],
]

Luthaf avatar Apr 04 '17 12:04 Luthaf

I really like your proposal.

For GCMC moves, we also need the atomic positions

Actually, we don't need the positions. We can compute them based on the insertion scheme, i.e. using configurational bias with the intramolecular potentials.

g-bauer avatar Apr 25 '17 15:04 g-bauer

Yep, that is true! We will need support for rigid molecules in CBMC too then. Do you know if this is possible?

Luthaf avatar Apr 25 '17 15:04 Luthaf

Yes it's quite easy to implement. I'd add fixed bonds/angles as soon as we decide on how to communicate constraints.

g-bauer avatar Apr 25 '17 15:04 g-bauer

Would you also use this declaration of molecules to create or check bonds/angles/dihedrals from the positions in the start configuration?

g-bauer avatar Apr 25 '17 15:04 g-bauer

Creating bonds will be hard: how could we know which atoms are in the same molecule if we don't have the bonds between them?

Checking bonds might be hard too: how do we know that two molecules are equivalent, and that they should have the same bonds? Either they have the same bonds, or we need to check for graph equivalence.

Actually, the way I thought of this was to assign molecule types by checking if the definition in the input file and in the initial configuration did match.

On the 'check' side, we could have a flag to ensure that all the molecules in the initial configuration are defined in the input file.
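
To make the graph equivalence idea concrete, here is a rough Python sketch of a brute-force comparison between a molecule defined in the input and one found in the initial configuration. The function name and argument layout are assumptions for illustration only, and trying every permutation is only reasonable for small molecules.

```python
from itertools import permutations


def same_molecule_type(atoms_a, bonds_a, atoms_b, bonds_b):
    """Brute-force check that two small molecules are the same labelled graph."""
    if sorted(atoms_a) != sorted(atoms_b) or len(bonds_a) != len(bonds_b):
        return False
    edges_b = {frozenset(bond) for bond in bonds_b}
    for perm in permutations(range(len(atoms_a))):
        # perm[i] is the index in B that atom i of A is mapped to.
        if any(atoms_a[i] != atoms_b[p] for i, p in enumerate(perm)):
            continue
        # Relabel the bonds of A and compare them with the bonds of B.
        mapped = {frozenset((perm[i], perm[j])) for i, j in bonds_a}
        if mapped == edges_b:
            return True
    return False
```

With this, water written as `["O", "H", "H"]` in the input would match a water molecule stored as `["H", "O", "H"]` in the configuration, provided the bonds agree after relabelling.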

Luthaf avatar Apr 25 '17 15:04 Luthaf

Apologies, I did not express myself very well here.

Actually, the way I thought of this was to assign molecule types by checking if the definition in the input file and in the initial configuration did match.

Ok. I thought we'd supply angles and dihedrals here as well. If we demand that particles of a molecule are specified successively within the initial configuration, we could build molecules using this information. Well, kind of: there could still be ambiguities.
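
As a rough illustration of what I mean, here is a Python sketch that walks through the atom names of the initial configuration and matches consecutive atoms against the declared molecules. Everything in it is hypothetical (the function name, the `templates` mapping), and it simply refuses to guess when zero or several templates match.

```python
def split_into_molecules(atom_names, templates):
    """Greedily match consecutive atoms against named templates.

    `templates` maps a molecule name to its ordered list of atom names.
    Returns a list of (molecule name, index of first atom) pairs.
    """
    result = []
    start = 0
    while start < len(atom_names):
        matches = [
            name for name, atoms in templates.items()
            if atom_names[start:start + len(atoms)] == atoms
        ]
        if len(matches) != 1:
            # No match or an ambiguous match: refuse to guess.
            raise ValueError(f"{len(matches)} templates match at atom {start}")
        result.append((matches[0], start))
        start += len(templates[matches[0]])
    return result
```

Two templates with the same atom sequence (for example `water` and `h2o` above) already trigger the ambiguous case.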

g-bauer avatar Apr 25 '17 15:04 g-bauer

Ok. I thought we'd supply angles and dihedrals here as well.

We don't need to, as we can determine them from the list of bonds only.
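
For illustration, here is a small Python sketch of how angles and dihedrals can be enumerated from a bond list alone. It is not the code lumol uses, and the function name is made up for this example.

```python
from collections import defaultdict


def angles_and_dihedrals(bonds):
    """Enumerate angles (i, j, k) and dihedrals (i, j, k, l) from bonds."""
    neighbors = defaultdict(set)
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    # An angle is a pair of distinct neighbors around a central atom j.
    angles = set()
    for j, around in neighbors.items():
        for i in around:
            for k in around:
                if i < k:
                    angles.add((i, j, k))

    # A dihedral extends a central bond j-k by one neighbor on each side.
    dihedrals = set()
    for j, k in bonds:
        for i in neighbors[j] - {k}:
            for l in neighbors[k] - {j, i}:
                # Keep a single orientation of each dihedral.
                dihedrals.add(min((i, j, k, l), (l, k, j, i)))
    return sorted(angles), sorted(dihedrals)
```

For the water definition above, `bonds = [[0, 1], [0, 2]]` gives the single angle `(1, 0, 2)` and no dihedral.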

And yes, there would still be ambiguities in the generic case. This is mainly due to using chemfiles as the input library, and accepting any file format. I feel like this is still a nice feature to have, but we could pick a standard format and add specific features to this format. For example, automatic molecule detection!

Luthaf avatar Apr 25 '17 16:04 Luthaf