QCElemental

better "extras" passing in Molecule for EFP

Open loriab opened this issue 6 years ago • 5 comments

  • [x] Passing enable_qm=True isn't changing anything: molparse already parses the QM aspects of the molecule string (enable_efp=False by default).
  • [x] Setting missing_enabled_return_qm='minimal' changes the behavior from the default ('error') to 'minimal'. This can't do much at present, as other checks prevent an empty QM models.Molecule in the EFP-only case. The option (values 'minimal', 'none', 'error') is described as:
        What to do when an enabled domain is of zero-length? Respectively, return
        a fully valid but empty molrec, return empty dictionary, or throw error.
  • [ ] Q: can we have a blank QM mol without throwing all QCA into consternation? The structure I'm working with for pure-efp is:
{
    'symbols': [],
    'geometry': [],
    'extras': {
        'efp_molecule': {
            'symbols': [...],  # filled in by pylibefp
            'geometry': [...],  # filled in by pylibefp
            'extras': {
                 'fragment_files': [...],  # filled in by user
                 'hint_types': [...],  # "
                 'geom_hints': [...],  # "
            }
        }
    }
}
  • [ ] Q: the "# EFP extras" bit helps transmit data and works fine for now. But it brings up an issue with https://github.com/MolSSI/QCElemental/blob/master/qcelemental/models/molecule.py#L241: if both kwargs and schema contain non-conflicting leaves in 'extras', one set is going to get dropped, probably to someone's surprise. We may want to consider replacing it with a recursive update, like update_with_error without the error.
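
To make the 'extras' concern concrete, here is a minimal sketch of the difference between a shallow update and a recursive one. The helper name merge_extras and the sample dictionaries are hypothetical, made up for illustration; this is not the QCElemental code at the link above, and it does not reproduce update_with_error. It only assumes that 'extras' is a plain nested dict.

```python
from copy import deepcopy


def merge_extras(base, update):
    """Recursively merge `update` into a copy of `base` (hypothetical helper).

    Nested dicts are merged key by key. On a conflicting leaf, the value
    from `update` wins silently instead of raising an error.
    """
    merged = deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_extras(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Two sources of 'extras' with non-conflicting leaves under the same branch.
schema_extras = {"efp_molecule": {"extras": {"fragment_files": ["h2o", "nh3"]}}}
kwarg_extras = {"efp_molecule": {"extras": {"hint_types": ["xyzabc", "xyzabc"]}}}

# Shallow update: the whole 'efp_molecule' branch is replaced, so
# 'fragment_files' is lost.
shallow = {**schema_extras, **kwarg_extras}

# Recursive update: both leaves survive.
deep = merge_extras(schema_extras, kwarg_extras)

print(sorted(shallow["efp_molecule"]["extras"]))
print(sorted(deep["efp_molecule"]["extras"]))
```

Whether a conflicting leaf should silently take the newer value, as in this sketch, or raise is a separate design choice.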

loriab avatar Sep 03 '19 04:09 loriab

Codecov Report

Merging #124 into master will decrease coverage by 0.02%. The diff coverage is 85.71%.

codecov[bot] avatar Sep 03 '19 05:09 codecov[bot]

What do symbols and geometry look like for EFP? Can we fake it for a normal molecule?

A blank molecule should be doable, I think we only need to change the geometry validation.

dgasmith avatar Sep 03 '19 09:09 dgasmith

whole thing would look something like the below.

  • outer QM mol that's empty,
  • inner EFP mol at extras['efp_molecule']. initially this is blank except for extras. then the "viz" stuff gets filled in by pylibefp
  • inner EFP mol hint at extras['efp_molecule']['extras'] with the fragment files/hint types/etc from molecule string parsing.
 'molecule': {'atom_labels': [],
              'atomic_numbers': [],
              'comment': None,
              'connectivity': None,
              'extras': {'efp_molecule': {'atom_labels': ['_a01o1', '_a02h2', '_a03h3', '_a01n1', '_a02h2', '_a03h3',
                                                          '_a04h4'],
                                          'atomic_numbers': [8, 1, 1, 7, 1, 1, 1],
                                          'extras': {
                                              'fragment_files': ['h2o', 'nh3'],
                                              'geom_hints': [[0.0, 0.0, 0.0, 0.9999999999999999, 2.0, 3.0],
                                                             [9.448630627289141, 0.0, 0.0,  -1.2831853071795865, 2.0, 1.7168146928204135]],
                                              'hint_types': ['xyzabc', 'xyzabc']},
                                          'fix_com': True,
                                          'fix_orientation': True,
                                          'fragment_charges': [0.0, 0.0],
                                          'fragment_multiplicities': [1, 1],
                                          'fragments': [[0, 1, 2], [3, 4, 5, 6]],
                                          'geometry': [-0.0503736105282197, 0.012369110219231472, -0.10722207022473358,
                                                       1.0902316861775359, 1.1318285479113874, 0.6683354255568168,
                                                       -0.2907661375796682, -1.3281352526203078, 1.0333562001803236,
                                                       9.552730766438794, 0.030794165669223276, 0.049682979512153634,
                                                       9.691237188655489, -1.638423923013062, -0.8215271575362131,
                                                       9.011093357817636, 1.3130720013294976, -1.2258061929618227,
                                                       8.197157987609838, -0.10251289104030502, 1.3570206358616195],
                                          'mass_numbers': [-1, -1, -1, -1, -1, -1, -1],
                                          'masses': [15.99491, 1.007825, 1.007825, 14.00307, 1.007825, 1.007825,
                                                     1.007825],
                                          'molecular_charge': 0.0,
                                          'molecular_multiplicity': 1,
                                          'name': 'H5NO',
                                          'provenance': {'creator': 'PylibEFP',
                                                         'routine': 'to_dict',
                                                         'version': '0.6.dev5'},
                                          'real': [True, True, True, True, True, True, True],
                                          'schema_name': 'qcschema_molecule',
                                          'schema_version': 2,
                                          'symbols': ['O', 'H', 'H', 'N', 'H', 'H', 'H'],
                                          'validated': True}},
              'fix_com': True,
              'fix_orientation': True,
              'fix_symmetry': 'c1',
              'fragment_charges': [0.0],
              'fragment_multiplicities': [1],
              'fragments': [[]],
              'geometry': [],
              'id': None,
              'identifiers': None,
              'mass_numbers': [],
              'masses': [],
              'molecular_charge': 0.0,
              'molecular_multiplicity': 1,
              'name': '',
              'provenance': {'creator': 'QCElemental',
                             'routine': 'qcelemental.molparse.from_string',
                             'version': 'v0.7.0+1.g35bbd94.dirty'},
              'real': [],
              'schema_name': 'qcschema_molecule',
              'schema_version': 2,
              'symbols': [],
              'validated': True},
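
As a rough illustration of how a consumer might read this layout, the sketch below groups atoms by fragment for both the empty outer QM record and the inner EFP record. The helper atoms_by_fragment is hypothetical, the record is trimmed to the fields it needs, and the coordinates are rounded from the dump above. It assumes 'geometry' is a flat list of 3 × natom values, as shown there.

```python
def atoms_by_fragment(mol):
    """Group (symbol, (x, y, z)) pairs by fragment (hypothetical helper).

    Assumes mol['geometry'] is a flat list of length 3 * len(mol['symbols']).
    """
    symbols = mol["symbols"]
    flat = mol["geometry"]
    coords = [tuple(flat[3 * i:3 * i + 3]) for i in range(len(symbols))]
    return [[(symbols[i], coords[i]) for i in frag] for frag in mol["fragments"]]


# Trimmed-down version of the record above: empty outer QM molecule,
# EFP molecule nested under extras['efp_molecule'].
record = {
    "symbols": [],
    "geometry": [],
    "fragments": [[]],
    "extras": {
        "efp_molecule": {
            "symbols": ["O", "H", "H", "N", "H", "H", "H"],
            "geometry": [
                -0.050, 0.012, -0.107,
                1.090, 1.132, 0.668,
                -0.291, -1.328, 1.033,
                9.553, 0.031, 0.050,
                9.691, -1.638, -0.822,
                9.011, 1.313, -1.226,
                8.197, -0.103, 1.357,
            ],
            "fragments": [[0, 1, 2], [3, 4, 5, 6]],
            "extras": {"fragment_files": ["h2o", "nh3"]},
        }
    },
}

efp = record["extras"]["efp_molecule"]

# The outer QM molecule has one fragment with no atoms in it.
qm_frags = atoms_by_fragment(record)

# Pair each EFP fragment with the fragment file named during string parsing.
labelled = dict(zip(efp["extras"]["fragment_files"], atoms_by_fragment(efp)))

for name, atoms in labelled.items():
    print(name, [symbol for symbol, _ in atoms])
```

The same helper works on the empty outer molecule, so code that loops over fragments would not need a special case for the pure-EFP situation.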

loriab avatar Sep 03 '19 13:09 loriab

Looks pretty close to a QM molecule. Can we pass it in to the canonical constructor with validation off?

dgasmith avatar Sep 03 '19 14:09 dgasmith

Actually, I think latest commit will do the trick wrt empty QM mol. I had put the "minimal" in the wrong place before and hadn't revisited until you ok'd the empty Mol.

loriab avatar Sep 03 '19 14:09 loriab