Document provenance of reference atomic configurations

Open max-veit opened this issue 5 years ago • 10 comments

Spun off from #181 (see here): We have many atomic configurations just called, e.g., methane.xyz and H2O.xyz, and it might be useful to know how the geometry itself was obtained (from a database, or optimized, and if so, with which method?). This can be recorded in the (extended) XYZ file itself by adding a tag like comment="from GDB9, re-optimized with B3LYP/6-31G*" to the comment line.
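For illustration, here is a minimal sketch of how such a tag could be appended to the comment line of each frame in an extended XYZ file. The function name add_provenance_tag and the water coordinates in the usage example are placeholders, not taken from the repository, and whether a given reader preserves the tag is not checked here.

```python
def add_provenance_tag(xyz_text: str, note: str) -> str:
    """Append comment="<note>" to the comment line of every frame.

    Assumes the usual XYZ layout: an atom count, a comment line,
    then one line per atom, repeated for each frame.
    """
    if '"' in note:
        raise ValueError("note must not contain double quotes")
    lines = xyz_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip stray blank lines between frames
            continue
        n_atoms = int(lines[i].strip())  # first line of a frame: atom count
        comment = lines[i + 1].rstrip()  # second line: comment / key=value tags
        out.append(lines[i])
        out.append(f'{comment} comment="{note}"'.strip())
        out.extend(lines[i + 2 : i + 2 + n_atoms])  # atom lines, unchanged
        i += 2 + n_atoms
    return "\n".join(out) + "\n"


# Placeholder geometry, for demonstration only (not a reference structure).
example = (
    "3\n"
    "Properties=species:S:1:pos:R:3\n"
    "O 0.0 0.0 0.0\n"
    "H 0.76 0.59 0.0\n"
    "H -0.76 0.59 0.0\n"
)
tagged = add_provenance_tag(example, "from GDB9, re-optimized with B3LYP/6-31G*")
print(tagged)
```

The atom lines are passed through untouched, so only the comment line of each frame changes.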

max-veit avatar Nov 20 '19 16:11 max-veit

(BTW for json-formatted reference configurations it's even easier to add a comment field. Just add a new entry labelled "comment" and all our readers should ignore it.)
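As a rough sketch of the JSON case: load the file, add a top-level "comment" entry, and write it back. The keys in the example dictionary ("positions", "atom_types") are hypothetical stand-ins rather than the actual schema of the reference files, and the sketch does not verify that every reader really ignores the extra entry.

```python
import json


def add_comment_field(json_text: str, note: str) -> str:
    """Return the JSON document with a top-level "comment" entry added."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object at the top level")
    if "comment" in data:
        raise KeyError('file already has a "comment" entry')
    data["comment"] = note
    return json.dumps(data, indent=2)


# Hypothetical structure file content, for demonstration only.
original = json.dumps({"positions": [[0.0, 0.0, 0.0]], "atom_types": [6]})
annotated = add_comment_field(original, "cell parameters from Wikipedia")
print(annotated)
```

Refusing to overwrite an existing "comment" entry is a design choice here, so that a previously recorded provenance note is not silently lost.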

max-veit avatar Nov 20 '19 16:11 max-veit

I agree with you: the JSON format is perfect for that, and self-documenting files are more convenient.

felixmusil avatar Nov 21 '19 10:11 felixmusil

Should everyone change her/his contributions in #181 or how will we handle this?

mastricker avatar Nov 28 '19 08:11 mastricker

I think it would be best if everyone took a bit of time and identified which data files came from them or whose provenance they know. I'll list the files in reference_data/* here; please comment if you can take responsibility for a file, and then check it off.

inputs:

  • [ ] 1-Propanol.xyz
  • [ ] 2-Propanol.xyz
  • [x] alanine-center-select.json
  • [x] alanine-X-examples.json
  • [x] alanine-X.json
  • [ ] benzene.xyz
  • [ ] CaCrP2O7_mvc-11955_symmetrized.json
  • [x] crystal_structure.json
  • [ ] dft-smiles_500.xyz
  • [x] diamond_2atom_distorted.json
  • [x] diamond_2atom.json
  • [x] diamond_cubic_distorted.json
  • [x] diamond_cubic.json
  • [ ] dummy_structure.json
  • [ ] dummy_structure_wrapped.json
  • [ ] H2O.xyz
  • [ ] mbfs_derivative_test.json
  • [x] methane.json
  • [x] methane.xyz
  • [ ] Methoxyethane.xyz
  • [ ] modified_bessel_first_kind_reference.ubjson
  • [ ] molecular_crystal.json
  • [ ] N3.xyz
  • [x] polyalanine.json
  • [ ] S6.xyz
  • [x] SiCGe_wurtzite_like.json
  • [x] SiC_moissanite.json
  • [x] SiC_moissanite_supercell.json
  • [x] simple_cubic_8.json
  • [x] simple_cubic_9.json
  • [ ] small_molecule.json
  • [ ] small_molecule_no_cell.json
  • [ ] small_molecules-1000.xyz
  • [ ] water_rotations.xyz

tests_only:

  • [ ] cutoff_function_test.json
  • [ ] dft-smiles_500.ubjson
  • [ ] gauss_legendre_reference.ubjson
  • [ ] hyp1f1_reference.ubjson
  • [ ] kernel_reference.ubjson
  • [ ] modified_bessel_first_kind_reference.json
  • [x] radial_derivative_test.json
  • [ ] sorted_coulomb_reference.ubjson
  • [ ] spherical_covariants_reference.ubjson
  • [x] spherical_expansion_gradient_test.json
  • [ ] spherical_expansion_reference.ubjson
  • [ ] spherical_harmonics_gradient_test.json
  • [ ] spherical_harmonics_reference.ubjson
  • [x] spherical_harmonics_test.json
  • [ ] spherical_invariants_reference.ubjson

unused:

  • [x] behler_parinello_pair_hypers.json
  • [ ] CaCrP2O7_mvc-11955_symmetrized.cif
  • [ ] primitive_cubic.json
  • [x] simple_cubic_3.json
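
A checklist like the one above could be regenerated with a short script along these lines. This is only a sketch: the function name build_checklist is made up for illustration, and it assumes the data files sit one level below a root directory such as reference_data/, grouped into subdirectories.

```python
from pathlib import Path


def build_checklist(root) -> str:
    """List every file under each subdirectory of root as an unchecked item."""
    root = Path(root)
    lines = []
    for subdir in sorted(p for p in root.iterdir() if p.is_dir()):
        lines.append(f"{subdir.name}:")
        lines.append("")
        files = sorted(
            (p for p in subdir.iterdir() if p.is_file()),
            key=lambda p: p.name.lower(),  # case-insensitive ordering
        )
        for path in files:
            lines.append(f"  • [ ] {path.name}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed location of the data files; adjust to the actual checkout.
    data_dir = Path("reference_data")
    if data_dir.is_dir():
        print(build_checklist(data_dir))
```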

rosecers avatar Nov 28 '19 11:11 rosecers

And change it directly in #230?

mastricker avatar Nov 28 '19 12:11 mastricker

I'd like to merge in #230 sooner rather than later. I think right now, the first goal is to have each file claimed, and then we can create a separate branch/PR for labeling. Thoughts?

rosecers avatar Nov 28 '19 13:11 rosecers

The methane.{xyz,json} were from me (surprise, surprise); I think they were just the ASE G2-database structures, but I'll have to check. I think I also added the diamond, moissanite, and wurtzite structures for gradient testing based off of cell parameters from Wikipedia.

But yes, let's merge the PR first and then add these as we go along in another branch.

max-veit avatar Nov 28 '19 14:11 max-veit

Max, can you check off the files that belong to you?

rosecers avatar Dec 03 '19 14:12 rosecers

Mine, according to the "user" tag, are:

  • [x] ./unused/simple_cubic_3.json: "user": "markus"},
  • [x] ./inputs/simple_cubic_9.json: "user": "markus"},
  • [x] ./inputs/alanine-center-select.json: "user": "markus"},
  • [x] ./inputs/alanine-X-examples.json: "user": "markus"},
  • [x] ./inputs/polyalanine.json: "user": "markus stricker"},
  • [x] ./inputs/crystal_structure.json: "user": "markus stricker"},
  • [x] ./inputs/alanine-X.json: "user": "markus"},
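
For anyone who wants to run the same kind of search, here is a sketch of how files could be matched by their "user" tag. It assumes the tag is a string stored under a "user" key somewhere inside the JSON document (possibly nested); the function names are made up for illustration.

```python
import json
from pathlib import Path


def find_user_tags(obj):
    """Yield every string stored under a "user" key anywhere in obj."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == "user" and isinstance(value, str):
                yield value
            else:
                yield from find_user_tags(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_user_tags(item)


def files_by_user(root, name_fragment: str):
    """Return the JSON files under root whose "user" tag contains name_fragment."""
    matches = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            data = json.loads(path.read_text())
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # skip files that are not readable as JSON text
        users = set(find_user_tags(data))
        if any(name_fragment.lower() in user.lower() for user in users):
            matches.append(path)
    return matches
```

Matching on a case-insensitive fragment means that both "markus" and "markus stricker" would be found by searching for "markus".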

I am writing them here for reference.

Should everybody do her/his own PR for the metadata change?

Again, just for reference, I'll also claim:

  • [x] ./inputs/simple_cubic_8.json
  • [x] ./unused/behler_parinello_pair_hypers.json # will be used!

I'll start a draft PR with my changes, and people can then alter their claimed JSON files.

mastricker avatar Dec 03 '19 14:12 mastricker

Draft PR branch open to every file-claimer at #236

mastricker avatar Dec 03 '19 15:12 mastricker