pymatgen icon indicating copy to clipboard operation
pymatgen copied to clipboard

Add f* diagram generator

Open jonathanjdenney opened this issue 5 years ago • 8 comments

Summary

Added f* diagram generator.

Additional dependencies introduced (if any)

  • List all new dependencies needed and justify why. While adding dependencies that bring significantly useful functionality is perfectly fine, adding ones that add trivial functionality, e.g., to use one single easily implementable function, is frowned upon. Provide a justification why that dependency is needed. Especially frowned upon are circular dependencies, e.g., depending on derivative modules of pymatgen such as custodian or Fireworks.

TODO (if any)

If this is a work-in-progress, write something about what else needs to be done

Checklist

Work-in-progress pull requests are encouraged, but please put [WIP] in the pull request title.

Before a pull request can be merged, the following items must be checked:

Note that the CI system will run all the above checks. But it will be much more efficient if you already fix most errors prior to submitting the PR. It is highly recommended that you use the pre-commit hook provided in the pymatgen repository. Simply cp pre-commit .git/hooks and a check will be run prior to allowing commits.

jonathanjdenney avatar Oct 20 '20 17:10 jonathanjdenney

Hi @jonathanjdenney, thanks for this.

A few comments before merging, mostly just to follow Python conventions:

  • The file pymatgen/analysis/Fstar/__pycache__/fstar.cpython-36.pyc should be removed
  • The folder pymatgen/analysis/Fstar/ should be renamed to pymatgen/analysis/fstar/
  • The files neutron_factors.csv xray_factors_2.csv are duplicated in the tests folder, this is unnecessary.
  • FStarDiagram_test -> test_fstardiagram (method names should be lower case)
  • x_ray_scatter_df, neutron_scatter_df should be X_RAY_SCATTER_DF, NEUTRON_SCATTER_DF (global variables upper case)
  • instead of try except AttributeError you can do an if statement, e.g. if isinstance(sp, Species or if hasattr(sp, "element") etc.

I see changes were made via direct file upload, I highly recommend GitHub Desktop for making easier edits.

These are all my suggestions for now, but I'll take another look after the above changes.

mkhorton avatar Dec 01 '20 17:12 mkhorton

Also, I would add a reference to the papers using f* (e.g. https://doi.org/10.1063/1.5044555 https://doi.org/10.1021/acs.chemmater.9b03646) to the docstring to point newcomers in the right direction.

mkhorton avatar Dec 01 '20 17:12 mkhorton

Hi @mkhorton , I will get to work on your suggestions. Thanks for looking at it.

jonathanjdenney avatar Dec 01 '20 17:12 jonathanjdenney

Hi @mkhorton , I believe I have addressed all of your comments. Please let me know if I missed anything.

jonathanjdenney avatar Dec 15 '20 17:12 jonathanjdenney

Thanks @jonathanjdenney ! I'll take another look. Were some tests removed in the most recent iteration?

mkhorton avatar Dec 16 '20 04:12 mkhorton

@mkhorton No tests were actually removed. I had previously accidentally added a file that was a non-functioning version of the test file which I have now removed.

jonathanjdenney avatar Dec 16 '20 15:12 jonathanjdenney

@mkhorton Hello Matt, Sorry for not getting to your comments sooner, I've had other deadlines. The purpose of including a way to input cifs directly into the script was to make it more user friendly to our target audience, experimentalists. Experimentalists will typically be more familiar with cifs over structure objects. In addition, this method allows the user to easily input non-physical values for the occupancies which may have resulted from the refinement. As far as I know, the cif parser will round greater than 1 occupancies to 1 regardless of what tolerance you set. This script can take structure objects as its input without any trouble, so is it really unacceptable that I've given the option to use cifs? In regards to your last comment I agree it is a bit confusing. I will add to the code description an explanation that you use the labels generated in the site_labels list, which can be viewed by printing the list. It is input as a list of strings. Best, Jonathan

jonathanjdenney avatar Mar 04 '21 16:03 jonathanjdenney

In addition, this method allows the user to easily input non-physical values for the occupancies which may have resulted from the refinement. As far as I know, the cif parser will round greater than 1 occupancies to 1 regardless of what tolerance you set.

If there's a case where unphysical occupancies are useful, then we should change CifParser accordingly.

The purpose of including a way to input cifs directly into the script was to make it more user friendly to our target audience, experimentalists. This script can take structure objects as its input without any trouble, so is it really unacceptable that I've given the option to use cifs?

Maybe to explain the issue a different way, pymatgen is intended as a library. For someone using it, regardless of their background or expertise (I started off as an experimentalist myself), you have to learn how to use Python, how to install the package, how to import the relevant functionality you want etc.

Therefore, with the recommendation to people using the script will be to say, "first you write one line to import your CIFs" and then "you run the FStarDiagram generator", in my own view the extra directive to write the additional line is not the limiting hurdle for using a library like pymatgen.

The objection to having the CIF input is that it creates confusion and makes it more difficult to reason about what the code is doing internally. The existence of the Structure object is not to create yet another file format (e.g. on Materials Project, we offer CIF downloads, not Structure downloads), it's to create a standardized interface to crystal structures internally within pymatgen.

More importantly however, this discussion has highlighted a limitation (the CIF unphysical occupancies mentioned), and we'd be papering over this limitation rather than addressing it properly. By adding the capability to CifParser, other people can also have the benefit of importing CIF with unphysical occupancies, and the resulting code within FStarDiagram will be a lot simpler too.

mkhorton avatar Mar 04 '21 19:03 mkhorton