pymatgen
pymatgen copied to clipboard
[Breaking] Include Scandium (Sc) and Yttrium (Y) as rare earth metals for `Element`
Summary
Include Scandium (Sc) and Yttrium (Y) as rare earth metals for Element
, see detailed discussions in https://github.com/janosh/pymatviz/pull/140#discussion_r1593909974.
https://github.com/materialsproject/pymatgen/blob/6c696cdb327621d5d16c675478f4ed0e024feea1/pymatgen/core/periodic_table.py#L689-L692
Reference
From the 1985 IUPAC Red Book:
The following collective names for like elements are IUPAC-approved: alkali metals (Li, Na, K, Rb, Cs, Fr), alkaline earth metals (Be, Mg, Ca, Sr, Ba, Ra), pnictogens8 (N, P, As, Sb, Bi), chalcogens (O, S, Se, Te, Po), halogens (F, Cl, Br, I, At), noble gases (He, Ne, Ar, Kr, Xe, Rn), lanthanoids (La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu), rare earth metals (Sc, Y and the lanthanoids) and actinoids (Ac, Th, Pa, U, Np, Pu, Am, Cm, Bk, Cf, Es, Fm, Md, No, Lr)
And Wikipedia:
The rare-earth elements (REE), also called the rare-earth metals or rare earths or, in context, rare-earth oxides, and sometimes the lanthanides (although yttrium and scandium, which do not belong to this series, are usually included as rare earths)
Whether to include actinoids as rare earth metals seems controversial, but there doesn't seem to be diverging views to consider Y and Sc.
Side effect
As this would be a core change, we might expect significant side effects as pointed out by @JaGeo, should we change this after a one-year period?
@janosh. Can you please review this one? Thanks!
Just want to mention here that I expect lot's of code to be broken. Is there a way to notify users of this change or have a deprecation period?
Thanks for commenting!
I'm not sure quite to properly notify users in this case (except for insert a warning in the property
first and change the implementation later), but I believe Janosh would have some suggestions.
Maybe, rename the property and deprecate the old one?
I think there is no reason to rename it (as the name seems proper to me)? Just the implementation need to be corrected.
Maybe I didn't make myself clear enough, perhaps we could first change the property
to:
@property
def is_rare_earth_metal(self) -> bool:
"""True if element is a rare earth metal, including Scandium (Sc),
Yttrium (Y), Lanthanides (La) series and Actinides (Ac) series.
Reference: https://en.wikipedia.org/wiki/Rare-earth_element.
"""
warning.warn("Y and Sc would be considered rare earth metal in a later version of pymatgen.")
return self.is_lanthanoid or self.is_actinoid
And then really change the implementation after some grace period?
But personally such breaking change is acceptable as I would consider it a "fix". What do you think?
Renaming for the deprecation.
I personally don't think it is acceptable without proper deprecation.
Renaming for the deprecation.
Sorry I didn't quite understand.
I personally don't think it is acceptable without proper deprecation.
Yes I'm feeling this is too "core/fundamental" to just change without any warning.
Let's hear what Janosh is going to say :)
Instead of a breaking change, could you @DanielYang59 add a separate property
def is_iupac_rare_earth_metal(self) -> bool:
"""
True if element is a rare earth metal according to IUPAC standards.
See IUPAC guidelines: https://old.iupac.org/publications/books/rbook/Red_Book_2005.pdf
"""
return self.is_lanthanoid or self.is_actinoid or self.Z in {21, 39}
Speaking from experience, it's really hard to estimate what the downstream effects of a breaking change like this are. I unintentionally broke features in one of MP's more highly used codes, matminer, simply by adding data for higher-Z elements to pymatgen's periodic_table.json
data file
That code is not as actively maintained as pymatgen is, and I think most users expect pymatgen functionality to be stable
Instead of a breaking change, could you @DanielYang59 add a separate property
def is_iupac_rare_earth_metal(self) -> bool: """ True if element is a rare earth metal according to IUPAC standards. See IUPAC guidelines: https://old.iupac.org/publications/books/rbook/Red_Book_2005.pdf """ return self.is_lanthanoid or self.is_actinoid or self.Z in {21, 39}
Great suggestion!
I agree with @JaGeo and @esoteric-ephemera - NO BREAKING CHANGES UNLESS ABSOLUTELY NECESSARY.
Instead of a breaking change, could you @DanielYang59 add a separate property
def is_iupac_rare_earth_metal(self) -> bool: """ True if element is a rare earth metal according to IUPAC standards. See IUPAC guidelines: https://old.iupac.org/publications/books/rbook/Red_Book_2005.pdf """ return self.is_lanthanoid or self.is_actinoid or self.Z in {21, 39}
Speaking from experience, it's really hard to estimate what the downstream effects of a breaking change like this are. I unintentionally broke features in one of MP's more highly used codes, matminer, simply by adding data for higher-Z elements to pymatgen's
periodic_table.json
data fileThat code is not as actively maintained as pymatgen is, and I think most users expect pymatgen functionality to be stable
Thanks for the suggestion and sharing your experience, that would be very helpful.
Yes I totally agree adding a property would be safer, but why should we keep a definition that doesn't align with IUPAC standard? I assume this might cause unexpected behavior too.
Let me look into this myself, but on first glance I agree with @DanielYang59 here: we absolutely should follow IUPAC conventions (or similar standards bodies), pymatgen is not the appropriate arbiter of what is or isn’t a rare earth element. This would then be a bug that needs fixing.
From the latest edition of the IUPAC Red Book, 2005, p51. This concurs with the 1985 edition.
I also appreciated this note (p52), since it caused some confusion for myself:
I would therefore consider this a bug. Fixing the bug might indeed be a breaking change: I'm glad to see more caution around breaking changes, they are painful, but for a bug it has to be the correct course of action.
I think consensus seems to be that "rare earth element" is not a useful name in any case, but nevertheless if it is used, we should be consistent with standards.
Fixing the bug might indeed be a breaking change: I'm glad to see more caution around breaking changes, they are painful, but for a bug it has to be the correct course of action.
Not necessarily right? Excel has had a bug since its inception due it assuming incorrectly that the year 1900 was a leap year. This can break excel code between mac excel, which has the epoch as the leap year 1904, and ms-dos-based excel, which has the epoch as the fake leap year 1900. But it's been maintained and documented heavily by Microsoft because it could break a lot of code built on top of it
I prefer the non-breaking solution of raising a warning when using the is_rare_earth_metal
function but keeping it as is and then including a new is_iupac_rare_earth_metal
function as mentioned above
We've had a good amount of software instability lately, especially in atomate2 and emmet, due to breaking changes. I am admittedly being overly cautious
I appreciate the point, and I sympathise with it. I have long been an advocate for not making breaking changes wherever possible -- I have often been the recipient myself of some amount of pain from unnecessary breaking changes.
That said, I think we can make a distinction between a breaking change which is scientifically meaningful, and a breaking change which is not scientifically motivated. The IUPAC creates conventions for good reasons. I am thinking of the student using pymatgen and being misled, for example.
Your suggestion of a warning is a good one if we are very concerned by the change:
I prefer the non-breaking solution of raising a warning when using the is_rare_earth_metal function but keeping it as is and then including a new is_iupac_rare_earth_metal function as mentioned above
This creates some messiness, but I could see this as a compromise. Nevertheless, warnings are often not seen. There is still a danger here. It also does not fix any mistaken usage in downstream code, whose developers might appreciate the fix without additional action being required on their part -- this is the complementary argument.
At least within the Materials Project stack or closely-linked codes, this is also something we can inspect. We can look for usages of is_rare_earth_metal
. As far as I can see, this is not used in emmet
, atomate
, atomate2
, matminer
, crystaltoolkit
or robocrystallographer
. Hopefully this gives us some confidence that a change would not have side effects at least within this stack.
My recommendation would still be to implement the fix.
@janosh Can you please suggest on this one?
@DanielYang59 fwiw, i agree with @mkhorton and the changes you propose in this PR but i prefer for others to make the decision.
given there are precisely zero internal calls to is_rare_earth_metal
in pymatgen
i imagine the number of call sites in downstream packages that would be affected is low. it's also not clear that those call sites would suffer given this can be considered a fix. they may actually start working as intended by this change
I suggest we do a correct is_rare_earth
implementation, and leave is_rare_earth_metal
with a deprecation message until 2025.1.1
I suggest we do a correct
is_rare_earth
implementation, and leaveis_rare_earth_metal
with a deprecation message until 2025.1.1
I like this idea, because rare_earth_metal
itself doesn't seem like a widely recognized terminology :)
I would implement this tomorrow
I hope everyone is happy now :)
Great, @DanielYang59 ! I am happy as well with this solution!