pymatgen icon indicating copy to clipboard operation
pymatgen copied to clipboard

[Breaking] Include Scandium (Sc) and Yttrium (Y) as rare earth metals for `Element`

Open DanielYang59 opened this issue 9 months ago • 7 comments

Summary

Include Scandium (Sc) and Yttrium (Y) as rare earth metals for Element, see detailed discussions in https://github.com/janosh/pymatviz/pull/140#discussion_r1593909974.

https://github.com/materialsproject/pymatgen/blob/6c696cdb327621d5d16c675478f4ed0e024feea1/pymatgen/core/periodic_table.py#L689-L692

Reference

From the 1985 IUPAC Red Book:

The following collective names for like elements are IUPAC-approved: alkali metals (Li, Na, K, Rb, Cs, Fr), alkaline earth metals (Be, Mg, Ca, Sr, Ba, Ra), pnictogens8 (N, P, As, Sb, Bi), chalcogens (O, S, Se, Te, Po), halogens (F, Cl, Br, I, At), noble gases (He, Ne, Ar, Kr, Xe, Rn), lanthanoids (La, Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu), rare earth metals (Sc, Y and the lanthanoids) and actinoids (Ac, Th, Pa, U, Np, Pu, Am, Cm, Bk, Cf, Es, Fm, Md, No, Lr)

And Wikipedia:

The rare-earth elements (REE), also called the rare-earth metals or rare earths or, in context, rare-earth oxides, and sometimes the lanthanides (although yttrium and scandium, which do not belong to this series, are usually included as rare earths)

Whether to include actinoids as rare earth metals seems controversial, but there doesn't seem to be diverging views to consider Y and Sc.

Side effect

As this would be a core change, we might expect significant side effects as pointed out by @JaGeo, should we change this after a one-year period?

DanielYang59 avatar May 09 '24 12:05 DanielYang59

@janosh. Can you please review this one? Thanks!

DanielYang59 avatar May 11 '24 09:05 DanielYang59

Just want to mention here that I expect lot's of code to be broken. Is there a way to notify users of this change or have a deprecation period?

JaGeo avatar May 11 '24 09:05 JaGeo

Thanks for commenting!

I'm not sure quite to properly notify users in this case (except for insert a warning in the property first and change the implementation later), but I believe Janosh would have some suggestions.

DanielYang59 avatar May 11 '24 09:05 DanielYang59

Maybe, rename the property and deprecate the old one?

JaGeo avatar May 11 '24 09:05 JaGeo

I think there is no reason to rename it (as the name seems proper to me)? Just the implementation need to be corrected.

Maybe I didn't make myself clear enough, perhaps we could first change the property to:

@property
def is_rare_earth_metal(self) -> bool:
    """True if element is a rare earth metal, including Scandium (Sc),
    Yttrium (Y), Lanthanides (La) series and Actinides (Ac) series.
    Reference: https://en.wikipedia.org/wiki/Rare-earth_element.
    """
    warning.warn("Y and Sc would be considered rare earth metal in a later version of pymatgen.")
    return self.is_lanthanoid or self.is_actinoid

And then really change the implementation after some grace period?

But personally such breaking change is acceptable as I would consider it a "fix". What do you think?

DanielYang59 avatar May 11 '24 09:05 DanielYang59

Renaming for the deprecation.

I personally don't think it is acceptable without proper deprecation.

JaGeo avatar May 11 '24 09:05 JaGeo

Renaming for the deprecation.

Sorry I didn't quite understand.

I personally don't think it is acceptable without proper deprecation.

Yes I'm feeling this is too "core/fundamental" to just change without any warning.

Let's hear what Janosh is going to say :)

DanielYang59 avatar May 11 '24 09:05 DanielYang59

Instead of a breaking change, could you @DanielYang59 add a separate property

def is_iupac_rare_earth_metal(self) -> bool:
    """
    True if element is a rare earth metal according to IUPAC standards.
    
    See IUPAC guidelines: https://old.iupac.org/publications/books/rbook/Red_Book_2005.pdf
    """
    return self.is_lanthanoid or self.is_actinoid or self.Z in {21, 39}

Speaking from experience, it's really hard to estimate what the downstream effects of a breaking change like this are. I unintentionally broke features in one of MP's more highly used codes, matminer, simply by adding data for higher-Z elements to pymatgen's periodic_table.json data file

That code is not as actively maintained as pymatgen is, and I think most users expect pymatgen functionality to be stable

esoteric-ephemera avatar May 15 '24 18:05 esoteric-ephemera

Instead of a breaking change, could you @DanielYang59 add a separate property

def is_iupac_rare_earth_metal(self) -> bool:
    """
    True if element is a rare earth metal according to IUPAC standards.
    
    See IUPAC guidelines: https://old.iupac.org/publications/books/rbook/Red_Book_2005.pdf
    """
    return self.is_lanthanoid or self.is_actinoid or self.Z in {21, 39}

Great suggestion!

JaGeo avatar May 15 '24 18:05 JaGeo

I agree with @JaGeo and @esoteric-ephemera - NO BREAKING CHANGES UNLESS ABSOLUTELY NECESSARY.

shyuep avatar May 15 '24 18:05 shyuep

Instead of a breaking change, could you @DanielYang59 add a separate property

def is_iupac_rare_earth_metal(self) -> bool:
    """
    True if element is a rare earth metal according to IUPAC standards.
    
    See IUPAC guidelines: https://old.iupac.org/publications/books/rbook/Red_Book_2005.pdf
    """
    return self.is_lanthanoid or self.is_actinoid or self.Z in {21, 39}

Speaking from experience, it's really hard to estimate what the downstream effects of a breaking change like this are. I unintentionally broke features in one of MP's more highly used codes, matminer, simply by adding data for higher-Z elements to pymatgen's periodic_table.json data file

That code is not as actively maintained as pymatgen is, and I think most users expect pymatgen functionality to be stable

Thanks for the suggestion and sharing your experience, that would be very helpful.

Yes I totally agree adding a property would be safer, but why should we keep a definition that doesn't align with IUPAC standard? I assume this might cause unexpected behavior too.

DanielYang59 avatar May 16 '24 01:05 DanielYang59

Let me look into this myself, but on first glance I agree with @DanielYang59 here: we absolutely should follow IUPAC conventions (or similar standards bodies), pymatgen is not the appropriate arbiter of what is or isn’t a rare earth element. This would then be a bug that needs fixing.

mkhorton avatar May 16 '24 01:05 mkhorton

image

From the latest edition of the IUPAC Red Book, 2005, p51. This concurs with the 1985 edition.

I also appreciated this note (p52), since it caused some confusion for myself:

image

I would therefore consider this a bug. Fixing the bug might indeed be a breaking change: I'm glad to see more caution around breaking changes, they are painful, but for a bug it has to be the correct course of action.

I think consensus seems to be that "rare earth element" is not a useful name in any case, but nevertheless if it is used, we should be consistent with standards.

mkhorton avatar May 16 '24 02:05 mkhorton

Fixing the bug might indeed be a breaking change: I'm glad to see more caution around breaking changes, they are painful, but for a bug it has to be the correct course of action.

Not necessarily right? Excel has had a bug since its inception due it assuming incorrectly that the year 1900 was a leap year. This can break excel code between mac excel, which has the epoch as the leap year 1904, and ms-dos-based excel, which has the epoch as the fake leap year 1900. But it's been maintained and documented heavily by Microsoft because it could break a lot of code built on top of it

I prefer the non-breaking solution of raising a warning when using the is_rare_earth_metal function but keeping it as is and then including a new is_iupac_rare_earth_metal function as mentioned above

We've had a good amount of software instability lately, especially in atomate2 and emmet, due to breaking changes. I am admittedly being overly cautious

esoteric-ephemera avatar May 16 '24 03:05 esoteric-ephemera

I appreciate the point, and I sympathise with it. I have long been an advocate for not making breaking changes wherever possible -- I have often been the recipient myself of some amount of pain from unnecessary breaking changes.

That said, I think we can make a distinction between a breaking change which is scientifically meaningful, and a breaking change which is not scientifically motivated. The IUPAC creates conventions for good reasons. I am thinking of the student using pymatgen and being misled, for example.

Your suggestion of a warning is a good one if we are very concerned by the change:

I prefer the non-breaking solution of raising a warning when using the is_rare_earth_metal function but keeping it as is and then including a new is_iupac_rare_earth_metal function as mentioned above

This creates some messiness, but I could see this as a compromise. Nevertheless, warnings are often not seen. There is still a danger here. It also does not fix any mistaken usage in downstream code, whose developers might appreciate the fix without additional action being required on their part -- this is the complementary argument.

At least within the Materials Project stack or closely-linked codes, this is also something we can inspect. We can look for usages of is_rare_earth_metal. As far as I can see, this is not used in emmet, atomate, atomate2, matminer, crystaltoolkit or robocrystallographer. Hopefully this gives us some confidence that a change would not have side effects at least within this stack.

My recommendation would still be to implement the fix.

mkhorton avatar May 16 '24 03:05 mkhorton

@janosh Can you please suggest on this one?

DanielYang59 avatar May 17 '24 15:05 DanielYang59

@DanielYang59 fwiw, i agree with @mkhorton and the changes you propose in this PR but i prefer for others to make the decision.

given there are precisely zero internal calls to is_rare_earth_metal in pymatgen i imagine the number of call sites in downstream packages that would be affected is low. it's also not clear that those call sites would suffer given this can be considered a fix. they may actually start working as intended by this change

janosh avatar May 17 '24 15:05 janosh

I suggest we do a correct is_rare_earth implementation, and leave is_rare_earth_metal with a deprecation message until 2025.1.1

shyuep avatar May 17 '24 15:05 shyuep

I suggest we do a correct is_rare_earth implementation, and leave is_rare_earth_metal with a deprecation message until 2025.1.1

I like this idea, because rare_earth_metal itself doesn't seem like a widely recognized terminology :)

I would implement this tomorrow

DanielYang59 avatar May 17 '24 16:05 DanielYang59

I hope everyone is happy now :)

DanielYang59 avatar May 18 '24 02:05 DanielYang59

Great, @DanielYang59 ! I am happy as well with this solution!

JaGeo avatar May 18 '24 07:05 JaGeo