lmfit-py
lmfit-py copied to clipboard
Add Pearson4 fit model (original Wikipedia version)
Description
This adds the Pearson4 fit model (original version from Wikipedia), which is a skewed version of Pearson7.
Type of Changes
- [ ] Bug fix
- [x] New feature
- [ ] Refactoring / maintenance
- [ ] Documentation / examples
Tested on
Python: 3.10.4 | packaged by conda-forge | (main, Mar 30 2022, 08:38:02) [MSC v.1916 64 bit (AMD64)] lmfit: 0.0.post2625+g0b9947a.d20220808, scipy: 1.9.0, numpy: 1.23.1, asteval: 0.9.27, uncertainties: 3.1.7
Verification
Have you
- [x] included docstrings that follow PEP 257?
- [x] verified that existing tests pass locally?
- [x] verified that the documentation builds locally?
- [x] squashed/minimized your commits and written descriptive commit messages?
- [ ] added or updated existing tests to cover the changes? (not neccessary, new model))
- [X] updated the documentation and/or added an entry to the release notes (doc/whatsnew.rst)?
- [ ] added an example? (not neccessary, only a model)
@lellid Thanks! This looks like a very good start to me, and I'll try to take a more careful look at it. FWIW, I didn't understand the failures (files were modified by isort
?). Maybe @reneeotten can comment on that...
FWIW, I didn't understand the failures (files were modified by
isort
?). Maybe @reneeotten can comment on that...
I can, the changes/additions to import packages do not follow the guidelines of the pre-commit hooks we use. That's something that could be verified locally before submitting a PR, but in short the diff for lineshapes.py
in the imports should read like:
-from numpy import (arctan, copysign, cos, exp, isclose, isnan, log, pi, real,
- sin, sqrt, where)
+from numpy import (arctan, copysign, cos, exp, isclose, isnan, log, log1p, pi,
+ real, sin, sqrt, where)
+from scipy.special import betaln as betalnfcn
from scipy.special import erf, erfc
from scipy.special import gamma as gamfcn
+from scipy.special import loggamma as loggammafcn
from scipy.special import wofz
@reneeotten Ah, thanks -- "imports out of preferred order"!
@reneeotten Ah, thanks -- "imports out of preferred order"!
well... the main thing is you're not supposed to have several "import ..... as ....." on one line as was done in this PR:
from scipy.special import gamma as gamfcn, loggamma as loggammafcn, betaln as betalnfcn
Hi,
I just updated this PR. I
- added tests (code coverage is not complaining any more)
- put the import statement in lineshapes.py on separate lines
- improved the documentation in models.py
- added Wiki link in lineshapes.py
@lellid @reneeotten if I understand correctly, we're all in favor of merging this, except for some "not-quite-as-beautiful-as-we-like" import
statements. Please fix those so we can merge.
I used iSort to sort the using statements, so now it passes the tests.
@lellid Thanks! I'll plan to merge in a couple of days to give time for final comments or suggestions.
@lellid Thanks! I'll plan to merge in a couple of days to give time for final comments or suggestions.
thanks @lellid - for me the two comments about the docstrings (I left them again above) remain, after making these changes - or explaining why I'm wrong ;) - the commits should be squashed into one - then it's ready-to-go!
@reneeotten @lellid Yes, let's please have consistent code and docstrings! I think we are happy to merge this, but we kind of need a version we can support, which does include being able to explain the actual definition used. We discussed this earlier at #799, but this PR was created as a separate PR after that discussion.
To reiterate, I strongly suggest following the Wikipedia definition or having a really clearly documented explanation of why that form was used instead of the Wikipedia definition.
Hi,
I consolidated the duplication of the docstring.
To the other question about the Wikipedia definition and my implementation:
It is simply a numerical thing:
Of course I can use a naive implementation of $(1+arg^2)^{-m}$, but guess what happens if m is 1000 and arg is 10? Underflow. Or for instance, what is $Gamma(1000)$? Overflow! Finally, both terms together would give some reasonable result, but first, you get underflow multiplied by overflow.
So I logarithmize everything, and put all terms then in the $\exp$ function. For instance, $(1+arg^2)^{-m} = \exp(-m \times \log(1+arg^2)) = \exp(-m \times log1p(arg^2))$
I have seen other rather naive implementations in your code (for instance, Pearson7). Maybe, in the future, I will make some pull requests to correct that.
I consolidated the duplication of the docstring.
thanks!
To the other question about the Wikipedia definition and my implementation:
It is simply a numerical thing:
Of course I can use a naive implementation of (1+arg2)−m, but guess what happens if m is 1000 and arg is 10? Underflow. Or for instance, what is Gamma(1000)? Overflow! Finally, both terms together would give some reasonable result, but first, you get underflow multiplied by overflow.
So I logarithmize everything, and put all terms then in the exp function. For instance, (1+arg2)−m=exp(−m×log(1+arg2))=exp(−m×log1p(arg2))
That might be clear to you since you've probably stared at that equation for a while and looked at the math, but from a first read it wasn't to me. I also haven't checked that this is all implemented correctly, but I assume that you did... Of course all of this could be considered a "me problem" but perhaps you could just add a note to the function in lineshapes.py
to mention it; just so that one doesn't need to go back to this PR in a few months/years to remember this ;)
I have seen other rather naive implementations in your code (for instance, Pearson7). Maybe, in the future, I will make some pull requests to correct that.
Well, naive or not it appears to have been working well for people for quite some time - of course improvements are always welcome.
Thanks again @lellid - this PR should be almost at the finish line.
Hi,
what I was really wondering about was why you don't have any tests for the function values in your code? At least I haven't seen any. Would make those numerical tweakings a lot more secure, if you can be sure that the values come out that you expect.
@lellid
I consolidated the duplication of the docstring.
OK, thanks. Well, at some point we might fix the capital "I" to "j" (Python's preferred sqrt(-1)).
To the other question about the Wikipedia definition and my implementation:
It is simply a numerical thing:
Of course I can use a naive implementation of , but guess what happens if m is 1000 and arg is 10? Underflow. Or for instance, what is ? Overflow! Finally, both terms together would give some reasonable result, but first, you get underflow multiplied by overflow.
So I logarithmize everything, and put all terms then in the function
These functions are kind of prone to under/overflow. I would not necessarily agree that it is obvious that taking log
and then doing exp
of that will help in general, but that's to say that it would not help here...
I have seen other rather naive implementations in your code (for instance, Pearson7). Maybe, in the future, I will make some pull requests to correct that.
Do you mean to use scipy.special.beta
instead of using the equivalent with gamma
? Yeah, that would be okay.
But, I'm not sure that the general approach of "work in log space and then take exponential" is necessarily worth the complication. Readability definitely counts.
@lellid
what I was really wondering about was why you don't have any tests for the function values in your code?
Probably because we feel like we don't need those specifically.
The tests of fitting with these functions do test the values of parameters, so if the functions break, that will very likely be reflected in the fit results.
At least I haven't seen any. Would make those numerical tweakings a lot more secure, if you can be sure that the values come out that you expect.
Well, I guess we expect that we are not changing the lineshape definitions very much ;)
@reneeotten @lellid I'm just going to squash and merge this. If there are more issues to resolve, they can be done afterward.