symfit
Improve test_finite_difference and test_fit_result
Hey guys,
I worked on test_finite_difference and test_fit_result so that they correspond more to the pytest style.
A few words about `test_fit_result.py`: it's not possible to use a fixture in Parameters, which is why I used a dict to store and call the results. I didn't touch the last test, because the converted version would be more confusing than helpful.
Merry Christmas and a Happy New Year
The failure of test_constrained/test_constrained_dependent_on_matrixmodel isn't caused by this PR; if I clone the master branch, the test doesn't work either.
I could compress the fixtures to three: the parameters a, b and the result_dict. In that case, all the results would be built in a single fixture, which returns a dict with all of them. The parameters a and b are used in the test method test_fitting.
By the way, I looked into fixtures that return several items.
After a lot of research, I'm sticking to the solution with the dictionary :D
After a lot of research, I'm sticking to the solution with the dictionary :D
Could you share some of your considerations? That way we can have a short discussion about it here, without me also having to do the research.
Just for clarity: you want to create a parametrized fixture whose params are the different results of the models, and the fixture will then be applied to different tests. But then it becomes difficult to create tests in which each model has a different comparison partner; that would require a new, parallel parametrization. Besides, models often have to be left out (or only one model is tested), and in my eyes you quickly lose the overview. The tests are not only made for TravisCI, but also for people who need to understand them :D
In my eyes, the dictionary allows models to be added quickly, tests to be implemented easily and clearly, and bugs to be spotted quickly.
Very clear, thanks for the write-up. To sum up: the tests are too diverse? Feel free to reorganise/change the tests so that they become more uniform, if you think that makes sense.
As for the tests in finite_difference, we'll need to find a way to use models in pytest.mark.parametrize without making a million different fixtures. (e.g. @pytest.mark.parametrize('model', [modelA, modelB, modelC, ...])). I don't quite know how to do this yet, and I'm hoping you have a good idea :) We'll need it for the rest of the tests anyway.
On the one hand it's difficult to make the tests uniform. For example, only two results use constraints, so we have to separate them. chained_result also often steps out of line.
On the other hand, the tests can be made more uniform than I initially thought. In some cases a model wasn't mentioned even though it was being passed.
EDIT: For a long time I thought it wasn't possible to put fixtures into a parametrization, because it was a feature request some years ago, but I didn't realise that an offshoot of pytest can do it. With pytest-cases I would have some new possibilities, for example replacing the dict. However, it is not possible with pure pytest...
I never said it would be easy :)
For constraints: you might be able to parametrize them as [a==b] and use [] for no constraints. Or just None.
ChainedMinimizers and their results are a bit magic, so I can see those becoming a separate case.
And let's stick to pure pytest for now and see what we can cook up. It may be an idea to create an alphabet of parameters (a-e) / variables (x-z) in the global namespace, and use those to make models for parametrized tests?
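To make that concrete, here is a minimal sketch of what such a shared alphabet plus a parametrized test with optional constraints could look like. The module layout, models, data, constraint and tolerances are illustrative assumptions, not the actual symfit test suite.

```python
# Hypothetical conftest.py-style "alphabet" shared by the test files (a sketch).
import numpy as np
import pytest
from symfit import Eq, Fit, parameters, variables

# Globally available parameters and variables, reused by parametrized tests.
a, b, c = parameters('a, b, c')
x, y = variables('x, y')


@pytest.mark.parametrize('model_dict, constraints, expected', [
    ({y: a * x + b}, [], {a: 2, b: 1}),                            # no constraints
    ({y: a * x**2 + b * x + c}, [Eq(a, 0)], {a: 0, b: 2, c: 1}),   # one constraint
])
def test_fit(model_dict, constraints, expected):
    xdata = np.linspace(0, 10, 25)
    ydata = 2 * xdata + 1
    fit = Fit(model_dict, x=xdata, y=ydata, constraints=constraints or None)
    result = fit.execute()
    for param, value in expected.items():
        assert result.value(param) == pytest.approx(value, abs=1e-3)
```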
It's never easy ;-) Some ideas...
- I saw that if a model is initiated with `None` constraints, then it gets an empty list. We could use that, because the test which checks the constraints uses a for-each loop.
- I just want to know if I understood it right: models with `ChainedMinimizer` use `LeastSquares` as objective, right?
- It's not beautiful, but we could test the minimizers like this without leaving `chained_result` out:

```python
assert isinstance(result_name.minimizer, result_minimizer)
if isinstance(result_name.minimizer, ChainedMinimizer):
    for minimizer, cls in zip(result_name.minimizer.minimizers, [BFGS, NelderMead]):
        assert isinstance(minimizer, cls)
```

(Maybe we could connect the assert statement with the if statement)
- I saw that if a model is initiated with `None` constraints, then it gets an empty list. We could use that, because the test which checks the constraints uses a for-each loop.
Sure. You could even do `if constraints: ...`, which will be False for both `None` and an empty list.
- I just want to know if I understood it right: models with `ChainedMinimizer` use `LeastSquares` as objective, right?
Not necessarily. Maybe in the current tests, but it's not generally true.
- It's not beautiful, but we could test the minimizers like this without leaving `chained_result` out.
I then prefer kicking all chained minimizer tests to either a separate test, or even a separate file. As you said earlier: making the tests understandable is at least as important.
Alright... Then I have an idea for the new test_fit_result file :)
But I won't work on it until next year :P
Have a good start into the new year :)
Hey, I'm so sorry I was away for so long. Papers, courses and exams stood between the pull request and me. Now I have no such big projects until October. Perhaps this conversion will be done by then (at least that's my goal).
Does it make sense to continue working on this PR, or have there been many new tests in the meantime?
Heyhey,
it's been quiet here as well. If you want to pick this up again then by all means go ahead. Just to make sure, merge master first, but I don't expect any changes.
Before I start, I want to sum up the requested changes:
test_finite_difference
Parametrized fixtures should be applied to the method test_model. The problem was the number of fixtures required for the variables and parameters. A solution could be an alphabet in conftest.py.
test_fit_result
The goal is one big method where the six models are tested. The problems were the missing pytest feature and the diversity of the models.
So far correct?
I think so, yes. Start with the simpler one (test_finite_difference) before diving into the other. Maybe something like this would work:
```python
@pytest.mark.parametrize("model, expected_values, initial_values, xdata, noise", [
    [a*x**2 + b*x + c, {'a': 1, 'b': 2, 'c': 3}, {'a': 0, 'b': 0, 'c': 0}, {'x': np.arange(10)}, 0.05],
])
def test_something(model, expected_values, initial_values, xdata, noise):
    if not initial_values:
        initial_values = expected_values.copy()
    y_data = model(**xdata, **expected_values)
    y_data *= np.random.normal(loc=1, scale=noise, size=y_data.shape)  # doesn't actually work, but you get the idea
    ...  # more things here
    assert result == expected_values
```
Alright!
The next commit will contain the mentioned collection of variables and parameters in conftest.py, plus its integration into the individual test files.
I'm still contemplating whether we can condense the rest of the tests in finite_difference into the one parameterised case as well, and whether or not that's a good idea. It'll result in a test method with a lot of arguments...
In my opinion, test_ODE_stdev could be built into the big method. But the other methods test many more aspects than we could fold in there.
Edit: Question: why is the test that checks whether ODE models generate a standard deviation in test_finite_difference and not in test_ode? Otherwise, we could move this part to test_ode and additionally add an ODEModel to the big parametrization 🙆♂️
By the way, I found an unconverted test. Should I convert it in this PR (after fit_result and finite_difference) or later?
Edit: Question: why is the test that checks whether ODE models generate a standard deviation in test_finite_difference and not in test_ode? Otherwise, we could move this part to test_ode and additionally add an ODEModel to the big parametrization 🙆♂️
The original reason is that ODE models require the finite difference methods to be able to calculate a std deviation (because they're not analytically differentiable). I'm ok with reorganising this though, but it would be good to test that we can calculate the Jacobian of an ODEModel, and possibly the stddev.
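For reference, a hedged sketch of what such a check could look like; the decay model, noise level and data below are made up for illustration and are not the existing symfit tests.

```python
import numpy as np
from symfit import D, Fit, ODEModel, Parameter, Variable

# Illustrative first-order decay model, not a model from the symfit test suite.
t = Variable('t')
y = Variable('y')
k = Parameter('k', value=0.1)

ode_model = ODEModel({D(y, t): -k * y}, initial={t: 0.0, y: 1.0})

tdata = np.linspace(0, 10, 25)
rng = np.random.default_rng(42)
ydata = np.exp(-0.3 * tdata) + rng.normal(0, 0.01, tdata.shape)

fit = Fit(ode_model, t=tdata, y=ydata)
fit_result = fit.execute()

# The stddev relies on the finite-difference Jacobian, since an
# ODEModel is not analytically differentiable.
assert fit_result.stdev(k) is not None
assert np.isfinite(fit_result.stdev(k))
```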
By the way, I found an unconverted test. Should I convert it in this PR (after fit_result and finite_difference) or later?
Leave it for later. Keep PRs small.
Leave it for later. Keep PRs small.
👍
it would be good to test that we can calculate the Jacobian of an ODEModel, and possibly the stddev.
I added an ODEModel to the parametrization. Checking stddev would require evaluating the fit_result of every model. Then we again have the problem that we can hardly separate the data arrays from the parameters. As long as we don't have a nicer variant for the arrays, we can hardly test stddev.
I'm struggling with the failed TravisCI check. The test passes locally, but not in CI. With C++ I would have blamed the compiler, but in Python?
```python
(sf.Model({y: 3 * a * x**2 + b * x * w - c, z: sf.exp(a*x - b) + c*w}), {'a': 3.5, 'b': 2, 'c': 5, 'w': np.arange(10)/10}),
```
is failing.
It looks like a numerical issue: https://travis-ci.org/github/tBuLi/symfit/jobs/702160446#L380
You can either change the values of the parameters/variables to be in a region that is numerically more stable, or relax the tolerance a bit.
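For the "relax the tolerance" route, pytest.approx takes explicit rel/abs arguments; the numbers below are only an illustration of the mechanism.

```python
import pytest

# pytest.approx defaults to a relative tolerance of 1e-6.
assert 3.5000004 == pytest.approx(3.5)       # passes with the default tolerance
assert 3.52 == pytest.approx(3.5, rel=1e-2)  # relaxed relative tolerance
assert 3.52 == pytest.approx(3.5, abs=0.05)  # or an absolute tolerance
```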
I am currently working on test_fit_result. I noticed some small things:

1. The results (parameters a, b + r²) differ strongly if I calculate them with a normal Fit, a Fit with likelihood, or with constraints (surprise). I could calculate them with WolframAlpha (or something similar) and implement those solutions (?)
2. likelihood understandably has no r². I would check that with hasattr in the test:

    ```python
    if hasattr(fit_result, 'r_squared'):
        assert isinstance(fit_result.r_squared, float)
        ...
    ```

3. I had asked before whether ChainedMinimizer always uses LeastSquares as objective. Even if it is not the default, should I still include it in the test?
4. Currently we only test `fit_result`, `minpack_result` and `likelihood` for the existence of r², objective_result, ... I would add the remaining fits so that everything fits into one test.
I expect test_fit_results to require much more invasive/drastic changes than test_finite_difference.
1. I think this is expected (since they're different problems), but which test are you referring to exactly?
2. See also 4.
3. ChainedMinimizer is just another minimizer, and can be used in combination with any other Objective (such as LeastSquares). ChainedMinimizers are a little funny though, since they run multiple minimizers in order, which affects the produced fit_result.
4. Great idea. Combined with 2, maybe you can parametrize the tests with a dict of expected attribute/value pairs, and use `getattr`: `for attr_name, value in attributes.items(): assert getattr(fit_result, attr_name) == pytest.approx(value)`
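A hedged sketch of that attribute-dict idea; the SimpleNamespace stand-ins, attribute names and numbers are invented for illustration and only show the getattr mechanism.

```python
from types import SimpleNamespace

import pytest

# Stand-ins for real FitResults objects; attribute names/values are placeholders.
RESULTS = {
    'fit_result': SimpleNamespace(r_squared=0.98, objective_value=1.2),
    'likelihood': SimpleNamespace(objective_value=3.4),  # deliberately no r_squared
}


@pytest.mark.parametrize('result_name, expected_attrs', [
    ('fit_result', {'r_squared': 0.98, 'objective_value': 1.2}),
    ('likelihood', {'objective_value': 3.4}),
])
def test_fit_result_attributes(result_name, expected_attrs):
    fit_result = RESULTS[result_name]
    # Only the attributes listed for this result are checked, so results
    # without r_squared (e.g. likelihood) simply omit it from the dict.
    for attr_name, value in expected_attrs.items():
        assert getattr(fit_result, attr_name) == pytest.approx(value)
```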
I uploaded a first draft so we can discuss with examples (with the failures described and without comments) :D
1. Because we use the constraint `Eq(a, b)` in two fits, the results don't fit perfectly to the datasets. And likelihood is likelihood :P
2. & 4. A dict for the fit results would be helpful too, I guess. But I like the idea of `getattr`!
3. So ChainedMinimizer is the special kid in the family of Minimizers ;)

In any case, we should calculate the results with WolframAlpha to get a, b, r², ...
Hi guys, thanks for keeping up the good work, I've been a bit swamped with work regarding the last stages of my PhD so I've struggled to find time to look into this. I trust @pckroon's verdict on this.
Be sure to remove the German comments from the code before merging though, that's just madness! :P
p.s. I'm loving the reduction in lines of code, is this achieved without loss of functionality?
So ChainedMinimizer is the special kid in the family of Minimizers ;)
Definitely.
In any case, we should calculate the results with WolframAlpha to get a, b, r², ...
The values as they are in the tests now should be correct/validated somehow. But feel free to double check them.
p.s. I'm loving the reduction in lines of code, is this achieved without loss of functionality?
Yup. That's the magic of pytest parametrization and one of the reasons why I've been pushing for this. In addition, it helps check that you cover all combinations.
Be sure to remove the German comments from the code before merging though, that's just madness! :P
Of course :D I pushed a draft including my todos :)