sbi icon indicating copy to clipboard operation
sbi copied to clipboard

Reuse npse_trained_model fixure for test_npse_map

Open vagechirkov opened this issue 8 months ago β€’ 11 comments

#1428

vagechirkov avatar Mar 17 '25 16:03 vagechirkov

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 84.82%. Comparing base (4d1dc5f) to head (e1fe4bd). :warning: Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1461      +/-   ##
==========================================
- Coverage   87.77%   84.82%   -2.96%     
==========================================
  Files         134      137       +3     
  Lines       11126    11473     +347     
==========================================
- Hits         9766     9732      -34     
- Misses       1360     1741     +381     
Flag Coverage Ξ”
unittests 84.82% <ΓΈ> (-2.96%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more. see 48 files with indirect coverage changes

codecov[bot] avatar Mar 17 '25 16:03 codecov[bot]

CI is failing due to testmon error. this will be fixed in a separate PR

janfb avatar Mar 18 '25 09:03 janfb

What is the speed up you notice through this change?

Basically, we save the time of training a model specifically for `test_npse_map'. But now it would be much cheaper to add more tests that can reuse specific trained models.

vagechirkov avatar Mar 18 '25 09:03 vagechirkov

One concern I have @vagechirkov is that now all the tests here are marked as slow, right?

Is there a way to have one of the c2st tests "not slow"? e.g., the one with the default settings we use in the class?

janfb avatar Mar 18 '25 09:03 janfb

Hey @janfb, I think this is more or less how the final test structure should look like, so ready for review πŸ™ƒ

vagechirkov avatar Mar 20 '25 10:03 vagechirkov

Hey @janfb, I think this is more or less how the final test structure should look like, so ready for review πŸ™ƒ

Cool, I will review it now.

janfb avatar Mar 20 '25 10:03 janfb

An issue with snapshot tests: https://github.com/syrupy-project/syrupy/issues/438

vagechirkov avatar Mar 20 '25 14:03 vagechirkov

@vagechirkov @janfb my suggestion for this PR would be:

  • revert to the previous version of the test file linearGaussian_npse_test.py
  • add pytest.mark.slow to the tests
  • add the regression test (rename from test_snapshot -> linearGaussian_npse_test_regression.py)
  • remove the ambr file (and also clean the history, as I forgot to put it into lfs, my bad:(...)

This would not show the refactoring of the test (which we aimed for at the beginning), but at least it shows the usage of pytest-regression. What do you think?

schroedk avatar Mar 21 '25 18:03 schroedk

Alright, I think I’ve finally tracked down the issue πŸŽ‰πŸ«  (was seriously starting to question my sanity). Turns out it’s all about the combination of parameters β€” nothing to do with how we wrote the tests πŸ€ͺ I used the pytest-timeout plugin with @pytest.mark.timeout(60) to catch it. These are the tests that take longer than 60 seconds to complete:

  • uniform_prior-dim_3-ve-fnpe-sde-trials_8
  • uniform_prior-dim_3-ve-auto_gauss-sde-trials_8
  • uniform_prior-dim_3-ve-jac_gauss-sde-trials_8
  • uniform_prior-dim_3-vp-fnpe-sde-trials_8
  • uniform_prior-dim_3-subvp-fnpe-sde-trials_8

This means, in turn, that this seems to be a separate issue. What do you think, @janfb?

P.S. The reason we didn’t hit this issue in test_spnapshot.py is because there’s actually a bug there β€” we’re not really using the number of trials πŸ™ˆ @schroedk

vagechirkov avatar Mar 21 '25 18:03 vagechirkov

Here is the longer list of slow regression tests:

image

vagechirkov avatar Mar 21 '25 18:03 vagechirkov

Very nice refactoring, thanks @schroedk! Regression tests run locally for me in 4min without the problem reported here: #1473

vagechirkov avatar Mar 23 '25 18:03 vagechirkov

@janfb, great! I am happy to help finish implementing this issue! I'm down to have a quick meeting about this πŸ™‚

vagechirkov avatar Oct 20 '25 09:10 vagechirkov

@janfb, great! I am happy to help finish implementing this issue! I'm down to have a quick meeting about this πŸ™‚

Awesome, let's aim for a meeting this week. How about some time Thu after 10:30? or Friday before noon?

@schroedk @vagechirkov

janfb avatar Oct 21 '25 13:10 janfb

@janfb, great! I am happy to help finish implementing this issue! I'm down to have a quick meeting about this πŸ™‚

Awesome, let's aim for a meeting this week. How about some time Thu after 10:30? or Friday before noon?

@schroedk @vagechirkov

Friday before noon works for me πŸ™‚

vagechirkov avatar Oct 21 '25 14:10 vagechirkov

@janfb, great! I am happy to help finish implementing this issue! I'm down to have a quick meeting about this πŸ™‚

Awesome, let's aim for a meeting this week. How about some time Thu after 10:30? or Friday before noon? @schroedk @vagechirkov

Friday before noon works for me πŸ™‚

Cool! I reached out on Discord and suggested 9am.

janfb avatar Oct 23 '25 19:10 janfb

@vagechirkov I now managed to push the merge of main and some small type fixes I did locally to get this running (pushing was blocked by git-lfs, turning it off temporarily fixed it).

Feel free to undo the changes in the recent commit, some of them are obsolete.

janfb avatar Oct 24 '25 08:10 janfb

I copied the regression tests to linearGaussian_vector_field_test.py, and continuous integration seems happy with them so far πŸ™ƒ The old tests in linearGaussian_vector_field_test.py now need to be refactored. I will work on this in the next few days.

vagechirkov avatar Nov 04 '25 15:11 vagechirkov

Sounds great πŸš€ let me know if you need any input from my side.

janfb avatar Nov 05 '25 08:11 janfb