
Incomplete simtbx tests

Open dagewa opened this issue 5 years ago • 7 comments

The simtbx tests tst_nanoBragg_basic.py and tst_nanoBragg_pilatus.py do not appear to be complete. In both cases the tst_all() function exits before a comparison is made between the simulation and the expected results:

https://github.com/cctbx/cctbx_project/blob/1c6e68e972873ead894c4640af8e386a2550547a/simtbx/nanoBragg/tst_nanoBragg_basic.py#L224-L226

OK, I see now that these tests are not run by simtbx's run_tests.py. Is the intention that the tests would be completed and added to this?

dagewa avatar May 03 '19 09:05 dagewa

Yes, these are "tst"s in the making. Development stalled because I can't find a good way to test the difference between observed and expected images. We are not allowed to store "binary" data in the repository, and md5 sums don't work because random number generation is platform-dependent: I have found at least three different random number sequences from the same seed, depending on Mac vs Windows vs Linux, and on which of two versions of glibc is used on Linux. This is mostly due to the implementation of exp().

Autoindexing is not a good test either, because it requires roping in DIALS, which is a different project, and a test can't have dependencies outside the project. Autoindexing is also very sensitive to small changes in the images: changing only the crystal orientation (tst_nanoBragg_basic.py, random seed 1234), I get a 75% success rate in autoindexing. Not sure why that is, but I imagine it will change over time.
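To illustrate the md5 problem, here is a minimal sketch with synthetic arrays standing in for simulated images: a last-bit difference from a platform's exp() implementation changes the digest entirely, while a tolerance-based comparison absorbs it. Note this only helps for noiseless output; once the platform-dependent random streams themselves diverge, whole-image comparisons of noise-added images fail regardless of tolerance.

```python
# Minimal sketch: why md5 sums fail for floating-point images while a
# tolerance-based comparison survives last-bit platform differences.
# The arrays are synthetic stand-ins for simulated images.
import hashlib
import numpy as np

image_a = np.exp(np.linspace(0.0, 10.0, 1000))  # one platform's exp()
image_b = image_a * (1.0 + 1e-15)               # another's: last-bit jitter

# md5 of the raw bytes: any single-bit difference changes the digest
same_digest = (hashlib.md5(image_a.tobytes()).hexdigest() ==
               hashlib.md5(image_b.tobytes()).hexdigest())
print(same_digest)                               # False

# a relative-tolerance comparison absorbs the jitter
print(np.allclose(image_a, image_b, rtol=1e-9))  # True
```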

So, given all these constraints, the best validation I can think of is to define pixel-by-pixel "test points" to confirm that pixels we know should be strong vs weak actually are. I did this in tst_nanoBragg_mosaic.py, so that is a beginning.
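A minimal sketch of what such a test-point check might look like, assuming the simulated image is available as a numpy array; the coordinates and thresholds below are hypothetical and would come from inspecting a known-good simulation:

```python
import numpy as np

def check_test_points(image, strong_points, weak_points,
                      strong_min=100.0, weak_max=10.0):
    """Assert that pixels expected to be strong/weak actually are."""
    for slow, fast in strong_points:
        assert image[slow, fast] >= strong_min, (
            "pixel (%d, %d) = %g, expected >= %g"
            % (slow, fast, image[slow, fast], strong_min))
    for slow, fast in weak_points:
        assert image[slow, fast] <= weak_max, (
            "pixel (%d, %d) = %g, expected <= %g"
            % (slow, fast, image[slow, fast], weak_max))

# usage, with a synthetic image standing in for nanoBragg output
img = np.zeros((512, 512))
img[100, 200] = 500.0  # a known Bragg peak position
check_test_points(img, strong_points=[(100, 200)], weak_points=[(50, 50)])
```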

jmholton avatar Aug 28 '19 15:08 jmholton

How big are the files we are talking about? How well do they compress? Are they useful for other people?

For example, if they are not too big or compress well, there is the option of storing them in https://github.com/dials/data-files; if they are big or don't compress well, but are useful to others, they could be released on Zenodo and then distributed for tests using dials-data.

Anthchirp avatar Aug 28 '19 20:08 Anthchirp

Thanks Markus,

I made this example, https://github.com/cctbx/cctbx_project/blob/master/simtbx/nanoBragg/brunger_example/run_example.py, to match a real-world XFEL Mar325 image, which I have stored at https://bl831.als.lbl.gov/~jamesh/simtbx/F4_0_00008.mccd.gz (33 MB, 12 MB compressed), so not quite as big as this file: https://github.com/cctbx/cctbx_project/blob/master/scitbx/math/fliege_mayer_900_ylm.py

Noiseless images compress very well (factors of 300 or more), but noise-added images don't.
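A quick way to see that gap, as a sketch with synthetic stand-in images (real detector frames behave similarly):

```python
# Compare compression ratios of a noiseless (mostly constant) image and a
# Poisson-noise image of the same size. Synthetic stand-ins, int32 pixels.
import zlib
import numpy as np

rng = np.random.default_rng(1234)
noiseless = np.zeros((1000, 1000), dtype=np.int32)
noisy = rng.poisson(50.0, (1000, 1000)).astype(np.int32)

for name, img in (("noiseless", noiseless), ("noise-added", noisy)):
    raw = img.tobytes()
    ratio = float(len(raw)) / len(zlib.compress(raw))
    print("%-11s compresses %.0fx" % (name, ratio))
```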

I'm not sure how this works. If a test requires downloading a file from a third-party website and the test is "required", does that mean the Jenkins build will fail (or hang?) if that website is down? I've never been brave enough to try that.
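One way to keep a download-dependent test from hanging or failing the whole build is to bound the fetch with a timeout and skip on any network error. A sketch in pytest terms (the simtbx tests don't necessarily run under pytest, and the helper below is hypothetical):

```python
# Hypothetical helper: fetch the reference image with a bounded timeout,
# skipping the test (rather than failing or hanging) if the site is down.
import socket
import pytest
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve          # Python 2

URL = "https://bl831.als.lbl.gov/~jamesh/simtbx/F4_0_00008.mccd.gz"

def fetch_reference(target="F4_0_00008.mccd.gz"):
    socket.setdefaulttimeout(30)  # bound the wait so the build cannot hang
    try:
        urlretrieve(URL, target)
    except Exception as exc:
        pytest.skip("reference image unavailable: %s" % exc)
    return target
```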

jmholton avatar Aug 28 '19 22:08 jmholton

What if we just picked 20 or 30 ROIs on the image near Bragg spots and stored those ROIs for comparison? We could also make a simtbx_regression module for heavy-duty testing of full images, but it would be optional for users to download, like xfel_regression.
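A minimal sketch of the ROI idea, assuming numpy arrays; the window size, file name, and tolerance are hypothetical (the stored file is still binary, but only a few kB rather than tens of MB):

```python
import numpy as np

HALF = 5  # 11x11 pixel windows around each spot

def extract_rois(image, centers):
    """Cut small windows around the given (slow, fast) spot centers."""
    return dict(("roi_%d" % i,
                 image[s - HALF:s + HALF + 1, f - HALF:f + HALF + 1])
                for i, (s, f) in enumerate(centers))

def save_reference(image, centers, path="reference_rois.npz"):
    """Store the reference ROIs from a known-good simulation."""
    np.savez_compressed(path, **extract_rois(image, centers))

def compare_to_reference(image, centers, path="reference_rois.npz",
                         rtol=0.05):
    """Assert a fresh simulation matches the stored ROIs to tolerance."""
    reference = np.load(path)
    for name, roi in extract_rois(image, centers).items():
        assert np.allclose(roi, reference[name], rtol=rtol), name
```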

Also, for large-file version control I have had success with git-lfs, but I'm not sure if that's a direction we want to go. It is also optional: users who install git-lfs can use it, but if a user doesn't use git-lfs, the large files appear in the repo as text files with a brief description of the contents. It installs pretty well on most servers (I got it running on Cori, for example), but I would consider it still in the development stages.

What happens if we realize the images we store for test comparisons were computed with bugs in the code? Presumably we would recompute the images, but then the history of the repo starts getting polluted with big files. Is this a reason for concern?

dermen avatar Aug 29 '19 18:08 dermen

In general terms I'd discourage checking binary files into the cctbx_project repo itself. xfel_regression is not ideal, as it is not publicly accessible. I don't have an affirmative solution, but I'd suggest focusing on public locations that are structured so that it is not burdensome if bug-fix versions are checked in.


nksauter avatar Aug 29 '19 21:08 nksauter

Over this side of the pond we have addressed this with the dials_data module. It is distributed via pip, available to the public, and is essentially a meta-store, since it points to data on the internet:

https://github.com/dials/data

Adding data is detailed at

https://dials-data.readthedocs.io/en/latest/installation.html

and this allows you to have a common, central data location which can be shared by all users and kept up to date, which I can see being useful for e.g. LBL developers on central systems.

This avoids carrying around two or more copies of every file, as you would with e.g. SVN. @Anthchirp developed this to allow external developers access to all our public test data.

graeme-winter avatar Aug 30 '19 05:08 graeme-winter

As discussed previously, you absolutely don't want binary files in your development repository: as soon as they are in the history of a permanent branch they will stay there forever and will be re-downloaded on every git clone. SVN is slightly less problematic, as it only clutters the server, but it still means that for every 1 MB worth of current files you have to keep 2 MB on your drive. I actually looked into git-lfs but rejected it due to its unpredictable cost, i.e. you pay for downloads as well.

So for @jmholton's concrete example, you would create a dials-data dataset definition, which is a YAML file containing the file's URL and a bit of metadata (example), and submit that as a pull request. It is then automatically validated, and a second YAML file containing file integrity information is created (example). Once both files are included in a dials-data release, you can use the dataset in tests like this:

https://github.com/dials/dials/blob/acb12b9ece8cde7e6d6f3796aa41e624ca2358c4/test/command_line/test_apply_mask.py#L6-L8
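A rough sketch of what a simtbx test using dials-data might then look like, assuming a hypothetical "simtbx_reference_images" dataset had been registered; the dials_data pytest fixture resolves the dataset name to a local directory, downloading the files on first use and reusing them afterwards:

```python
# Sketch only: the dataset name and file below are illustrative, not a
# registered dials-data dataset.
def test_nanoBragg_against_reference(dials_data):
    location = dials_data("simtbx_reference_images")  # hypothetical dataset
    reference = location.join("F4_0_00008.mccd.gz")
    assert reference.check()  # the file is guaranteed present at this point
```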

During testing, dials-data works as follows: if the requested files are already downloaded to disk or are available in a shared location, the test just runs. Otherwise the test transparently downloads the files. If it can't download the files then, depending on the context and your configuration, the test will fail or be skipped, but it will not hang. No datasets are downloaded unless they are requested.

Benefits:

  • It keeps binary data out of the code repository.
  • Only a single copy of the relevant data is held on disk.
  • It allows sharing of this single copy between test runs, between installations, and between machines.
  • All test data are public.
  • All changes to the test data are recorded in the version history.
  • Using the files in tests is super easy.

Anthchirp avatar Aug 30 '19 07:08 Anthchirp