Add missing files to MANIFEST

Open duncanmmacleod opened this issue 1 year ago • 5 comments

This PR updates MANIFEST.in to include all necessary file extensions such that the code snippet from #4103 works end-to-end.

Closes #4103.

WARNING: this increases the size of the distribution from 3.3MB (https://pypi.org/project/PyCBC/2.0.5/#files) to 8.5MB, which may be unacceptable.

duncanmmacleod, Aug 10 '22 12:08

@duncanmmacleod The extra 5MB is due to verification data files (i.e., the tests run the code and check that its output matches these files). I think we don't want to ship these in the distribution (especially if it hits a size limit).

... Probably a better solution is for me to edit the test scripts so that at least the larger files in the test/data directory are pulled from some storage repo when the test is run (similar to how the data gwf files are downloaded). pycbc-config is a likely candidate for where to store these files.

spxiwh, Aug 12 '22 09:08

> @duncanmmacleod The extra 5MB is due to verification data files (i.e., the tests run the code and check that its output matches these files). I think we don't want to ship these in the distribution (especially if it hits a size limit).

The limit on PyPI is 100MB I think, so you're fine for a while. But, the smaller the better.

> ... Probably a better solution is for me to edit the test scripts so that at least the larger files in the test/data directory are pulled from some storage repo when the test is run (similar to how the data gwf files are downloaded). pycbc-config is a likely candidate for where to store these files.

Can the relevant tests be configured to be skipped if the verification data are not present? Or use a mark to enable trivial disabling of the tests?
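
A minimal sketch of the skip-if-missing idea, using only the standard library `unittest` decorators. The file path, helper name, and test class here are hypothetical placeholders for illustration, not names from the PyCBC test suite.

```python
import os
import unittest

# Hypothetical path: the real layout of test/data is not given in this thread.
VERIFICATION_FILE = os.path.join("test", "data", "hypothetical_verification.dat")


def have_verification_data(path=VERIFICATION_FILE):
    """Return True if the verification file is available locally."""
    return os.path.isfile(path)


class VerificationTest(unittest.TestCase):
    # The condition is evaluated once, when the class body is executed.
    @unittest.skipUnless(have_verification_data(),
                         "verification data not present")
    def test_output_matches_verification(self):
        # Placeholder check; a real test would compare freshly computed
        # output against the stored verification data.
        with open(VERIFICATION_FILE) as handle:
            expected = handle.read()
        self.assertTrue(expected)
```

With this pattern a source distribution that omits the data reports the test as skipped instead of failed, so the rest of the suite can still run.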

duncanmmacleod, Aug 12 '22 10:08

I think I'd rather just have these tests pull the data they need at runtime. This better matches what's done in examples where we pull data files from GWOSC (or other repositories) at runtime if needed (or in some cases, generate them on the fly). (See PR now linked)

spxiwh, Aug 12 '22 13:08

I really want a way to run as many tests as possible but not require an external network connection, so if that PR satisfies this, I'm happy.

duncanmmacleod, Aug 12 '22 13:08

I think that after merging #4107 (which I would like to have merged in any case) any test that needs these data files is going to fail without an internet connection.

I then think that using a pytest "mark" seems like the best way to disable tests that need internet. However, as we're still using test files created for unittest, I'm not quite sure how this would work. For example, the test_tmpltbank class downloads data files in the special "setUp" method. Am I okay to wrap that with a pytest decorator?
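
One option that sidesteps decorating "setUp" altogether is to call `skipTest` from inside it. This is plain unittest behaviour, and pytest reports unittest skips as skips. The sketch below assumes a hypothetical opt-out environment variable, `PYCBC_TEST_OFFLINE`, which is not an existing PyCBC setting; the class is a stand-in, not the real test_tmpltbank code.

```python
import os
import unittest


def offline_requested():
    """Return True if the (hypothetical) offline switch is set."""
    return os.environ.get("PYCBC_TEST_OFFLINE", "0") == "1"


class NeedsNetworkTest(unittest.TestCase):
    def setUp(self):
        # Skip every test in this class before any download is attempted.
        if offline_requested():
            self.skipTest("requires network access to download data files")
        # Stand-in for the real download step.
        self.data = "downloaded"

    def test_uses_data(self):
        self.assertEqual(self.data, "downloaded")
```

Running with `PYCBC_TEST_OFFLINE=1` set would then skip the network-dependent tests while everything else still runs. A pytest mark applied to the whole class would be the pytest-native analogue, but I have not checked that against the existing test files.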

spxiwh, Aug 12 '22 13:08

Closing this as I don't want to increase the source size by 2.5 times (or perhaps more now that we removed some of the data files).

spxiwh, Jul 07 '23 20:07