GalSim icon indicating copy to clipboard operation
GalSim copied to clipboard

make COSMOSCatalog file treatment more general

Open rmandelb opened this issue 8 years ago • 10 comments

I recently had a reason to make a new version of the 25.2 sample COSMOS catalog with some modifications, and discovered that when I tried to read in and use the new catalog, there were a few issues:

  1. I used the same file naming scheme (real_galaxy_catalog_SAMPLE.fits and similar patterns for the file with selection criteria) just with a different SAMPLE. However, the code is not set up to try this naming scheme if you tell it a new value of sample. It'll just complain that it doesn't know how to deal with that sample. I would like it to try to find files with the default naming scheme for that sample, and only if it can't find them in dir should it say it doesn't know what to do for this sample.

  2. If you give it the file_name kwarg, it tries to find the selection file for the default sample (and issues a warning).

I propose to make minor code updates in scene.py to address both of these issues.

rmandelb avatar Sep 01 '16 00:09 rmandelb

@sowmyakth -

To expand a little on what's needed here:

You can basically trace through how the input sample kwarg is handled to see what might be needed. It needs to change so as to allow other sample values, and check whether files exist that follow our usual naming scheme (if yes, use them; if not, complain). However, it should not go so far as to tell you to try to download any of the helper files for that new sample with galsim_download_cosmos; that action should only be taken if sample WAS in the list of samples that GalSim knows about.

However, there is also some work needed if a filename is used to specify the new sample. I'm suggesting that scene.py should be told what the default file naming scheme is, and try to parse the filename to figure out the sample. Then it can look for selection files and/or parametric fits files for that sample (rather than for the default sample).

Does that make sense?

rmandelb avatar Sep 30 '16 14:09 rmandelb

Cross reference Issue #731. Making the COSMOSNoise stuff more generic might be worth doing at the same time.

rmjarvis avatar Sep 30 '16 16:09 rmjarvis

Yep. Though I was also thinking of generalizing #731 not just to COSMOSNoise but also COSMOSCatalog (-> HSTCatalog, with COSMOSCatalog being allowed but deprecated, and it would just call HSTCatalog).

rmandelb avatar Sep 30 '16 18:09 rmandelb

@rmandelb just checking if I understand the goal here. We want to keep the current setting that if sample is given, the catalogs to be read must be named real_galaxy_catalog_sample.fits,real_galaxy_catalog_sample_fits.fits, real_galaxy_catalog_sample_selection.fits. But if we want a catalog with any other name to be read, that should be input with file_name=any_other_name, the primary catalog being any_other_name.fits. The script must be able to deduce that the parametric fits and selection files will be any_other_name_fits.fits and any_other_name_selection.fits.

Is that right?

sowmyakth avatar Oct 07 '16 01:10 sowmyakth

Just realized I made an error in my own naming scheme.

Catalog with any other name to be read, should be input with file_name='any_other_name.fits', and the catalogs to be read will be any_other_name.fits, any_other_name_fits.fits and any_other_name_selection.fits.

sowmyakth avatar Oct 07 '16 01:10 sowmyakth

The first part of your first message is right (we want to keep the current setting for how sample is treated). The second message is also right (how an arbitrary filename should be handled). Or at least, it's how I was hoping it would work; hopefully the othesr will speak up if they see an obvious flaw in this plan.

rmandelb avatar Oct 07 '16 18:10 rmandelb

The second part with file_name kwarg, the name must have a SAMPLE value: file_name='any_other_name_SAMPLE.fits', and the catalogs to be read will be any_other_name_SAMPLE.fits, any_other_name_SAMPLE_fits.fits and any_other_name_SAMPLE_selection.fits.

We need to be able to parse out SAMPLE because that determines the rescaling flux and size in deep sample, and the exclusion_level with ['marginal', 'bad_stamp']. Right now the scaling factors and cut values are set to a specific value for SAMPLE=23.5, and another for all other values for SAMPLE. Does is still hold when using SAMPLE other than 25.2?

sowmyakth avatar Oct 07 '16 19:10 sowmyakth

Sorry, I just noticed I did not answer this.

The deep sample thing is really specific to the 23.5 sample. I would not even allow that kwarg for any random sample defined by someone else.

The exclusion_level stuff should probably not be applied for a random sample, which could be anything.

rmandelb avatar Oct 16 '16 15:10 rmandelb

@sowmyakth - are you still interested in this? If you have no plans to do this in the near future, then I would like to do it in the coming ~week, so please let me know.

rmandelb avatar Dec 22 '16 14:12 rmandelb

I did item 1 of this issue as part of #755, since it actually made it easier to get test coverage of some bits when there is no normal COSMOS download on disk. Leaving this open, since I think there are still some changes required to get item 2 working the way @rmandelb wants.

rmjarvis avatar May 07 '18 02:05 rmjarvis

Done via #755 and #1174

rmjarvis avatar Aug 15 '22 20:08 rmjarvis