jwst icon indicating copy to clipboard operation
jwst copied to clipboard

JP-2694: reset input model after Source Catalog Step

Open cshanahan1 opened this issue 1 year ago • 6 comments

A copy of the input model is not made in SourceCatalogStep since there is no return model (just the table). In the step, the units of the data/err arrays are converted and the background is subtracted from the data. If someone attempts successive runs of the step, it will fail because of these changes to the input.

This PR resets the model's data/err arrays and metadata information about units so that the input remains unchanged.

Resolves JP-2694

cshanahan1 avatar Jul 26 '22 17:07 cshanahan1

Codecov Report

Base: 79.72% // Head: 79.57% // Decreases project coverage by -0.15% :warning:

Coverage data is based on head (3253c34) compared to base (eed296d). Patch coverage: 100.00% of modified lines in pull request are covered.

:exclamation: Current head 3253c34 differs from pull request most recent head 4e4a433. Consider uploading reports for the commit 4e4a433 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6942      +/-   ##
==========================================
- Coverage   79.72%   79.57%   -0.16%     
==========================================
  Files         411      414       +3     
  Lines       37270    37601     +331     
==========================================
+ Hits        29713    29920     +207     
- Misses       7557     7681     +124     
Flag Coverage Δ *Carryforward flag
nightly 79.43% <100.00%> (-0.28%) :arrow_down: Carriedforward from 565f37e
unit 53.22% <100.00%> (-0.02%) :arrow_down:

*This pull request uses carry forward flags. Click here to find out more.

Impacted Files Coverage Δ
jwst/source_catalog/source_catalog.py 98.93% <100.00%> (+0.58%) :arrow_up:
jwst/source_catalog/source_catalog_step.py 90.90% <100.00%> (-2.64%) :arrow_down:
jwst/lib/engdb_mast.py 38.52% <0.00%> (-6.33%) :arrow_down:
jwst/assign_wcs/niriss.py 70.45% <0.00%> (-6.25%) :arrow_down:
jwst/associations/lib/rules_level2b.py 96.70% <0.00%> (-3.30%) :arrow_down:
jwst/ami/utils.py 86.37% <0.00%> (-1.88%) :arrow_down:
jwst/lib/engdb_lib.py 78.12% <0.00%> (-1.88%) :arrow_down:
jwst/tweakreg/tweakreg_step.py 63.84% <0.00%> (-1.55%) :arrow_down:
jwst/assign_wcs/assign_wcs.py 79.62% <0.00%> (-1.08%) :arrow_down:
... and 44 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Jul 26 '22 18:07 codecov[bot]

Don't forget to add the JP ticket number

nden avatar Jul 26 '22 18:07 nden

The RTD failure is unrelated. It appears the RTD build has been failing for the past ~3 days.

larrybradley avatar Jul 26 '22 19:07 larrybradley

Wouldn't it be way simpler to just go ahead and make a copy of the input datamodel and work on the copy? I know we've been trying to avoid making too many, unnecessary copies of data in order to not overload memory usage, but for this step it's a single image being used as input. Granted, that image could be a big mosaic, but still just a single ImageModel.

You don't even need to copy the whole data model, just the two or three data arrays from it that are actually used in the step.

hbushouse avatar Jul 27 '22 13:07 hbushouse

@hbushouse i had originally just made it a copy but larry suggested avoiding that if possible. what do you think? it would just need to copy the data/err arrays, then reset the unit in the header.

cshanahan1 avatar Jul 27 '22 13:07 cshanahan1

I wanted to avoid making copies if possible because the mosaics could be very large. Copying requires more memory usage and more cpu time, which I thought was unnecessary given this is the last step of the pipeline. I hadn't considered that an offline (not operations) user would want to re-run this step multiple times in succession -- but I don't really know if that's a common use case. The changes here are straightforward (and all done in place).

larrybradley avatar Jul 27 '22 14:07 larrybradley

in the original test i wrote for this, which was failing, there were a good number of pixels (all located in the bright source in the test image) that exceeded the 1.e-7 threshold to pass assert_allclose. i didn't want to just brush a potential issue under the rug by increasing the tolerance, but after discussing with @larrybradley we decided that the discrepancy was probably because the input data is float32 while the background array is float64. i increased the tolerance to 5.e-7 and the test passes.

cshanahan1 avatar Sep 15 '22 17:09 cshanahan1