IterativePSFPhotometry: high memory usage due to a deepcopy?
Thanks for developing the `PSFPhotometry` and `IterativePSFPhotometry` classes! I have been using them to extract stars from JWST images and they appear to be working quite well. However, I noticed that the `IterativePSFPhotometry` class takes up a lot of memory, making it very difficult to apply to dense star fields (~50,000-60,000 stars). It appears to use significantly more memory than v1.8's `IterativelySubtractedPSFPhotometry` class, which is now deprecated.
I think the issue might be that there is a `deepcopy` call inside `IterativePSFPhotometry` that duplicates the `PSFPhotometry` object after a round of star fitting is completed. This effectively saves the output from that iteration, and the `PSFPhotometry` object is then reused for the next round of star fitting; see the sketch below.
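To illustrate, here is a minimal sketch of the pattern I mean (illustrative only, not the actual photutils source; the class and attribute names are made up):

```python
from copy import deepcopy

class IterativePSFPhotometrySketch:
    """Illustrative sketch only, not the real photutils class."""

    def __init__(self, psfphot):
        self._psfphot = psfphot   # a configured PSFPhotometry instance
        self.fit_results = []

    def __call__(self, data):
        phot_tbl = self._psfphot(data)
        # After fitting, the whole PSFPhotometry object, including all of
        # the per-star fit state it now holds, is deep-copied so that this
        # iteration's results survive the next round of fitting. This copy
        # is what appears to drive the memory spike.
        self.fit_results.append(deepcopy(self._psfphot))
        return phot_tbl
```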
To show this, I attach a plot of memory usage vs. time for a single iteration of star finding with the `IterativePSFPhotometry` object (`maxiters=1`, `grouper=None`, using a WebbPSF PSF model), run on a sub-image containing ~5,500 stars. The long positive slope from 25-250 s is the PSF fitting via `PSFPhotometry`, which uses only ~25% more memory than the old `IterativelySubtractedPSFPhotometry` did in v1.8. Afterward, however, there is a sharp spike in memory due to the `deepcopy`. I also attach a screenshot of the line-by-line memory profile of `IterativePSFPhotometry`, which shows this.
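For anyone who wants to reproduce this, the profile can be generated with something like the following sketch using the `memory_profiler` package; `data`, `psf_model`, and `finder` are placeholders for the image, WebbPSF model, and star finder:

```python
# Memory-profiling harness (pip install memory_profiler).
from memory_profiler import profile
from photutils.psf import IterativePSFPhotometry

@profile
def run_one_iteration(data, psf_model, finder, fit_shape=(5, 5)):
    photometry = IterativePSFPhotometry(psf_model, fit_shape, finder,
                                        maxiters=1, grouper=None,
                                        aperture_radius=4)
    return photometry(data)
```

Running the script under `mprof run` / `mprof plot` gives the memory-vs-time curve, and `python -m memory_profiler script.py` gives the line-by-line profile.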
So, is there a way to avoid the `deepcopy` of the `PSFPhotometry` object? Or could the `deepcopy` be done after the object is initialized but before any fitting, so that the fit outputs aren't duplicated as well? Currently I'm using a hack where I initialize a new `PSFPhotometry` object for each star-finding iteration rather than calling (and overwriting the results of) the existing `PSFPhotometry` object; see the sketch below. It isn't pretty, but it seems to work OK. Thanks!
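Roughly, the hack looks like this (a sketch; `data`, `psf_model`, `finder`, and `n_iters` stand in for my actual inputs):

```python
# Workaround sketch: build a fresh PSFPhotometry object per iteration
# instead of reusing (and deep-copying) a single one.
from photutils.psf import PSFPhotometry

fit_shape = (5, 5)
results = []
residual = data.copy()
for _ in range(n_iters):
    psfphot = PSFPhotometry(psf_model, fit_shape, finder=finder,
                            aperture_radius=4)
    results.append(psfphot(residual))
    # Subtract the fitted stars so the next pass can find fainter sources.
    residual = psfphot.make_residual_image(residual, psf_shape=fit_shape)
```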
Python: 3.10
photutils: 1.10.0
astropy: 6.0.0
numpy: 1.25.2
Operating system: macOS 12.5
@mwhosek Many thanks for the detailed report. Can you please run your code and profiling again using the current dev version of Photutils (e.g., `pip install -U "photutils[all] @ git+https://github.com/astropy/photutils.git"`) and report back?
There was a memory leak in copying `GriddedPSFModel` objects that I fixed in https://github.com/astropy/photutils/pull/1679. That fix hasn't made it into a release yet, which would explain why your `PSFPhotometry` objects are so large. They are supposed to be relatively lightweight (even for 60k sources).
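If you want a quick sanity check before re-running the full profile, something like this (a sketch using the third-party `pympler` package; `psf_model` is your `GriddedPSFModel`) shows how large a deep copy of the model actually is:

```python
# Compare the in-memory size of the model and its deep copy
# (pip install pympler).
from copy import deepcopy
from pympler import asizeof

print(f"original model: {asizeof.asizeof(psf_model) / 1e6:.1f} MB")
print(f"deep copy:      {asizeof.asizeof(deepcopy(psf_model)) / 1e6:.1f} MB")
```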
@larrybradley Thanks for the response. The dev version (1.10.1.dev92+g232edaed) works great! The `PSFPhotometry` object now uses only ~0.5 GB, compared to ~10 GB before. This makes life much easier :)

The dev version appears to run ~1.5x slower than 1.10.0, in case that is a cause for concern. But I'd gladly trade the computing time for the memory improvement.
Thanks, @mwhosek. I released v1.11.0 on Friday with the memory fix. I'm curious about your slowdown: my test case (1000 `GriddedPSFModel` sources in a 4k x 4k image) actually ran ~1.4x faster with the new code. In any case, I have ideas for further performance improvements (including multiprocessing).