
Compound models use excessive memory

keflavich opened this issue 1 year ago

Description

Creating a large compound model uses a huge amount of memory and/or hits the maximum recursion depth. Even compound models with relatively few components use a surprisingly large amount of memory.

Expected behavior

The amount of memory used by compound models should scale linearly with the number of included models.

How to Reproduce

Run the following example:

from astropy.modeling.models import Gaussian2D
from tqdm.auto import tqdm

dao_psf_model = Gaussian2D()

# try copying models a bunch and summing them into one big compound model
models = []
x = 1000000
for i in tqdm(range(x)):
    models.append(dao_psf_model.copy())
    if i == 0:
        psf_model = dao_psf_model
    else:
        psf_model += dao_psf_model  # each += wraps the previous compound model in a new one
print(f"Finished making {x} model copies")

I ran tests with 4, 8, 16, 32 GB memory:

  0%|          | 1299/1000000 [00:26<8:53:21, 31.21it/s]
  0%|          | 1836/1000000 [00:51<12:42:30, 21.82it/s]
  0%|          | 2519/1000000 [03:06<35:56:14,  7.71it/s]
  0%|          | 2987/1000000 [03:19<18:28:11, 14.99it/s]

The first three runs fail by hitting their memory limits. The last one ends with:

Traceback (most recent call last):
  File "/blue/adamginsburg/adamginsburg/repos/astropy/astropy/modeling/core.py", line 4200, in make_subtree_dict
    make_subtree_dict(tree.left, nodepath + "l", tdict, leaflist)
  File "/blue/adamginsburg/adamginsburg/repos/astropy/astropy/modeling/core.py", line 4200, in make_subtree_dict
    make_subtree_dict(tree.left, nodepath + "l", tdict, leaflist)
  File "/blue/adamginsburg/adamginsburg/repos/astropy/astropy/modeling/core.py", line 4200, in make_subtree_dict
    make_subtree_dict(tree.left, nodepath + "l", tdict, leaflist)
  [Previous line repeated 996 more times]
  File "/blue/adamginsburg/adamginsburg/repos/astropy/astropy/modeling/core.py", line 4196, in make_subtree_dict
    if not hasattr(tree, "isleaf"):
RecursionError: maximum recursion depth exceeded

Note that performance degrades rapidly before the memory runs out.

Versions

Tests were run on:

Linux-4.18.0-513.24.1.el8_9.x86_64-x86_64-with-glibc2.28
Python 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
astropy 6.1.dev108+gaa142496ef
Numpy 1.23.5
pyerfa 2.0.0
Scipy 1.9.3
Matplotlib 3.6.2

cc @larrybradley as this affects large groups in photutils.

keflavich avatar Jul 10 '24 20:07 keflavich

I think it might make sense to consider, at least for the additive case, a separate compound class that has a 'flat' structure, i.e. internally it just keeps a list of models to add, with no tree. I think this is a pretty common use case. Or maybe there is a way to work that into the current compound model class.
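
To make the idea concrete, a toy sketch of such a flat structure might look something like this (FlatSum is a hypothetical name, and this ignores fitting, the parameter interface, units, etc.):

from astropy.modeling.models import Gaussian2D


class FlatSum:
    def __init__(self, models):
        # store the constituent models directly; memory stays linear in len(models)
        self.models = list(models)

    def __call__(self, x, y):
        # evaluate each model once and accumulate the results, no binary tree involved
        result = self.models[0](x, y)
        for m in self.models[1:]:
            result = result + m(x, y)
        return result


psf = FlatSum(Gaussian2D(x_mean=i, y_mean=0) for i in range(1000))
print(psf(0.5, 0.0))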

astrofrog avatar Jul 10 '24 21:07 astrofrog

Hello, just wanted to ping on this issue -- it is affecting the downstream use cases (mentioned by @larrybradley above) that require unreasonably large amounts of RAM for PSF grouping, turning what should be simple PSF photometry into an expensive process. @astrofrog Do you think it's possible to implement your idea? It would boost the productivity of crowded-field PSF photometry and many other applications. Thanks!

SterlingYM avatar Jan 17 '25 16:01 SterlingYM

cc @astropy/modeling

pllim avatar Jan 17 '25 20:01 pllim

I think it'd be worth trying the flat structure compound model, I don't have much time this month but might be able to look in February if no one else has time before.

astrofrog avatar Jan 17 '25 22:01 astrofrog

@astrofrog Have you had a chance to look into this?

larrybradley avatar Feb 24 '25 15:02 larrybradley

No not yet, sorry!

astrofrog avatar Feb 24 '25 15:02 astrofrog

@nden just asked me to look into this. @astrofrog's suggestion may be a good one.

I wonder if it is possible to flatten the entire thing rather than having the massively nested tree?

WilliamJamieson avatar Feb 28 '25 16:02 WilliamJamieson

I wonder if model sets can be leveraged. Basically, something like SummationModel(my_model_set).

WilliamJamieson avatar Feb 28 '25 16:02 WilliamJamieson

I think that would only work if all models in the sum are the same type, but maybe that's acceptable: if one wanted, say, a constant plus 1000 Gaussians, we'd use the new class for the 1000 Gaussians and then make a regular compound model of that with the constant.

astrofrog avatar Feb 28 '25 16:02 astrofrog

@larrybradley indicated to me that the case he is interested in is adding many instances of the same model type together.

WilliamJamieson avatar Feb 28 '25 16:02 WilliamJamieson

| @larrybradley indicated to me that the case he is interested in is adding many instances of the same model type together.

The model set may let us vectorize the evaluation, which may also allow for speed ups.
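
As a rough illustration of the kind of vectorized evaluation meant here (the parameter values are made up; the final sum over the model axis is what a SummationModel-style wrapper would presumably hide):

import numpy as np
from astropy.modeling.models import Gaussian2D

# a model set of 3 Gaussians (every parameter is given per-model)
g = Gaussian2D(amplitude=[1, 2, 3],
               x_mean=[0, 1, 2], y_mean=[0, 0, 0],
               x_stddev=[1, 1, 1], y_stddev=[1, 1, 1],
               theta=[0, 0, 0],
               n_models=3)

y, x = np.mgrid[0:25, 0:25]
# model_set_axis=False evaluates every model in the set on the same grid;
# the leading axis of the result indexes the models
images = g(x, y, model_set_axis=False)
summed = images.sum(axis=0)  # the "summation" step, done once, vectorized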

WilliamJamieson avatar Feb 28 '25 16:02 WilliamJamieson

| @larrybradley indicated to me that the case he is interested in is adding many instances of the same model type together.

Yes, the background is assumed to have already been subtracted. For the photutils PSF-fitting case, all the models are the same. However, that may not be true for other users/use cases.

larrybradley avatar Feb 28 '25 16:02 larrybradley

I haven't directly checked, but I'm guessing each sum makes a copy of the previous compound model; if so, the memory (as well as the execution time) grows as the square of the number of models. For a million models, that is a reasonably large number ;-).

One possibility is to develop an addition model that takes a list of models and sums them in one evaluation call (with one call per model, of course). They don't have to be identical models.

A model set approach could be even faster, but is more constraining.

If I'm correct about the cause, would this be acceptable?

perrygreenfield avatar Feb 28 '25 19:02 perrygreenfield

| The model set may let us vectorize the evaluation, which may also allow for speed ups.

That's a good point - note that we don't actually have to solve the general case here. It is perfectly acceptable to make a class that is optimised for identical models, and it doesn't stop us from making another class in the future that is more efficient than CompoundModel but more general (and less performant) than the model set approach. We just have to make the limitations clear in the name of the class.

So I think trying the model set approach first is fine.

astrofrog avatar Feb 28 '25 19:02 astrofrog

More specifically, the growth in memory may come from the mapping of parameters from the constituent models into each intermediate compound model. If it is doing what I think it is, the number of parameter objects grows as n squared.
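
One quick way to check that hypothesis (just a diagnostic sketch, not part of any proposed fix) is to watch how much memory each additional + costs as the chain grows:

import tracemalloc
from astropy.modeling.models import Gaussian2D

tracemalloc.start()
model = Gaussian2D()
last = tracemalloc.get_traced_memory()[0]
for n in range(2, 202):
    model = model + Gaussian2D()
    if n % 50 == 0:
        now = tracemalloc.get_traced_memory()[0]
        # if each checkpoint costs more than the last, total memory is superlinear
        print(f"{n} terms: +{(now - last) / 1e6:.1f} MB since last checkpoint")
        last = now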

perrygreenfield avatar Feb 28 '25 19:02 perrygreenfield

@keflavich I have written https://github.com/WilliamJamieson/photutils/blob/feature/summation_model/photutils/psf/summation_model.py and integrated it into photutils for their psf fitting. It needs some work to be a general solution, but it looks like it works a little better and it can be fit using the astropy fitters.

from astropy.modeling.models import Gaussian2D
from photutils.psf.summation_model import SummationModel


def psf_make_model_set(base_model, n_models):
    # instantiate one copy of the base model to read off its default parameters
    _base_model = base_model()

    # build an astropy model set: each parameter becomes a list of length n_models
    params = {"n_models": n_models}
    for name in _base_model.param_names:
        value = getattr(_base_model, name).value
        params[name] = [value] * n_models

    return base_model(**params)


n_models = 10_000
dao_psf_model_set = psf_make_model_set(Gaussian2D, n_models)
psf_model = SummationModel(dao_psf_model_set)  # works

I found that, using the default Python setup on my local computer, the original reproducer failed even for 1_000 models, let alone the 1_000_000 models requested. For 10_000 models this approach took around ~6 seconds, which is not great, but it is better than a recursion depth error.

Note that I have not explored fitting with a large number of models, but the photutils tests do pass using it with fits of 7+ models, so fitting "works" in a limited sense.

I think this solution is more of a band-aid for the specific PSF fitting case rather than a fix for the general excessive memory use and recursion issues in modeling. During my investigation I did find that the copy at https://github.com/astropy/astropy/blob/a33a9e9e1ec0c4bd5548cf72ed4b982b253c4eed/astropy/modeling/core.py#L728 was a huge source of overhead when simply initializing SummationModel with a large number of terms; if that copy is avoided, initialization is far faster (~10 minutes to initialize 1000 terms vs. ~1 second when the copy is avoided).

I think there are other, deeper issues with how memory grows due to the way parameter mapping is done for compound models, as @perrygreenfield has pointed out. In any case, I think the handling of model parameters is worth further investigation.

WilliamJamieson avatar Mar 05 '25 20:03 WilliamJamieson