
Memory leak when drawing many tfd.GaussianProcess samples?

Open · tom-andersson opened this issue 3 years ago · 3 comments

When instantiating many 2-dimensional GPs with exponentiated quadratic kernels and sampling over several thousand points, I'm getting memory errors: ResourceExhaustedError: failed to allocate memory [Op:Mul]

I am able to produce a minimal working example in Google Colab: https://colab.research.google.com/drive/1yOzrWbyyia3zLirXQ6Z76scf20iryP4W#scrollTo=wNzD5JdxqgOX

Just ensure you have set Runtime -> Change runtime type -> Hardware accelerator = GPU so that the GPU is used.

I provide the code here as well, for convenience:

import numpy as np
import tensorflow_probability as tfp
tfd = tfp.distributions
tfk = tfp.math.psd_kernels
from tqdm import tqdm

# This raises ResourceExhaustedError after 626 iterations
for i in tqdm(range(1000)):
    foo = tfd.GaussianProcess(
        kernel=tfk.ExponentiatedQuadratic(np.float64(1.), np.float64(1.)),
        index_points=np.random.randn(6_500, 2).astype(np.float64),
        observation_noise_variance=.05**2,
    ).sample(seed=i).numpy()

I thought I could avoid this error by reusing the same GaussianProcess object and only calling sample() in the loop, but that also ended up raising a ResourceExhaustedError. This suggests the issue comes from running sample() many times rather than from instantiating the GP objects. Does this point to a memory leak?

tom-andersson · Dec 16 '21 19:12

Hi, a couple of comments here:

Is the intent to draw 1000 samples from a GP parameterized by an ExponentiatedQuadratic(1., 1.) kernel?

If so, I would create the GP object outside the loop. The GP object gets recreated on every iteration, which means the covariance matrix has to be recomputed each time and adds to the memory cost. Making one GP outside the loop and calling gp.sample(seed=i) should suffice.
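
For example, something along these lines should work (a minimal sketch reusing the kernel and index points from your snippet; gp and samples are just illustrative names):

import numpy as np
import tensorflow_probability as tfp
tfd = tfp.distributions
tfk = tfp.math.psd_kernels

# Build the GP once so its covariance over the index points is only set up once.
gp = tfd.GaussianProcess(
    kernel=tfk.ExponentiatedQuadratic(np.float64(1.), np.float64(1.)),
    index_points=np.random.randn(6_500, 2).astype(np.float64),
    observation_noise_variance=.05**2,
)

# Reuse the same object inside the loop and only draw samples.
samples = [gp.sample(seed=i).numpy() for i in range(1000)]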

Note that you can also eliminate the loop if you just do gp.sample(1000, seed=23).

You will get back a Tensor of shape [1000, 6500]; indexing into the first dimension gives you independent samples. This should drastically reduce memory use and also make things much faster, since the samples are generated in a vectorized fashion.
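
Continuing from the gp object in the sketch above, a minimal version of the vectorized call would be:

samples = gp.sample(1000, seed=23)
print(samples.shape)          # (1000, 6500)
one_draw = samples[0]         # each slice along the first axis is an independent sample
samples_np = samples.numpy()  # pull everything back to NumPy in one go, if desired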

srvasude · Mar 22 '22 06:03

Hi @srvasude, thanks very much for the response. You might have missed that in my OP I said I also hit the error when reusing the same GaussianProcess object and only drawing samples in the loop.

However, passing 1000 to sample() did the trick for me! I've updated the Google Colab MWE to demonstrate your solution, and I have also kept the code block with the loop that triggers the ResourceExhaustedError.

I'll leave this issue open because I still think it would be worth tracking down why the memory cost grows when sample() is run many times. But if a TensorFlow team member disagrees, feel free to close.

tom-andersson · Apr 01 '22 17:04

@srvasude

"The GP object gets recreated each time in the loop, which means that the covariance matrix has to be recomputed each time which adds to memory costs. Making one GP outside the loop and calling gp.sample(seed=i) should suffice."

Just curious: why can't the previously computed GP objects be reclaimed by the garbage collector, since they are now out of scope (each one is overwritten by the current "foo" instance)? Could this be improved by updating the class destructor somehow? Thank you.

ikarosilva · Apr 21 '22 13:04