
gp does not own its data

Open pilchat opened this issue 3 years ago • 7 comments

Hi,

I am using emcee to sample a likelihood function to which I pass a celerite2.GaussianProcess object, and I store the chain in an HDF5 file. When I start the emcee sampling from scratch there is no issue, but if I restart the sampling by loading the HDF5 file, then the gp.compute() call throws this error:

  File ".../python3.6/site-packages/celerite2/core.py", line 313, in compute
    self._t, self._diag, c=self._c, a=self._a, U=self._U, V=self._V
  File ".../python3.6/site-packages/celerite2/terms.py", line 157, in get_celerite_matrices
    c.resize(J, refcheck=False)
ValueError: cannot resize this array: it does not own its data

How can I fix it? Thanks

pilchat avatar Sep 23 '22 10:09 pilchat

I expect this has more to do with how you're setting up the model when you restart, rather than anything to do with loading the chain. How are you saving the GP and log-probability function between runs? Are you pickling them? See if you can put together a very simple example that reproduces this issue and we'll see what we can do.

dfm avatar Sep 23 '22 10:09 dfm

You'll find the example below. It's basically a copy/paste of the example in https://celerite2.readthedocs.io/en/latest/tutorials/first/, with the difference that gp.compute() is inside the log_prob function, because in general I need to optimize a jitter term added to yerr (I haven't implemented it in the example). I have also put the sampler in a multiprocessing pool.

The example raises the same exception as my code on the very first run of the sampling; it doesn't even need to load a previous h5 file. I don't understand why, but it would be good to know what I'm doing wrong in the example.

import numpy as np
import celerite2
from celerite2 import terms
import emcee
from multiprocessing import Pool


np.random.seed(42)

t = np.sort(
    np.append(
        np.random.uniform(0, 3.8, 57),
        np.random.uniform(5.5, 10, 68),
    )
)  # The input coordinates must be sorted
yerr = np.random.uniform(0.08, 0.22, len(t))
y = (
    0.2 * (t - 5)
    + np.sin(3 * t + 0.1 * (t - 5) ** 2)
    + yerr * np.random.randn(len(t))
)

true_t = np.linspace(0, 10, 500)
true_y = 0.2 * (true_t - 5) + np.sin(3 * true_t + 0.1 * (true_t - 5) ** 2)



# Quasi-periodic term
term1 = terms.SHOTerm(sigma=1.0, rho=1.0, tau=10.0)

# Non-periodic component
term2 = terms.SHOTerm(sigma=1.0, rho=5.0, Q=0.25)
kernel = term1 + term2

# Setup the GP
gp = celerite2.GaussianProcess(kernel, mean=0.0)
# gp.compute(t, yerr=yerr)

prior_sigma = 2.0

freq = np.linspace(1.0 / 8, 1.0 / 0.3, 500)
omega = 2 * np.pi * freq

def set_params(params, gp):
    gp.mean = params[0]
    theta = np.exp(params[1:])
    gp.kernel = terms.SHOTerm(
        sigma=theta[0], rho=theta[1], tau=theta[2]
    ) + terms.SHOTerm(sigma=theta[3], rho=theta[4], Q=0.25)
    gp.compute(t, diag=yerr**2 + theta[5], quiet=True)
    return gp


# Not used below; leftover from the tutorial's optimization step
def neg_log_like(params, gp):
    gp = set_params(params, gp)
    return -gp.log_likelihood(y)


def log_prob(params, gp, t, yerr):
    gp = set_params(params, gp)
    gp.compute(t, yerr=yerr)

    return (
        gp.log_likelihood(y) - 0.5 * np.sum((params / prior_sigma) ** 2),
        gp.kernel.get_psd(omega),
    )


np.random.seed(5693854)

solnx = np.array([
    4.65372782e-03, -3.41545599e-01, 7.00008984e-01, 1.94361891e+00,
    6.05832782e-01, 3.75539832e+00, -7.87383431e+00,
])

coords = solnx + 1e-5 * np.random.randn(32, len(solnx))

backend = emcee.backends.HDFBackend('backend.h5')

with Pool(processes=4) as samplerPool:

    sampler = emcee.EnsembleSampler(
        coords.shape[0], coords.shape[1], log_prob,
        args=(gp, t, yerr), pool=samplerPool, backend=backend,
    )

    state = sampler.run_mcmc(coords, 200, progress=False)

pilchat avatar Sep 23 '22 12:09 pilchat

I can look more closely later, but what if you use the globally defined gp instead of passing it as an argument? I can see a problem here: even though each subprocess has its own instantiation of the GP, passing it as an argument causes it to be pickled and unpickled, which could cause problems (and is definitely slower!).

dfm avatar Sep 23 '22 13:09 dfm

I can confirm that my suggestion fixes things. This must be something to do with how the arrays are passed around by multiprocessing: the issue is multiprocessing, not emcee. We can leave this issue open for that, but it would be nice to set up a simpler example that doesn't depend on emcee directly.

dfm avatar Sep 23 '22 15:09 dfm

Can you show me your working example? I'm trying to use the gp as a global variable with no success (likely because I'm such a bad coder).


pilchat avatar Sep 23 '22 15:09 pilchat

Sure thing! Here's how I edited your code: https://gist.github.com/dfm/b2b6cdfc61c4de8b5594e636e3c5c70f

dfm avatar Sep 23 '22 15:09 dfm

I implemented your solution in my code and it fixed the issue. Also, it is ~10% faster! Thanks twice!

As for this issue, you can close it if you prefer, and feel free to simplify and publish my example blaming it on multiprocessing. Unfortunately I wouldn't know how to simplify it myself.

Thanks again

pilchat avatar Sep 23 '22 16:09 pilchat