pymc icon indicating copy to clipboard operation
pymc copied to clipboard

"TypeError: cannot pickle 'fortran' object" for sampling with multiple cores

Open mathtimm opened this issue 3 years ago • 5 comments

I am getting a TypeError when I try to sample with multiple cores, the issue does not appear if cores=1. I am working on the modeling of exoplanet transits using the code exoplanet that makes use of PyMC3.

Complete error traceback
`---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [10], in <cell line: 3>()
      1 np.random.seed(42)
      3 with model:
----> 4     trace = pm.sample(
      5         tune=3000,
      6         draws=4000,
      7         start=opt,
      8         cores=3,
      9         chains=2,
     10         init="adapt_full",
     11         target_accept=0.9,
     12         return_inferencedata=True
     13     )

File /opt/homebrew/Caskroom/miniforge/base/envs/prose-env3/lib/python3.9/site-packages/pymc3/sampling.py:559, in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, pickle_backend, **kwargs)
    557 _print_step_hierarchy(step)
    558 try:
--> 559     trace = _mp_sample(**sample_args, **parallel_args)
    560 except pickle.PickleError:
    561     _log.warning("Could not pickle model, sampling singlethreaded.")

File /opt/homebrew/Caskroom/miniforge/base/envs/prose-env3/lib/python3.9/site-packages/pymc3/sampling.py:1461, in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, callback, discard_tuned_samples, mp_ctx, pickle_backend, **kwargs)
   1458         strace.setup(draws + tune, idx + chain)
   1459     traces.append(strace)
-> 1461 sampler = ps.ParallelSampler(
   1462     draws,
   1463     tune,
   1464     chains,
   1465     cores,
   1466     random_seed,
   1467     start,
   1468     step,
   1469     chain,
   1470     progressbar,
   1471     mp_ctx=mp_ctx,
   1472     pickle_backend=pickle_backend,
   1473 )
   1474 try:
   1475     try:

File /opt/homebrew/Caskroom/miniforge/base/envs/prose-env3/lib/python3.9/site-packages/pymc3/parallel_sampling.py:423, in ParallelSampler.__init__(self, draws, tune, chains, cores, seeds, start_points, step_method, start_chain_num, progressbar, mp_ctx, pickle_backend)
    421 if mp_ctx.get_start_method() != "fork":
    422     if pickle_backend == "pickle":
--> 423         step_method_pickled = pickle.dumps(step_method, protocol=-1)
    424     elif pickle_backend == "dill":
    425         try:

TypeError: cannot pickle 'fortran' object`
Could this be a version issue ? I am having trouble finding the right combination of pymc3, pymc3-ext and exoplanet.

Thanks a lot in advance !

Versions and main components

  • PyMC3 Version: 3.11.4
  • pymc3-ext Version: 0.1.1
  • aesara-theano-fallback Version: 0.1.0
  • theano-pymc Version: 1.1.2
  • exoplanet Version: 0.5.2
  • Python Version: 3.9.13
  • Operating system: MacOS Monterey (12.2.1), M1 chip
  • How did you install PyMC3: pip

mathtimm avatar Jun 07 '22 17:06 mathtimm

By any chance, could you share your model code with us? We've had reports similar to this on Mac M1 chips, but haven't been able to reproduce the error.

lucianopaz avatar Jun 07 '22 20:06 lucianopaz

Yes of course, here it is. It's for 13 distinct light curves obtained with various instruments and in multiple filters.

` with pm.Model() as model:

# Stellar parameters
# -----------------
r_s = pm.Normal('r_s',0.326568  ,0.00988657)
m_s = pm.Normal('m_s',0.30713 ,0.0202867)


# Orbital parameters
# -----------------
t0= pm.Normal('t0',2459384.5731,0.005)
p = pm.Normal('P',16.336545,0.001) 
b = pm.Uniform("b",0.,1.,testval=0.4)

# Keplerian orbit
# ---------------
orbit = xo.orbits.KeplerianOrbit(period=p, t0=t0, r_star=r_s, b=b, m_star=m_s)
pm.Deterministic("a",orbit.a)
pm.Deterministic ('i',orbit.incl * 180/np.pi)
pm.Deterministic('a/r_s', orbit.a / orbit.r_star)

depths =[]
names=[]

# Loop over the instruments
# -------------------------
#Loop over the filters
for i, (filt,ldc) in enumerate(filters.items()):

    # The limb darkening
    u = xo.QuadLimbDark(f"u_{filt}",testval=np.array(ldc))
    star = xo.LimbDarkLightCurve(u)
    
    # Radius 
    depth = pm.Uniform(f"depth_{filt}", 1e-3, 9e-3,testval=4e-3) 
    ror = pm.Deterministic(f"ror_{filt}",star.get_ror_from_approx_transit_depth(depth, b))
    r_p = pm.Deterministic(f"r_p_{filt}", ror * r_s) # In solar radius 
    r = pm.Deterministic(f'r_{filt}',r_p*1/earth2sun)
    
    #Loop over all the light curves
    for n, (name, (x, y, yerr, texp, X,f)) in enumerate(data[filt].items()):
        names.append(name)
        if filt == 'zp':
            alpha = pm.Normal(f'alpha_{name}',0.86,0.001)
            y_p = y*(1+alpha) - alpha
        else:
            y_p=y

        # starry light-curve
        light_curves = star.get_light_curve(orbit=orbit, r=r_p, t=x, texp=texp/24/60/60) 
        transit = pm.Deterministic(f"transit_{name}", pm.math.sum(light_curves, axis=-1))

        # Systematics and final model
        c=np.linalg.lstsq(X.T,y,rcond=None)[0]
        w = pm.Flat(f'w_{name}',shape=len(X),testval=np.array(c))
        systematics = pm.Deterministic(f'systematics_{name}',[email protected]())
        mu = pm.Deterministic(f"mu_{name}", transit+systematics)

        # Likelihood function
        sigma = pm.Uniform(f"sigma_{name}",0.5,4,testval=f) # Adjustment of the errors
        pm.Normal(f"obs_{name}", mu=mu, sd=yerr*sigma, observed=y_p)

# Maximum a posteriori
# --------------------
opt = pmx.optimize(start=model.test_point)`

mathtimm avatar Jun 08 '22 07:06 mathtimm

I created a new clean environment and installed exoplanet and exoplanet-ext with pip, the problem seems fixed now. The only difference in terms of version is that I am running the 3.11.5 of PyMC3, but I tried upgrading the package in the other environment and the bug still happens. I'm not sure what is happening but it might not be related to PyMC3 but rather to exoplanet and the compatibility with PyMC3 and Theano-PyMC ?

mathtimm avatar Jun 08 '22 10:06 mathtimm

From the other reports that we had, it seems related to BLAS, scipy and the cholesky decomposition. We haven’t identified the exact configuration options that trigger the error though, but we think that at its core, what is failing is related to this:

from scipy.linalg import lapack
import cloudpickle


cloudpickle.dumps(lapack.cpotrf)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [39], in <cell line: 5>()
      1 from scipy.linalg import lapack
      2 import cloudpickle
----> 5 cloudpickle.dumps(lapack.cpotrf)

File ~/anaconda3/envs/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle_fast.py:73, in dumps(obj, protocol, buffer_callback)
     69 with io.BytesIO() as file:
     70     cp = CloudPickler(
     71         file, protocol=protocol, buffer_callback=buffer_callback
     72     )
---> 73     cp.dump(obj)
     74     return file.getvalue()

File ~/anaconda3/envs/test_env/lib/python3.10/site-packages/cloudpickle/cloudpickle_fast.py:633, in CloudPickler.dump(self, obj)
    631 def dump(self, obj):
    632     try:
--> 633         return Pickler.dump(self, obj)
    634     except RuntimeError as e:
    635         if "recursion" in e.args[0]:

TypeError: cannot pickle 'fortran' object

lucianopaz avatar Jun 08 '22 17:06 lucianopaz

It looks like this problem might be solved once numpy/numpy#21767 is fixed.

lucianopaz avatar Jun 23 '22 13:06 lucianopaz