pyunfold

MemoryError in unfolding for many bins

Open fzeiser opened this issue 6 years ago • 2 comments

I tried running the tutorial (Basic API Tutorial) with more bins. Above ~200 bins, I receive a MemoryError when I run iterative_unfold.

I tried another implementation of the iterative Bayesian unfolding algorithm, RooUnfold, which easily runs with thousands of bins, although it takes some time. One difference is that it does not compute the systematic errors.

I now see that creating a covariance matrix with CovPP = np.zeros((cbins * ebins, cbins * ebins)) takes a lot of memory. Still, in multidimensional problems one quickly exceeds 100 bins. So I wonder whether the systematic error calculation could be turned off with a keyword. A more advanced option would be to (optionally) handle the covariance matrix through submatrices, e.g. with Fox's algorithm, or to use numpy.memmap to memory-map a file on disk. This would slow down the calculations, but slow is better than impossible. Otherwise I will have to look for a cluster with some hugemem compute nodes.
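To make the numpy.memmap idea concrete, here is a minimal sketch. It does not use or modify any PyUnfold code: the helper name zeros_on_disk, the file name covpp.dat, and the placeholder values are all hypothetical. It only shows that a disk-backed array can stand in for np.zeros and be filled block by block.

```python
import os
import tempfile

import numpy as np


def zeros_on_disk(shape, path, dtype=np.float64):
    """Hypothetical helper: a disk-backed stand-in for np.zeros(shape)."""
    # mode="w+" creates (or overwrites) the backing file, which starts out zero-filled.
    return np.memmap(path, dtype=dtype, mode="w+", shape=shape)


# Small sizes so the sketch runs anywhere; the issue is about much larger ones.
cbins, ebins = 20, 20
n = cbins * ebins
path = os.path.join(tempfile.mkdtemp(), "covpp.dat")  # hypothetical file name
cov = zeros_on_disk((n, n), path)

# Fill the matrix one block of rows at a time, so only the touched pages
# need to be resident in RAM. The values written here are placeholders.
block = 100
for start in range(0, n, block):
    stop = min(start + block, n)
    idx = np.arange(start, stop)
    cov[idx, idx] = 1.0

cov.flush()  # push pending changes to the file on disk
```

The file on disk is still as large as the dense matrix, so this trades RAM for disk space and I/O time rather than reducing the total size.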

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-181-14949ba7442b> in <module>()
      8                                     efficiencies=efficiencies,
      9                                     efficiencies_err=efficiencies_err,
---> 10                                     callbacks=[Logger()])

/usr/local/lib/python3.5/dist-packages/pyunfold/unfold.py in iterative_unfold(data, data_err, response, response_err, efficiencies, efficiencies_err, prior, ts, ts_stopping, max_iter, cov_type, return_iterations, callbacks)
    160                               ts_func=ts_func,
    161                               max_iter=max_iter,
--> 162                               callbacks=callbacks)
    163 
    164     if return_iterations:

/usr/local/lib/python3.5/dist-packages/pyunfold/unfold.py in _unfold(prior, mixer, ts_func, max_iter, callbacks)
    205         status = {'unfolded': unfolded_n_c,
    206                   'stat_err': mixer.get_stat_err(),
--> 207                   'sys_err': mixer.get_MC_err(),
    208                   'num_iterations': iteration}
    209 

/usr/local/lib/python3.5/dist-packages/pyunfold/mix.py in get_MC_err(self)
     57         """MC (Systematic) Errors
     58         """
---> 59         cvm = self.cov.getVc1()
     60         err = np.sqrt(cvm.diagonal())
     61         return err

/usr/local/lib/python3.5/dist-packages/pyunfold/mix.py in getVc1(self)
    245         """
    246         # Get NObs covariance
--> 247         CovPP = self.getVcPP()
    248         # Get derivative
    249         dcdP = self.dcdP

/usr/local/lib/python3.5/dist-packages/pyunfold/mix.py in getVcPP(self)
    217         ebins = self.ebins
    218 
--> 219         CovPP = np.zeros((cbins * ebins, cbins * ebins))
    220 
    221         # Poisson covariance matrix

MemoryError: 
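
For scale, the failing allocation can be estimated directly from its shape. This is a back-of-the-envelope sketch, assuming float64 entries (the np.zeros default) and equal numbers of cause and effect bins; dense_cov_bytes is a hypothetical helper, not part of PyUnfold.

```python
import numpy as np


def dense_cov_bytes(cbins, ebins, dtype=np.float64):
    """Bytes needed for a dense (cbins * ebins) x (cbins * ebins) array."""
    n = cbins * ebins
    return n * n * np.dtype(dtype).itemsize


# Assumption: the same number of cause and effect bins in each case.
for bins in (50, 100, 200, 1000):
    gib = dense_cov_bytes(bins, bins) / 2**30
    print(f"{bins:>5} x {bins:<5} bins -> {gib:,.1f} GiB")
```

With 200 bins on each axis the matrix has 40000 x 40000 entries, which is 12.8 GB (about 11.9 GiB) for this one array alone. The size grows with the fourth power of the bin count, which matches the failure appearing around 200 bins.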

fzeiser · Sep 04 '18 06:09