
OSError: [Errno 24] Too many open files


General information:

  • emcee version: 2.2.1
  • platform: Ubuntu 18.04
  • installation method (pip/conda/source/other?): pip

Problem description: I'm running a loop of models that I want to fit with emcee. There are 64 models to fit.

Expected behavior: I expect the loop to run to completion and to collect the results.

Actual behavior: After 49 iterations of the loop I get the error:

/usr/lib/python3/dist-packages/emcee/ensemble.py in __init__(self, nwalkers, dim, lnpostfn, a, args, kwargs, postargs, threads, pool, live_dangerously, runtime_sortingfn)
    105 
    106         if self.threads > 1 and self.pool is None:
--> 107             self.pool = InterruptiblePool(self.threads)
    108 
    109     def clear_blobs(self):

/usr/lib/python3/dist-packages/emcee/interruptible_pool.py in __init__(self, processes, initializer, initargs, **kwargs)
     72         new_initializer = functools.partial(_initializer_wrapper, initializer)
     73         super(InterruptiblePool, self).__init__(processes, new_initializer,
---> 74                                                 initargs, **kwargs)
     75 
     76     def map(self, func, iterable, chunksize=None):

/usr/lib/python3.6/multiprocessing/pool.py in __init__(self, processes, initializer, initargs, maxtasksperchild, context)
    172         self._processes = processes
    173         self._pool = []
--> 174         self._repopulate_pool()
    175 
    176         self._worker_handler = threading.Thread(

/usr/lib/python3.6/multiprocessing/pool.py in _repopulate_pool(self)
    237             w.name = w.name.replace('Process', 'PoolWorker')
    238             w.daemon = True
--> 239             w.start()
    240             util.debug('added worker')
    241 

/usr/lib/python3.6/multiprocessing/process.py in start(self)
    103                'daemonic processes are not allowed to have children'
    104         _cleanup()
--> 105         self._popen = self._Popen(self)
    106         self._sentinel = self._popen.sentinel
    107         # Avoid a refcycle if the target function holds an indirect

/usr/lib/python3.6/multiprocessing/context.py in _Popen(process_obj)
    275         def _Popen(process_obj):
    276             from .popen_fork import Popen
--> 277             return Popen(process_obj)
    278 
    279     class SpawnProcess(process.BaseProcess):

/usr/lib/python3.6/multiprocessing/popen_fork.py in __init__(self, process_obj)
     17         util._flush_std_streams()
     18         self.returncode = None
---> 19         self._launch(process_obj)
     20 
     21     def duplicate_for_child(self, fd):

/usr/lib/python3.6/multiprocessing/popen_fork.py in _launch(self, process_obj)
     63     def _launch(self, process_obj):
     64         code = 1
---> 65         parent_r, child_w = os.pipe()
     66         self.pid = os.fork()
     67         if self.pid == 0:

OSError: [Errno 24] Too many open files

What have you tried so far?: Limiting the number of threads in the EnsembleSampler.

Minimal example:

import numpy as np
import emcee

# `somethingelse` is the collection of 64 models being looped over;
# G1, G2, G3 are fixed exponents and x, y, yerr are the data for the
# current model, all defined elsewhere.
for something in somethingelse:
    def lnlike(theta, x, y, yerr):
        a1, a2, a3 = theta
        g1 = G1
        g2 = G2
        g3 = G3
        model = a1*x**g1 + a2*x**g2 + a3*x**g3
        inv_sigma2 = 1.0/yerr**2
        return -0.5*np.sum((y - model)**2*inv_sigma2 - np.log(inv_sigma2))

    def lnprior(theta):
        a1, a2, a3 = theta
        if 6 > a1 > 0 and 2 > a2 > 0 and 0.0003 > a3 > 0:
            return 0.0
        else:
            return -np.inf

    def lnprob(theta, x, y, yerr):
        lp = lnprior(theta)
        if not np.isfinite(lp):
            return -np.inf
        return lp + lnlike(theta, x, y, yerr)

    ndim, nwalkers = 3, 30
    pos = [np.array([np.random.uniform(low=0, high=6),
                     np.random.uniform(low=0, high=2),
                     np.random.uniform(low=0, high=0.0003)])
           for i in range(nwalkers)]
    # it crashes here after 49 iterations of the loop
    sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob,
                                    args=(x, y, yerr), threads=8)
    sampler.run_mcmc(pos, 1400)
    samples = sampler.chain[:, 400:, :].reshape((-1, ndim))
    p1, p2, p3 = map(lambda v: (v[1], v[2] - v[1], v[1] - v[0]),
                     zip(*np.percentile(samples, [16, 50, 84], axis=0)))

Toma-Bad avatar Jul 31 '20 15:07 Toma-Bad

Instead of using the threads argument, you should try the pool interface: docs. The way you have it written, the multiprocessing pool is never closed, so you end up with hundreds of running Python processes!
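
Something along these lines, reusing the names from your example (a sketch; note that pool= replaces threads=, and the __init__ signature in your traceback shows the pool argument is available in 2.2.1):

from multiprocessing import Pool

with Pool(processes=8) as pool:
    # The pool's worker processes and their pipes are released when the
    # with block exits, so descriptors don't accumulate across models.
    sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob,
                                    args=(x, y, yerr), pool=pool)
    sampler.run_mcmc(pos, 1400)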

dfm avatar Aug 01 '20 11:08 dfm

I changed my ulimit from 1024 to 4096 and it worked. The procedure:

Check your open-file-descriptor limit with:

ulimit -n

For me it was 1024; I raised it to 4096:

ulimit -n 4096
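
If you'd rather do the same check and update from inside Python, here is a sketch using the standard-library resource module (Unix only):

import resource

# Errno 24 means the soft limit on open file descriptors was hit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft:", soft, "hard:", hard)

# Raise the soft limit to 4096; it cannot exceed the hard limit
# without elevated privileges.
resource.setrlimit(resource.RLIMIT_NOFILE, (4096, hard))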

anilsh avatar Nov 29 '21 06:11 anilsh