Error on running GpuSampler/CpuSampler

Open nazarblch opened this issue 5 years ago • 4 comments

My program fails with the following error when I try to use GpuSampler or CpuSampler (SerialSampler works normally):

XIO: fatal IO error 11 (Resource temporarily unavailable) on X server "localhost:11.0" after 104 requests (104 known processed) with 0 events remaining.

sampler = GpuSampler(
    EnvCls=factory_method,
    TrajInfoCls=TrajInfo,
    env_kwargs=dict(name=game),
    eval_env_kwargs=dict(name=game),
    batch_T=10,
    batch_B=10,
    max_decorrelation_steps=0,
    eval_n_envs=10,
    eval_max_steps=int(10e3),
    eval_max_trajectories=5,
)

runner_cls = MinibatchRlEval if eval else MinibatchRl
runner = runner_cls(
    algo=algo,
    agent=agent,
    sampler=sampler,
    n_steps=5e6,
    log_interval_steps=1e3,
    affinity=dict(cuda_idx=cuda_idx, workers_cpus=list(range(20))),
)

The code fragment is taken from https://github.com/juliusfrost/dreamer-pytorch/blob/master/main_dmc.py.

nazarblch avatar Jul 23 '20 12:07 nazarblch

Hmm, I'm not familiar with that error message. Are the worker processes even able to fork?
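
One way to answer the forking question in isolation is a small standard-library check, run on the same machine and in the same shell session as the training script. This is only a sketch and not part of rlpyt: the helper names `_report_pid` and `can_fork_worker` are made up for illustration, and it assumes a POSIX system where Python's `fork` start method is available.

```python
import multiprocessing as mp
import os
import queue


def _report_pid(results):
    # Runs in the child process: send the child's PID back to the parent.
    results.put(os.getpid())


def can_fork_worker(timeout=10.0):
    """Return True if a forked child starts, reports back, and exits cleanly.

    Hypothetical helper for illustration only; it does not touch rlpyt.
    """
    ctx = mp.get_context("fork")  # explicitly request fork (POSIX only)
    results = ctx.Queue()
    proc = ctx.Process(target=_report_pid, args=(results,))
    proc.start()
    try:
        child_pid = results.get(timeout=timeout)
    except queue.Empty:
        child_pid = None  # the child never reported back
    proc.join(timeout)
    if proc.is_alive():
        # The child is stuck; clean it up so the check itself does not hang.
        proc.terminate()
        proc.join()
    return (
        child_pid is not None
        and child_pid != os.getpid()
        and proc.exitcode == 0
    )


if __name__ == "__main__":
    print("fork works:", can_fork_worker())
```

If this check passes but the parallel samplers still fail, plain process forking is probably not the limiting factor, and the cause is more likely something the environment does inside the worker processes.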

astooke avatar Aug 05 '20 18:08 astooke

@nazarblch I am struggling with this problem too. Any solutions? @astooke

csingh27 avatar May 05 '21 19:05 csingh27

I did not solve this problem at the time. The choice of sampler does not actually affect speed much in algorithms that use buffers.

nazarblch avatar May 06 '21 22:05 nazarblch

If your problem relates to the Dreamer algorithm, also consider these other implementations:
https://github.com/yusukeurakami/dreamer-pytorch
https://github.com/ray-project/ray/blob/master/rllib/agents/dreamer/dreamer.py

nazarblch avatar May 06 '21 22:05 nazarblch