
Asynchronous runners with CPU only?

Open definitelyuncertain opened this issue 3 years ago • 2 comments

Hi, I'm trying to run DQN with asynchronous sampling using rlpyt's async sampler and runner classes. However, it looks like they don't work CPU-only and require the presence of a GPU. Here's my code, based on the examples and docs, attempting to use only the CPU:

from rlpyt.samplers.async_.cpu_sampler import AsyncCpuSampler
from rlpyt.algos.dqn.dqn import DQN
from rlpyt.agents.dqn.atari.atari_dqn_agent import AtariDqnAgent
from rlpyt.envs.atari.atari_env import AtariEnv, AtariTrajInfo
from rlpyt.runners.async_rl import AsyncRlEval
from rlpyt.utils.logging.context import logger_context
from rlpyt.utils.launching.affinity import make_affinity

def build_and_train(game="pong", run_ID=0):
    config = dict(
        algo=dict(batch_size=32,
                  min_steps_learn=500,
                  double_dqn=True,
                  prioritized_replay=True),
        sampler=dict(batch_T=1, batch_B=32),
    )
    sampler = AsyncCpuSampler(
        EnvCls=AtariEnv,
        TrajInfoCls=AtariTrajInfo,
        env_kwargs=dict(game=game),
        eval_env_kwargs=dict(game=game),
        max_decorrelation_steps=20,
        eval_n_envs=4,
        eval_max_steps=int(2000),
        eval_max_trajectories=10,
        **config["sampler"],
    )
    algo = DQN(**config["algo"])  # DQN with the overrides from config above.
    agent = AtariDqnAgent()
    affinity = make_affinity(
        run_slot=0,
        n_cpu_core=12,
        n_gpu=0,
        hyperthread_offset=6,
        n_socket=1,
        async_sample=True,
    )
    runner = AsyncRlEval(
        algo=algo,
        agent=agent,
        sampler=sampler,
        n_steps=int(1e5),
        log_interval_steps=500,
        affinity=affinity,
    )
    name = "dqn_" + game
    log_dir = "logs/dqn_test_"
    with logger_context(log_dir, run_ID, name, config):
        runner.train()

if __name__ == "__main__":
    build_and_train()

The error I get is:

Traceback (most recent call last):
  File "test_rlpyt_async.py", line 56, in <module>
    build_and_train()
  File "test_rlpyt_async.py", line 35, in build_and_train
    affinity = make_affinity(
  File ".../rlpyt/utils/launching/affinity.py", line 165, in make_affinity
    return affinity_from_code(encode_affinity(run_slot=run_slot, **kwargs))
  File ".../rlpyt/utils/launching/affinity.py", line 160, in affinity_from_code
    return build_cpu_affinity(run_slot, **aff_params)
TypeError: build_cpu_affinity() got an unexpected keyword argument 'ass'

On the other hand, it works when I change n_gpu to 1. Any idea what could have gone wrong here?

definitelyuncertain · Nov 02 '20

I believe async_sample=True corresponds to asynchronous sampling and optimization. If you just want asynchronous sampling, you can use one of the parallel CPU sampler classes without setting async_sample=True in the affinity.

ankeshanand · Nov 27 '20

What I'm looking for when I say asynchronous sampling is to have the environment interactions happen independently of the DQN learning steps. So it looks like I misspoke, and what I want is in fact asynchronous sampling and optimization.

My understanding, then, is that the parallel CPU sampler doesn't do this; rather, it runs several sampling processes at once, but still synchronously with the optimization steps, roughly as in the pseudocode below. Is that not the case?

definitelyuncertain · Nov 28 '20