acme
A library of reinforcement learning components and agents
```
... byte-compiling build/bdist.linux-aarch64/egg/acme/wrappers/single_precision.py to single_precision.cpython-39.pyc
byte-compiling build/bdist.linux-aarch64/egg/acme/wrappers/single_precision_test.py to single_precision_test.cpython-39.pyc
byte-compiling build/bdist.linux-aarch64/egg/acme/wrappers/step_limit.py to step_limit.cpython-39.pyc
byte-compiling build/bdist.linux-aarch64/egg/acme/wrappers/video.py to video.cpython-39.pyc
creating build/bdist.linux-aarch64/egg/EGG-INFO
copying dm_acme.egg-info/PKG-INFO -> build/bdist.linux-aarch64/egg/EGG-INFO
copying dm_acme.egg-info/SOURCES.txt -> build/bdist.linux-aarch64/egg/EGG-INFO
copying dm_acme.egg-info/dependency_links.txt ->...
```
Hi, I am using your TensorFlow implementation of a DQN agent in a Docker development environment. I have found that the `Snapshotter` object, which creates the snapshot of the network...
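For context, a minimal sketch of how the TF `Snapshotter` is typically wired up; `network` is assumed to be an already-built `snt.Module`, and the directory and interval shown are illustrative choices:

```python
# Sketch of acme's TF Snapshotter usage; `network` is assumed to be a Sonnet
# module whose variables have already been created.
from acme.tf import savers as tf2_savers

snapshotter = tf2_savers.Snapshotter(
    objects_to_save={'network': network},  # each entry is saved as a TF SavedModel
    directory='~/acme',                    # illustrative snapshot location
    time_delta_minutes=60.0,               # snapshot at most once per hour
)

# Typically called from the learner's step(); it is a no-op until the
# time delta has elapsed.
snapshotter.save()
```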
I have four questions about creating custom distributional RL algorithms with Acme. 1. The only distributional agent I can find (in TF and JAX) is D4PG. Is this correct? 2....
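For context, D4PG's distributional critic is built around a discrete-valued head; a minimal TF sketch follows, where the layer sizes and the value-support bounds are illustrative rather than canonical:

```python
# Sketch of a D4PG-style distributional critic in TF; vmin/vmax/num_atoms
# define the support of the categorical value distribution and are
# illustrative choices.
import sonnet as snt
from acme.tf import networks

critic_network = snt.Sequential([
    networks.CriticMultiplexer(),      # concatenates observation and action
    networks.LayerNormMLP((512, 512), activate_final=True),
    networks.DiscreteValuedHead(vmin=-150.0, vmax=150.0, num_atoms=51),
])
```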
Having read the [docs](https://github.com/deepmind/acme/blob/master/docs/components.md) and the code for the [episode adder](https://github.com/deepmind/acme/blob/master/acme/adders/reverb/episode.py#L51), I still don't quite understand it. Is it just the simplest adder, e.g. one that adds every transition to one long...
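For context, a minimal sketch of driving the episode adder; the Reverb address and `max_sequence_length` are illustrative, and `env` and `policy` are assumed to exist:

```python
# Sketch of the adder loop: EpisodeAdder buffers transitions and writes the
# whole episode to Reverb as a single item when the episode terminates.
import reverb
from acme.adders import reverb as adders

client = reverb.Client('localhost:8000')        # assumed server address
adder = adders.EpisodeAdder(client=client, max_sequence_length=1000)

timestep = env.reset()
adder.add_first(timestep)
while not timestep.last():
    action = policy(timestep.observation)
    timestep = env.step(action)
    adder.add(action, timestep)                 # episode is written on the last step
```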
Hello, I am passing a custom gym environment to DistributedD4PG. Sample code:
```
distributed_agent = DistributedD4PG(
    environment=train_environment,
    networks_dict=agent_networks,
    agent=agent_d3pg,
    obj_func=obj_func,
    critic_loss_type=critic,
    threshold=threshold,
    accelerator="GPU",
    num_actors=2,
    num_caches=0,
    environment_spec=environment_spec,
    batch_size=agent.batch_size,
    n_step=agent.n_step,
    ...
```
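Whatever the remaining arguments are, a custom gym environment usually needs to be adapted to the `dm_env` interface before Acme agents can consume it; a hedged sketch, where `'MyCustomEnv-v0'` is a placeholder id:

```python
# Sketch of adapting a gym environment for Acme agents; the environment id
# is hypothetical.
import gym
from acme import wrappers

environment = gym.make('MyCustomEnv-v0')
environment = wrappers.GymWrapper(environment)              # gym API -> dm_env API
environment = wrappers.SinglePrecisionWrapper(environment)  # cast to float32/int32
```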
Excuse me! First of all, thanks for your work; this is fantastic. But you may need to update the tutorial notebook, as some examples are very old. The quickstart notebook is OK,...
[DM Meltingpot](https://github.com/deepmind/meltingpot) sets the action_spec as a tuple rather than a list. This change is all that's needed to use Acme with Meltingpot - everything else works out of the box.
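A hedged sketch of the kind of change described, i.e. accepting tuple specs wherever Acme currently special-cases lists; `is_multi_agent_spec` is a hypothetical helper, not a function in the Acme codebase:

```python
# Illustrative only: Meltingpot returns its action_spec as a tuple, so any
# check that special-cases a list of per-agent specs should accept tuples too.
def is_multi_agent_spec(action_spec) -> bool:
    # previously the equivalent check would have been: isinstance(action_spec, list)
    return isinstance(action_spec, (list, tuple))
```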
Fix the quickstart notebook so that it works with version 4.1.
- Remove the Acme extra requirement `reverb`, which no longer exists and isn't needed (as `dm-reverb` is installed as...
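A hedged sketch of what the corrected install cell might look like; the remaining extras names are assumptions based on dm-acme 0.4.x, not the notebook's actual contents:

```python
# Notebook cell: the old [reverb] extra is simply dropped, since dm-reverb
# comes in as a dependency of the remaining extras.
!pip install dm-acme[jax,tf,envs]
```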
In the R2D2 learner you sample learning trajectories from Reverb in a format where, at some index `t`, you have observation `x_t`, action `a_t`, reward `r_t`, and the recurrent state of...
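A hedged sketch of how such a sample is typically consumed (variable names and tensor layout are illustrative, not the learner's exact code): the stored recurrent state at the first index seeds the unroll, and hidden states for the rest of the sequence are recomputed.

```python
# Illustrative consumption of a sampled R2D2 sequence: `network` is assumed
# to be a recurrent snt.RNNCore, `observations` a time-major [T, B, ...]
# tensor, and `core_states` the per-step recurrent states stored with it.
import sonnet as snt
import tree

# Seed the unroll with the recurrent state stored at the first index ...
initial_state = tree.map_structure(lambda s: s[0], core_states)
# ... then recompute hidden states along the whole sequence.
q_values, final_state = snt.static_unroll(network, observations, initial_state)
```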
Greetings, I ran into two problems running the MBOP baselines in https://github.com/deepmind/acme/blob/master/examples/offline/run_mbop_jax.py, and I'm looking for help. The first comes from https://github.com/deepmind/acme/blob/c7aac29c40183a191d9c39e66fd80deea9299977/examples/offline/run_mbop_jax.py#L25: there isn't a module called `helpers` under...