
TF-Agents: A reliable, scalable, and easy-to-use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Results: 174 issues

The docstring for `tf_py_environment.__getattr__` indicates that certain PyEnvironment methods might be incompatible with TF.

```python
def __getattr__(self, name: Text) -> Any:
    """Enables access attributes of the wrapped PyEnvironment. Use with...
```
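For context, the forwarding behavior that docstring describes follows the standard Python `__getattr__` delegation pattern. Here is a minimal, self-contained sketch of that pattern (the class and method names below are illustrative stand-ins, not TF-Agents' actual implementation):

```python
from typing import Any


class PyEnvironment:
    """Stand-in for a Python-side environment (hypothetical, for illustration)."""

    def render(self) -> str:
        return "frame"


class TFPyEnvironmentSketch:
    """Minimal sketch of attribute delegation, not TF-Agents' real wrapper."""

    def __init__(self, env: PyEnvironment):
        # Stored in __dict__, so normal attribute lookup finds it
        # and __getattr__ below never recurses on self._env.
        self._env = env

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails on the wrapper,
        # so everything not defined here falls through to the wrapped
        # environment. Forwarded methods run eagerly in Python, which is
        # why they may be incompatible with TF graph execution.
        return getattr(self._env, name)


wrapped = TFPyEnvironmentSketch(PyEnvironment())
frame = wrapped.render()  # delegated to the wrapped PyEnvironment
```

The key caveat the docstring hints at: anything reached through this delegation bypasses the TF wrapper entirely, so it executes as plain Python rather than as part of a TF graph.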

good first issue
contributions welcome

First of all, thank you for this awesome repo; it saved months of my life ;-) I'm just letting you know that I'm following closely the ongoing implementation of the...

Hi, I have been adapting the DQN tutorial file for a custom environment. It seems to learn fine; however, I have an additional metric that I want to extract and plot...

Several deep RL agents are missing, such as A2C and A3C, which could be added. Further work could also add MARL agents such as MAA2C or MADDPG.

type:feature request

A feature request proposal to add support for Dueling DQN, as suggested in the [paper](https://arxiv.org/pdf/1511.06581.pdf) [Dueling Network Architectures for Deep Reinforcement Learning], which is described as: "The main benefit...
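The core aggregation step that distinguishes a dueling head from a plain Q-network can be sketched with plain NumPy (the function name below is ours, not a TF-Agents API): the network produces a state value V(s) and per-action advantages A(s, a), which are combined with the mean-subtracted formula from the paper.

```python
import numpy as np


def dueling_q_values(value: np.ndarray, advantages: np.ndarray) -> np.ndarray:
    """Combine V(s) and A(s, a) into Q-values via the paper's aggregation:

        Q(s, a) = V(s) + (A(s, a) - mean_a' A(s, a'))

    Subtracting the mean advantage makes the decomposition identifiable.
    Shapes: value is (batch, 1), advantages is (batch, num_actions).
    """
    return value + (advantages - advantages.mean(axis=1, keepdims=True))


v = np.array([[2.0]])                 # state-value stream output
a = np.array([[1.0, -1.0, 0.0]])      # advantage stream output
q = dueling_q_values(v, a)
# mean advantage is 0.0 here, so Q = [[3.0, 1.0, 2.0]]
```

In a real agent, `value` and `advantages` would come from two separate dense heads sharing a common encoder; only this final combination differs from a standard DQN Q-head.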

Hello, I believe the tutorial 1_dqn_tutorial.ipynb has an unnecessary import of dynamic_step_driver. The module is not used at all, so the script runs just fine when the import is commented out. Furthermore, I...

good first issue
contributions welcome

I was trying to load a saved model policy using the following code, but it ends with the errors pasted below:

```
saved_policy = tf.saved_model.load('policy_0')
```

Software versions:
- TensorFlow = 2.3.0
-...

Hello, there is a tutorial on `Checkpointer`: https://github.com/tensorflow/agents/blob/master/docs/tutorials/10_checkpointer_policysaver_tutorial.ipynb In the `Checkpointer` and `Restore checkpoint` sections, I found the following code.

```
train_checkpointer = common.Checkpointer(
    ckpt_dir=checkpoint_dir,
    max_to_keep=1,
    agent=agent,
    policy=agent.policy,
    replay_buffer=replay_buffer,
    global_step=global_step
)
```
...

The time_step_spec function only takes observation_spec and reward_spec array specifications. If the reward_spec specifies a multidimensional array, shouldn't the discount_spec match its shape, or at least accept an argument...
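To illustrate why the shapes matter here, below is a NumPy sketch (all array values are made up) of a Bellman-style target. A scalar per-time-step discount silently broadcasts across every reward dimension, whereas a per-dimension discount would have to match the reward's shape, which is presumably what the question about discount_spec is getting at.

```python
import numpy as np

# Hypothetical multidimensional reward: batch of 2, reward vector of length 3.
reward = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 2.0]])
next_value = np.ones_like(reward)

# A scalar discount broadcasts across all reward dimensions:
scalar_discount = 0.9
target = reward + scalar_discount * next_value
# target[0] == [1.9, 1.4, 0.9]

# A per-dimension discount must have reward's trailing shape (3,) to broadcast:
per_dim_discount = np.array([0.9, 0.5, 0.99])
target_per_dim = reward + per_dim_discount * next_value
# target_per_dim[0] == [1.9, 1.0, 0.99]
```

With a scalar spec for the discount, only the first variant is expressible; supporting the second would require the discount's shape to track the reward's, as the question suggests.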

Suppose I save some policy:

```
saver = PolicySaver(my_policy, batch_size=None)
saver.save('my_policy')
```

And then I load this policy:

```
policy = tf.saved_model.load('../DATA/Multi/policy_9')
```

Now I _**ONLY**_ want the...