agents issues

Results 172 agents issues


Does TF-Agents not support XLA?

I have built a dynamic step driver but cannot seem to get it to work with jit_compile=True.

```
driver = Driver()
# Setup driver
# driver.run = tfa_common.function(driver.run, jit_compile=True)
```
...

connor-create

Add ability to pass multiple inputs to a single preprocessing layer in EncodingNetwork

Currently, there is no way to pass multiple input tensors to an individual preprocessing layer. This isn't necessarily a large problem, but for some niche use cases, it can be...

boomanaiden154

[PPO] Different trajectory structures in a data collection test of a custom environment

I am building a PPO agent side by side with the [TF-Agents DQN tutorial](https://colab.research.google.com/github/tensorflow/agents/blob/master/docs/tutorials/1_dqn_tutorial.ipynb#scrollTo=wr1KSAEGG4h9). The idea was checking the basic structures needed for a simple tf-agent to work, and adapting...

HWerneck

PPO Agent with masked actions

Hello, when I'm using PPOAgent with masked actions, I wrap the actor network with `MaskSplitterNetwork` and the value network as well. However, when trying to train the agent, I get...

beytuuh42
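
For readers unfamiliar with action masking, here is a minimal, library-free sketch of the general idea: logits of invalid actions are replaced with negative infinity before the softmax, so those actions receive zero probability. This is not TF-Agents code and does not reproduce what `MaskSplitterNetwork` does internally; the helper name `masked_softmax` and the example values are hypothetical.

```python
import math

def masked_softmax(logits, mask):
    """Softmax over logits, where mask[i] == 0 marks action i as invalid."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Replace logits of invalid actions with -inf so that exp() yields 0.
    masked = [l if m else -math.inf for l, m in zip(logits, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example: three actions, the middle one is not allowed.
probs = masked_softmax([1.0, 2.0, 3.0], [1, 0, 1])
```

The masked action ends up with probability zero, and the remaining probabilities are renormalized over the valid actions only.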

Multiple actions for PPOAgent

Hi, I developed an environment with the following action_spec: `BoundedTensorSpec(shape=(2,), dtype=tf.int32, name='action', minimum=array(0, dtype=int32), maximum=array(65535, dtype=int32))`. Since the two actions are independent, to obtain the action, I use tfp.Independent to...

DavyMorgan
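
As background on what treating the two action dimensions as independent implies: the joint log-probability of the action is the sum of the per-dimension log-probabilities. A minimal pure-Python sketch follows; the function name `joint_log_prob` and the toy probabilities (two values per dimension rather than the 65536 in the spec above) are hypothetical and only illustrate the arithmetic, not the issue author's code.

```python
import math

def joint_log_prob(per_dim_probs, action):
    """Log-probability of a multi-dimensional action with independent dimensions.

    per_dim_probs[d][k] is the probability of choosing value k in dimension d.
    """
    # Independence means the joint probability factorizes, so log-probs add.
    return sum(math.log(probs[a]) for probs, a in zip(per_dim_probs, action))

# Toy example: dimension 0 picks value 0 (p=0.5), dimension 1 picks value 1 (p=0.75).
per_dim_probs = [[0.5, 0.5], [0.25, 0.75]]
lp = joint_log_prob(per_dim_probs, (0, 1))
```

Exponentiating `lp` recovers the product of the two per-dimension probabilities.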

ValueError: DistributedVariable.handle is not available outside the replica context or a `tf.distribute.Strategy.update()` call.

Hi, I tried to implement a DDPG-based actor-critic framework using MirroredStrategy. Without MirroredStrategy, the code runs perfectly fine. The actor network gets created fine, but the error appears on tf.gradients(...)...

arslanqadeer

Help with tf-agents ranking example

Any guidance on tf-agents ranking examples? I am looking at the train_eval_ranking file, but there isn't much documentation on how the ranking environment works, what data is needed, etc.

AndrewR08

How to use the replay buffer in tf_agents for contextual bandit, that predicts and trains on a daily basis

I am using the tf_agents library for a contextual bandits use case. In this use case, predictions (daily range between 20k and 30k predictions, one for each user) are made daily (multiple times...

tejavenkatk

bandits

Hyperlink to V2 examples

Updated reference to `TF2.x` end-to-end examples.

chunduriv

Fixing broken link in `8_networks_tutorial.ipynb`

Updated the broken link for Networks [in this section](https://www.tensorflow.org/agents/tutorials/8_networks_tutorial#network_api).

chunduriv
