9 comments of Kuang-Huei Lee

Do you mean you ran into an error? If so, can you provide error logs for more context?

#216 - for continuous actions

Yes, I think this makes sense. Please make a pull request if you are interested.

Can you provide the results you get on both datasets? Thanks.

It seems like this is the only mismatch between the metadata and the actual file path. I am investigating why this happened and will update the dataset once I have concluded. Thanks.

Hi, many people have been able to reproduce similar results, whether in their published work, in feedback in this repo, or in private messages to me. I myself also cloned the code from this...

What NormalProjectionNetwork does is squash actions with tanh, and actions shouldn't go out of bounds if `scale_distribution=False`. I am not sure why this can happen. Would you like...
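
For context, here is a minimal sketch of how a TF-Agents `NormalProjectionNetwork` is typically configured; the action spec and parameter values are illustrative assumptions, not taken from the issue:

```python
import tensorflow as tf
from tf_agents.networks import normal_projection_network
from tf_agents.specs import tensor_spec

# Hypothetical 1-D continuous action bounded in [-1, 1].
action_spec = tensor_spec.BoundedTensorSpec((1,), tf.float32, -1.0, 1.0)

# mean_transform=tanh_squash_to_spec squashes the distribution mean to the
# spec bounds with tanh; scale_distribution=True would additionally transform
# the sampled distribution itself to respect the bounds.
proj_net = normal_projection_network.NormalProjectionNetwork(
    action_spec,
    mean_transform=normal_projection_network.tanh_squash_to_spec,
    scale_distribution=False)
```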

PPO does not respect action boundaries: https://github.com/openai/baselines/issues/121. The environment is expected to clip action values. DDPG/D4PG clip action values in their policies. SAC handles this nicely with a tanh-squashed action...
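
To illustrate environment-side clipping, here is a minimal sketch using Gym's `ActionWrapper`; the wrapper name and environment are my own examples, not from the thread:

```python
import gym
import numpy as np

class ClipAction(gym.ActionWrapper):
    """Clip incoming actions to the action-space bounds before stepping."""

    def action(self, action):
        return np.clip(action, self.action_space.low, self.action_space.high)

# Usage: out-of-range actions from the agent are silently clipped.
env = ClipAction(gym.make('Pendulum-v0'))
```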

So TF-Agents DDPG does clipping in its policy: https://github.com/tensorflow/agents/blob/master/tf_agents/agents/ddpg/ddpg_agent.py#L166. If you are using DDPG, you should be good. If you are using TF-Agents PPO, you should use the ActionClipWrapper that @oars...
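
A minimal usage sketch for `ActionClipWrapper`, assuming a standard TF-Agents setup; the environment name is just an example:

```python
from tf_agents.environments import suite_gym, tf_py_environment, wrappers

# Wrap the Python environment so out-of-bounds actions are clipped to the
# action spec before they reach the underlying environment.
py_env = wrappers.ActionClipWrapper(suite_gym.load('Pendulum-v0'))
tf_env = tf_py_environment.TFPyEnvironment(py_env)
```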