TextRL
Errors may occur after changing the batchsize and update interval of the agent
I followed this example:
https://voidful.dev/jupyter/2021/07/25/textrl-elon-musk.html
I wondered why minibatch_size is larger than update_interval, so I modified the call as follows:
before:
agent = actor.agent_ppo(update_interval=10, minibatch_size=2000, epochs=20)
after:
agent = actor.agent_ppo(update_interval=100, minibatch_size=10, epochs=20)
Then, the following error occurs:
ValueError: Expected parameter loc (Tensor of shape (10, 50257)) of distribution Normal(loc: torch.Size([10, 50257]), scale: torch.Size([10, 50257])) to satisfy the constraint Real(), but found invalid values: tensor([[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0', grad_fn=<AddmmBackward0>)
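For reference, here is a rough sketch of how the two settings differ in the number of gradient steps taken per PPO update. It assumes that each update consumes about update_interval × epochs samples in chunks of minibatch_size; that is an assumption about how the agent batches its data, not something confirmed in this thread, and the helper name gradient_steps_per_update is made up for illustration.

```python
import math


def gradient_steps_per_update(update_interval, minibatch_size, epochs):
    """Estimate gradient steps per PPO update.

    Assumption (not verified against the library): one update draws roughly
    update_interval * epochs samples, minibatch_size samples at a time.
    """
    samples = update_interval * epochs
    return math.ceil(samples / minibatch_size)


# Settings from the original example ("before") and the modified call ("after").
before = gradient_steps_per_update(update_interval=10, minibatch_size=2000, epochs=20)
after = gradient_steps_per_update(update_interval=100, minibatch_size=10, epochs=20)

print("before:", before)
print("after:", after)
```

Under that assumption the modified call takes many more gradient steps per update than the original one. That may make training less stable, but it does not by itself establish why the NaN values appear.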
Hi Rongao
Sorry for the late reply. There seems to be a bug at https://github.com/voidful/TextRL/blob/30109590ec2fe395a2e04cca19a46f6895792b88/textrl/actor.py#L21
Maybe you can change that line to use pfrl.policies.SoftmaxCategoricalHead:
https://pfrl.readthedocs.io/en/latest/policies.html
I may need more time to look into this issue further, and I will post any findings here. Thank you for reporting it~
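A minimal sketch of the kind of head suggested above, written in plain PyTorch so it runs without PFRL. CategoricalHead is a hypothetical stand-in that wraps the logits in torch.distributions.Categorical, which is what pfrl.policies.SoftmaxCategoricalHead is understood to do; the hidden size and the single linear layer are illustrative and are not taken from textrl/actor.py.

```python
import torch
from torch import nn


class CategoricalHead(nn.Module):
    """Hypothetical stand-in for a softmax categorical policy head."""

    def forward(self, logits):
        # Treat each row of logits as unnormalized log-probabilities over tokens.
        return torch.distributions.Categorical(logits=logits)


torch.manual_seed(0)

vocab_size = 50257  # matches the second dimension in the error message
hidden_size = 16    # illustrative only
batch_size = 10

# The policy maps a hidden state to a discrete distribution over the vocabulary.
policy = nn.Sequential(nn.Linear(hidden_size, vocab_size), CategoricalHead())

hidden = torch.randn(batch_size, hidden_size)
dist = policy(hidden)

action = dist.sample()            # one token id per row, shape (10,)
log_prob = dist.log_prob(action)  # log-probability of each sampled id, shape (10,)

print(action.shape, log_prob.shape)
```

Since choosing the next token is a discrete action, a categorical distribution over the vocabulary is a natural fit, whereas the error above comes from a Normal distribution built over the same (10, 50257) output.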
I will try it later. Thank you for your suggestion. :)
pfrl.policies.SoftmaxCategoricalHead did help me. Faster and better.
Hey voidful! I am still studying your project, TextRL. I am wondering whether it is possible to add batch decoding when the actor interacts with the environment?
Closed #4 as completed.
It seems to be possible, since PFRL does support batch training. I need to modify the environment a bit to support this feature. I hope to get back to you soon.