TextRL Errors may occur after changing the batchsize and update interval of the agent

Open rongaoli opened this issue 1 year ago • 3 comments

I followed this example: https://voidful.dev/jupyter/2021/07/25/textrl-elon-musk.html

I wondered why minibatch_size is larger than update_interval, so I modified the call as follows:

before: agent = actor.agent_ppo(update_interval=10, minibatch_size=2000, epochs=20)
after: agent = actor.agent_ppo(update_interval=100, minibatch_size=10, epochs=20)

Then the following error occurred:

ValueError: Expected parameter loc (Tensor of shape (10, 50257)) of distribution Normal(loc: torch.Size([10, 50257]), scale: torch.Size([10, 50257])) to satisfy the constraint Real(), but found invalid values: tensor([[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0', grad_fn=<AddmmBackward0>)
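For context, the two settings above differ a lot in how many gradient steps each PPO update performs. The following is a rough sketch, assuming each update makes `epochs` passes over the `update_interval` collected transitions in chunks of `minibatch_size`. The helper `ppo_minibatch_steps` is hypothetical and is not part of TextRL or PFRL.

```python
import math


def ppo_minibatch_steps(update_interval, minibatch_size, epochs):
    """Approximate number of minibatch gradient steps per PPO update.

    Assumption: every update makes `epochs` passes over the
    `update_interval` collected transitions, consumed in chunks of
    `minibatch_size` (at least one chunk is always processed).
    """
    return math.ceil(update_interval * epochs / minibatch_size)


# Settings from the original example ("before") and the modified call ("after").
before = ppo_minibatch_steps(update_interval=10, minibatch_size=2000, epochs=20)
after = ppo_minibatch_steps(update_interval=100, minibatch_size=10, epochs=20)
print(before, after)
```

Under this assumption the original settings give a single large step per update, while the modified settings give a couple of hundred small ones. Whether that difference is what produced the NaN values is not established in this thread.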

Jul 22 '22 11:07 rongaoli

Hi Rongao,

Sorry for the late reply. There seems to be a bug at https://github.com/voidful/TextRL/blob/30109590ec2fe395a2e04cca19a46f6895792b88/textrl/actor.py#L21

Maybe you can change it to pfrl.policies.SoftmaxCategoricalHead: https://pfrl.readthedocs.io/en/latest/policies.html

I may need more time to look into this issue further, and I will post any follow-up results here. Thank you for raising the issue.
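As a conceptual illustration of what a softmax categorical head does, here is a minimal standard-library sketch that turns a vector of logits (one per vocabulary token) into a categorical probability distribution. It is not PFRL's implementation; `softmax` and `is_valid_distribution` are hypothetical helper names, and the three-element logit list stands in for the 50257-token vocabulary in the error message.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_valid_distribution(probs):
    """True if all entries are finite and non-negative and they sum to 1."""
    return (
        all(math.isfinite(p) and p >= 0 for p in probs)
        and math.isclose(sum(probs), 1.0)
    )


# Toy logits for a 3-token vocabulary.
probs = softmax([2.0, 1.0, 0.1])
print(is_valid_distribution(probs))

# NaN logits still propagate through softmax, so the result is not valid.
nan_probs = softmax([float("nan"), 1.0, 0.1])
print(is_valid_distribution(nan_probs))
```

A categorical distribution over tokens matches the discrete action space of text generation, which is presumably why this head was suggested. Note that a softmax cannot repair NaN logits coming from the network, so if they reappear the cause lies upstream.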

Aug 01 '22 02:08 voidful

I will try it later. Thank you for your suggestion. :)

Aug 01 '22 06:08 rongaoli

pfrl.policies.SoftmaxCategoricalHead did help me. Faster & Better.

Aug 01 '22 06:08 rongaoli

Hey voidful! I am still studying your project, TextRL. I am wondering whether it is possible to add batch decoding when the actor is interacting with the environment?

------------------ Original message ------------------ From: "voidful/TextRL" @.>; Sent: Monday, December 5, 2022, 5:06 PM @.>; @.@.>; Subject: Re: [voidful/TextRL] Errors may occur after changing the batchsize and update interval of the agent (Issue #4)

Closed #4 as completed.

Jan 04 '23 06:01 rongaoli

It seems possible, since PFRL does support batch training. I need to modify the environment a bit to support this feature. I hope to get back to you soon.

Jan 09 '23 19:01 voidful
