Errors may occur after changing the batchsize and update interval of the agent

Open rongaoli opened this issue 1 year ago • 3 comments

I followed the example at https://voidful.dev/jupyter/2021/07/25/textrl-elon-musk.html and wondered why minibatch_size is larger than update_interval, so I changed the agent as follows:

before: agent = actor.agent_ppo(update_interval=10, minibatch_size=2000, epochs=20)
after: agent = actor.agent_ppo(update_interval=100, minibatch_size=10, epochs=20)

Then the following error occurs:

ValueError: Expected parameter loc (Tensor of shape (10, 50257)) of distribution Normal(loc: torch.Size([10, 50257]), scale: torch.Size([10, 50257])) to satisfy the constraint Real(), but found invalid values: tensor([[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0', grad_fn=<AddmmBackward0>)
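For context on the two settings being swapped: in PPO, update_interval generally controls how many transitions are collected before an update, and minibatch_size how many of them go into each gradient step. The sketch below is only a back-of-the-envelope count under simplifying assumptions (the buffer holds exactly update_interval transitions and each epoch draws every transition once); the function name is hypothetical and this is not PFRL's actual batching code.

```python
import math


def approx_minibatch_steps(update_interval: int, minibatch_size: int, epochs: int) -> int:
    """Rough number of minibatch gradient steps in one PPO update.

    Assumption: the update buffer holds exactly `update_interval` transitions
    and each epoch draws every transition once.
    """
    samples_drawn = update_interval * epochs
    # At least one minibatch is always processed, even if it must be padded
    # by re-drawing the same transitions.
    return max(1, math.ceil(samples_drawn / minibatch_size))


before = approx_minibatch_steps(update_interval=10, minibatch_size=2000, epochs=20)
after = approx_minibatch_steps(update_interval=100, minibatch_size=10, epochs=20)
print(before, after)
```

Under these assumptions the modified setting takes many more gradient steps per update than the original one. That does not by itself explain the NaN values, but it is one way the two configurations differ in how hard they push the policy per update.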

rongaoli avatar Jul 22 '22 11:07 rongaoli

Hi Rongao

Sorry for the late reply. It seems there is a bug at https://github.com/voidful/TextRL/blob/30109590ec2fe395a2e04cca19a46f6895792b88/textrl/actor.py#L21

Maybe you can change it to pfrl.policies.SoftmaxCategoricalHead: https://pfrl.readthedocs.io/en/latest/policies.html
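The error reports a Normal distribution over a tensor with 50257 columns, one per entry of a GPT-2-sized vocabulary. A categorical head instead treats those values as logits over discrete tokens. A minimal plain-Python illustration of that idea follows; it is not PFRL's implementation, and the function names are made up for the example.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, rng):
    """Sample one token index from the categorical distribution over logits."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 0.5, -1.0, 0.0]  # toy scores for a 4-token vocabulary
probs = softmax(logits)
token_id = sample_token(logits, random.Random(0))
```

The sampled action is always a valid token index, which matches a text-generation setting better than a real-valued Gaussian sample per vocabulary entry.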

I may need more time to look into this issue further, and I will post any updates here. Thank you for raising the issue.

voidful avatar Aug 01 '22 02:08 voidful

I will try it later. Thank you for your suggestion. :)

rongaoli avatar Aug 01 '22 06:08 rongaoli

pfrl.policies.SoftmaxCategoricalHead did help me. Faster & Better.

rongaoli avatar Aug 01 '22 06:08 rongaoli

Hey voidful! I am still studying your project, TextRL. I am wondering whether it is possible to add batch decoding when the actor interacts with the environment.

Closed #4 as completed.

rongaoli avatar Jan 04 '23 06:01 rongaoli

It seems possible, since PFRL does support batch training. I need to modify the environment a bit to support this feature. I hope I can get back to you soon.
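As a rough picture of what batched interaction could look like, here is a toy sketch in plain Python. ToyTextEnv, ToyBatchAgent and run_batch are hypothetical stand-ins, not TextRL or PFRL classes; only the method name batch_act mirrors the one on PFRL's batch-capable agents.

```python
import random


class ToyTextEnv:
    """Hypothetical text environment: an episode ends after max_len tokens."""

    def __init__(self, max_len=3):
        self.max_len = max_len
        self.tokens = []

    def reset(self):
        self.tokens = []
        return tuple(self.tokens)

    def step(self, action):
        self.tokens.append(action)
        done = len(self.tokens) >= self.max_len
        reward = 1.0 if done else 0.0  # placeholder reward at episode end
        return tuple(self.tokens), reward, done


class ToyBatchAgent:
    """Hypothetical agent that returns one action per observation in a batch."""

    def __init__(self, vocab_size, seed=0):
        self.vocab_size = vocab_size
        self.rng = random.Random(seed)

    def batch_act(self, batch_obs):
        return [self.rng.randrange(self.vocab_size) for _ in batch_obs]


def run_batch(envs, agent, max_steps=10):
    """Step several environments in lockstep, one batched agent call per step."""
    obs = [env.reset() for env in envs]
    done = [False] * len(envs)
    returns = [0.0] * len(envs)
    for _ in range(max_steps):
        if all(done):
            break
        actions = agent.batch_act(obs)
        for i, env in enumerate(envs):
            if done[i]:
                continue  # finished episodes ignore their action
            obs[i], reward, done[i] = env.step(actions[i])
            returns[i] += reward
    return obs, returns


envs = [ToyTextEnv() for _ in range(4)]
final_obs, returns = run_batch(envs, ToyBatchAgent(vocab_size=5))
```

The point of the pattern is that the agent is called once per step for all sequences, so a real model could decode them in a single forward pass.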

voidful avatar Jan 09 '23 19:01 voidful