ChenDRAG comments

Results: 18 comments of ChenDRAG

Implement Decision Transformer for offline RL

I have a small suggestion. Maybe we shouldn't put the Decision Transformer network architecture into net/common.py, because this code is used by no algorithm other than DT. Transformers is still rarely...

support of sampling episode

Collecting by episode is supported, but sampling by episode is not directly supported (maybe the stack method in the buffer can fulfill your need?).
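To illustrate what sampling by episode could look like when only step-level storage is available, here is a minimal sketch in plain Python. It is not Tianshou's buffer API: the names `episode_ranges` and `sample_episodes` are hypothetical, and it assumes transitions are stored in order alongside a per-step `done` flag.

```python
import random


def episode_ranges(dones):
    """Return (start, end) index pairs (end exclusive), one per finished episode."""
    ranges, start = [], 0
    for i, done in enumerate(dones):
        if done:
            ranges.append((start, i + 1))
            start = i + 1
    # Steps after the last done flag belong to an unfinished episode and are skipped.
    return ranges


def sample_episodes(transitions, dones, k, rng=random):
    """Sample k distinct complete episodes, each returned as a list of transitions."""
    chosen = rng.sample(episode_ranges(dones), k)
    return [transitions[s:e] for s, e in chosen]


# Toy data: two finished episodes followed by an unfinished one.
transitions = ["t0", "t1", "t2", "t3", "t4", "t5", "t6"]
dones = [False, True, False, False, True, False, False]
episodes = sample_episodes(transitions, dones, k=2, rng=random.Random(0))
```

The sketch trades efficiency for clarity: it rescans the `done` flags on every call, which a real buffer would likely avoid by tracking episode boundaries as data is added.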

Cannot run examples/diambraArenaGist.py

Okay, no problem. Thanks. It seems that this is because this line of code https://github.com/diambra/diambraArena/blob/472913f94a466ca59e423363742f675713f74da2/diambraArena/diambraEnvLib/libInterface.py#L24 points to a file that doesn't exist. I tried to change libdiambraEnv to libdiambraEnv18, but...

Bootstrapping the value function?

This is done in Tianshou; check it out at https://github.com/thu-ml/tianshou if you are still interested. @XuchanBao

Pytorch

For PyTorch, why not try Tianshou? https://github.com/thu-ml/tianshou We now have 9 classic model-free algorithms on MuJoCo. The results are better than the ones reported here in baselines.

Understanding normalization of advantage function

https://arxiv.org/pdf/2006.05990.pdf concludes that "per-minibatch advantage normalization (C67) seems not to affect the performance too much (Fig. 35)"
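For readers unfamiliar with the term, per-minibatch advantage normalization usually means shifting and scaling the advantages within each minibatch to zero mean and unit standard deviation before computing the policy loss. The sketch below shows that operation in plain Python; the function names, the `eps` value, and the toy numbers are assumptions for illustration and are not taken from the paper or from any particular library.

```python
import math


def normalize_advantages(advantages, eps=1e-8):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    n = len(advantages)
    mean = sum(advantages) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in advantages) / n)
    # eps guards against division by zero when all advantages are equal.
    return [(a - mean) / (std + eps) for a in advantages]


def minibatches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy advantages, normalized separately within each minibatch of size 3.
advantages = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5]
normalized = [normalize_advantages(mb) for mb in minibatches(advantages, 3)]
```

Because the statistics are computed per minibatch, the same raw advantage can map to different normalized values depending on which other samples it is batched with.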

In PPOPolicy, the ratio is computed with requires_grad `True`.

@imerme I don't really get your point. How do you implement the action mask? Could you give more detail about your code and more insight into why computed ratio with...

weird results runing evaluate_parsing_JPPNet-s2.py

I use pictures of size 1080*1080. Does picture size affect your results? Or do you resize the picture automatically, so it doesn't matter?

Suggestion - implement some "tricks" that improve performance

@henrycharlesworth I have tried a number of suggestions proposed in the paper you mentioned (ablation studies suggest some of them are useful, some are not for now) and implemented "recompute advantage"...

ImportError: cannot import name 'get_full_repo_name' from 'huggingface_hub'

Have you resolved your issue? @Ryan-shadow