Jiayi Weng
> note that int(np.prod(state_shape)) = len*dim

I don't think so. `state_shape` should always be a single frame, i.e., `int(np.prod(state_shape)) = dim`. If that's not the case, you should modify it...
```python
In [15]: import torch; from torch import nn

In [16]: m = nn.LSTM(input_size=3, hidden_size=10, num_layers=1, batch_first=True)

In [17]: s = torch.zeros([64, 1, 3])

In [18]: ns, (h, c) = m(s)

In [19]: ns.shape, h.shape, c.shape
Out[19]: (torch.Size([64, 1, 10]), torch.Size([1, 64, 10]), torch.Size([1, 64, 10]))
```
Should be `dim`. Let's take the Atari example: the observation space is (4, 84, 84), where 4 is `len`. However, when defining the recurrent network, `state_shape` should be `84*84` instead of `4*84*84`.
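For concreteness, here is a minimal sketch of that convention, assuming the stock `Recurrent` net from `tianshou.utils.net.common`; the batch size, stack length, and action count below are only illustrative:

```python
# Sketch: the recurrent net receives the single-frame dim (84*84), not len*dim
# (4*84*84); the stacked frames show up as the sequence axis instead.
import numpy as np
import torch
from tianshou.utils.net.common import Recurrent

state_shape = (84, 84)   # one frame, so int(np.prod(state_shape)) == 84 * 84
action_shape = 6         # e.g. number of discrete Atari actions (illustrative)
net = Recurrent(layer_num=1, state_shape=state_shape, action_shape=action_shape)

# obs sampled with frame stacking arrives as [bsz, len, dim] = [64, 4, 84*84]
obs = torch.zeros(64, 4, int(np.prod(state_shape)))
logits, state = net(obs, state=None)
print(logits.shape)      # torch.Size([64, 6])
```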
But here comes the problem: there are two ways to perform this kind of obs-stacking:

1. gym.Env outputs a single frame -- the stacking happens in `buffer.sample()` (see the sketch below);
2. gym.Env outputs an already-stacked frame by itself...
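Option 1 is what the cheatsheet documents via `stack_num`. A minimal sketch, assuming a Tianshou version whose `ReplayBuffer.add()` takes a `Batch` with a `done` flag; the frame sizes and dummy data are illustrative:

```python
# Option 1 sketch: the env returns single (84, 84) frames; the buffer stacks
# them at sample time via stack_num.
import numpy as np
from tianshou.data import Batch, ReplayBuffer

buf = ReplayBuffer(size=1000, stack_num=4)  # each sampled obs becomes 4 stacked frames
for i in range(10):
    buf.add(Batch(
        obs=np.full((84, 84), i, dtype=np.float32),
        act=0, rew=0.0, done=False,
        obs_next=np.full((84, 84), i + 1, dtype=np.float32),
        info={},
    ))

batch, indices = buf.sample(batch_size=8)
print(batch.obs.shape)  # (8, 4, 84, 84): frames stacked along a new axis
```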
Glad to hear that!
Did you change only the network structure, and not other hyperparameters like lr (the reward curve is sensitive to those)? Honestly speaking, I haven't run RNN+SAC experiments :(
Maybe switch to something like:

```python
observation_space: DictSpace(...)

obs = {"player_a": np.array(...), "player_b": np.array(...)}  # for all players
# if only player_a, fill player_b with np.zeros_like(...)
```
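Spelled out with gym's `spaces.Dict` (the key names come from the snippet above; the shapes, bounds, and dtypes are only placeholders):

```python
# Illustrative dict observation; shapes/bounds/dtypes are placeholders.
import numpy as np
from gym import spaces

observation_space = spaces.Dict({
    "player_a": spaces.Box(low=-1.0, high=1.0, shape=(10,), dtype=np.float32),
    "player_b": spaces.Box(low=-1.0, high=1.0, shape=(10,), dtype=np.float32),
})

obs = {
    "player_a": np.random.uniform(-1, 1, size=(10,)).astype(np.float32),
    # if only player_a is present this step, fill player_b with zeros of the same shape
    "player_b": np.zeros((10,), dtype=np.float32),
}
assert observation_space.contains(obs)
```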
Yep... Basically, what the MAPM does is split the whole observation into several parts and send one to each policy. Same for the actions: concatenate them at the end, then...
https://tianshou.readthedocs.io/en/master/tutorials/cheatsheet.html#rnn-style-training https://github.com/thu-ml/tianshou/issues/486#issuecomment-1002665193
Please see https://tianshou.readthedocs.io/en/master/tutorials/cheatsheet.html#rnn-style-training