stable-baselines3 [Question] MlpLstmPolicy exists in Stable Baselines (SB2) but not in SB3. Why?

Question

I found the class MlpLstmPolicy in Stable Baselines (SB2), but not in SB3. Why? Can I copy the MlpLstmPolicy class into SB3 and use it there? Thank you for your help.

class MlpLstmPolicy(LstmPolicy):
    """
    Policy object that implements actor critic, using LSTMs with a MLP feature extraction

    :param sess: (TensorFlow session) The current TensorFlow session
    :param ob_space: (Gym Space) The observation space of the environment
    :param ac_space: (Gym Space) The action space of the environment
    :param n_env: (int) The number of environments to run
    :param n_steps: (int) The number of steps to run for each environment
    :param n_batch: (int) The number of batch to run (n_envs * n_steps)
    :param n_lstm: (int) The number of LSTM cells (for recurrent policies)
    :param reuse: (bool) If the policy is reusable or not
    :param kwargs: (dict) Extra keyword arguments for the nature CNN feature extraction
    """

    def __init__(self, sess, ob_space, ac_space, n_env, n_steps, n_batch, n_lstm=256, reuse=False, **_kwargs):
        super(MlpLstmPolicy, self).__init__(sess, ob_space, ac_space, n_env, n_steps, n_batch, n_lstm, reuse,
                                            layer_norm=False, feature_extraction="mlp", **_kwargs)

Or can the recurrent (LSTM) version of PPO be used instead, for example sb3_contrib.ppo_recurrent.MlpLstmPolicy?

Checklist

[ ] I have read the documentation (required)
[ ] I have checked that there is no similar issue in the repo (required)

Jul 31 '22 03:07 Zero1366166516

Hey. No, you cannot copy the SB2 code over, but yes, you can use the implementation in stable-baselines3-contrib.
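
A minimal sketch of what using the contrib implementation could look like, assuming `sb3-contrib` is installed and using `CartPole-v1` purely as a placeholder environment. The helper names `sb2_lstm_args_to_sb3` and `train_and_run` are hypothetical and only for illustration; treating `lstm_hidden_size` as the rough counterpart of SB2's `n_lstm` is an assumption, not an official mapping.

```python
def sb2_lstm_args_to_sb3(n_lstm=256):
    """Hypothetical helper: build RecurrentPPO policy_kwargs from SB2's n_lstm.

    Assumption: SB2's `n_lstm` (number of LSTM units) roughly corresponds to
    `lstm_hidden_size` in the sb3-contrib recurrent policy.
    """
    return {"lstm_hidden_size": n_lstm, "n_lstm_layers": 1}


def train_and_run(env_id="CartPole-v1", total_timesteps=5_000, n_eval_steps=200):
    """Train RecurrentPPO with MlpLstmPolicy, then step the trained policy."""
    # Imported lazily so the helper above works without the RL dependencies.
    import numpy as np
    from sb3_contrib import RecurrentPPO

    model = RecurrentPPO(
        "MlpLstmPolicy",
        env_id,
        policy_kwargs=sb2_lstm_args_to_sb3(256),
        verbose=1,
    )
    model.learn(total_timesteps=total_timesteps)

    vec_env = model.get_env()
    obs = vec_env.reset()
    lstm_states = None  # None lets the policy start from a zero LSTM state
    # Flag every environment as starting a new episode on the first step.
    episode_starts = np.ones((vec_env.num_envs,), dtype=bool)
    for _ in range(n_eval_steps):
        action, lstm_states = model.predict(
            obs,
            state=lstm_states,
            episode_start=episode_starts,
            deterministic=True,
        )
        obs, rewards, dones, infos = vec_env.step(action)
        # An env that just finished begins a new episode on the next step.
        episode_starts = dones
    return model
```

Calling `train_and_run()` runs a short training job followed by a rollout. The point to notice is that the LSTM state and the episode-start flags have to be passed back into `predict` at every step, which the SB2 policy handled differently.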

Jul 31 '22 11:07 Miffyli

For the reason why, you can read the migration guide: https://stable-baselines3.readthedocs.io/en/master/guide/migration.html

Also, each algorithm page explains which spaces and policies are supported. For instance, for PPO: https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html

Jul 31 '22 13:07 araffin

Copy that. In fact, I just want to find an example program for an LSTM feature extractor.

Aug 01 '22 02:08 Zero1366166516

> I just want to find an example program for an LSTM feature extractor.

You can take a look at the SB3-Contrib code for that: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/sb3_contrib/common/recurrent/policies.py#L22

Disclaimer: using an LSTM with RL is far from trivial and requires a good understanding of the algorithm and good programming skills, so you will need to study the code and the theory carefully.
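
As one illustration of why this is not trivial, here is a small PyTorch sketch of a step that recurrent policies need: zeroing the LSTM state whenever an episode starts in the middle of a rollout. It is written from scratch for illustration and is not the sb3-contrib code. The function name `process_sequence`, the tensor shapes, and the sizes in the usage part are assumptions.

```python
import torch
from torch import nn


def process_sequence(lstm, features, lstm_states, episode_starts):
    """Run an LSTM step by step, resetting its state at episode starts.

    features:       (n_steps, n_envs, feature_dim)
    lstm_states:    tuple (hidden, cell), each (num_layers, n_envs, hidden_size)
    episode_starts: (n_steps, n_envs), 1.0 where a new episode begins
    """
    hidden, cell = lstm_states
    outputs = []
    for step_features, step_starts in zip(features, episode_starts):
        # mask is 0 for envs starting a new episode, so their state is cleared.
        mask = (1.0 - step_starts).view(1, -1, 1)
        out, (hidden, cell) = lstm(
            step_features.unsqueeze(0), (mask * hidden, mask * cell)
        )
        outputs.append(out)
    return torch.cat(outputs, dim=0), (hidden, cell)


# Usage with made-up sizes: 5 steps, 2 envs, 4 features, 8 LSTM units.
torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=8, num_layers=1)
features = torch.randn(5, 2, 4)
episode_starts = torch.zeros(5, 2)
episode_starts[0] = 1.0      # both envs start an episode at step 0
episode_starts[3, 1] = 1.0   # env 1 starts a new episode at step 3
initial_states = (torch.zeros(1, 2, 8), torch.zeros(1, 2, 8))
outputs, final_states = process_sequence(
    lstm, features, initial_states, episode_starts
)
```

Without the mask, memory from a finished episode would leak into the next one, which is one of the subtle bugs the disclaimer warns about.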

Aug 01 '22 10:08 araffin