Xinyang (Young) Geng
Use the following hyperparameters for Antmaze:
```
python -m SimpleSAC.conservative_sac_main \
    --env 'antmaze-medium-diverse-v2' \
    --cql.cql_min_q_weight=5.0 \
    --cql.cql_max_target_backup=True \
    --cql.cql_target_action_gap=0.2 \
    --orthogonal_init=True \
    --cql.cql_lagrange=True \
    --cql.cql_temp=1.0 \
    --cql.policy_lr=1e-4 \
    --cql.qf_lr=3e-4 \
    ...
```
I also realized that this might be an issue. If we want to resample noise, we should either explicitly pass in a new rng every time or use self.make_rng...
Now I see that it passes in a new RNG key every time, so I believe I was wrong about the noise not being resampled and the implementation should be...
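To illustrate the point about resampling, here is a minimal sketch in plain JAX. It is not the code from the repository under discussion: `sample_noise` is a hypothetical helper, and the shape and seed are arbitrary. It only shows that noise in JAX is a deterministic function of the key, so reusing a key repeats the noise, while splitting off a new subkey for each call gives fresh noise. Inside a Flax module, `self.make_rng` plays a similar role of handing out a new key per call.

```python
import jax
import jax.numpy as jnp


def sample_noise(key, shape=(3,)):
    # Hypothetical helper: Gaussian noise is fully determined by the key.
    return jax.random.normal(key, shape)


key = jax.random.PRNGKey(0)  # arbitrary seed, chosen for illustration

# Reusing the same key reproduces exactly the same noise.
same_a = sample_noise(key)
same_b = sample_noise(key)

# Splitting the key before each call yields a fresh subkey,
# so each call draws new noise.
key, subkey1 = jax.random.split(key)
key, subkey2 = jax.random.split(key)
fresh_a = sample_noise(subkey1)
fresh_b = sample_noise(subkey2)

print("same key, same noise:", bool(jnp.array_equal(same_a, same_b)))
print("new keys, same noise:", bool(jnp.array_equal(fresh_a, fresh_b)))
```

Passing a new key on every call, as the comment above describes, corresponds to the second pattern.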
Yeah the doc is a bit outdated due to a lot of changes I made recently. I will try to update it more frequently.
I can contribute to the JAX implementation as well! Also, I'm not sure we can just use their PyTorch code, since it is released under GPLv3 instead of the...
Indeed this would be useful. Let me look into that.
The model architecture and size are the same as LLaMA 7B, so if the original LLaMA 7B fits, our model should fit as well. If not, we are also working...
We've just released a checkpoint for our 3B model, and that should definitely fit in Colab.
Thanks for the suggestions! At the moment, we want to stick to the original LLaMA configurations as much as possible and don't have the resources to retrain our model with...
Thanks for your interest in our project. Currently we don't have the time to support a community as we are only two students doing this project part time. For information...