Xinyang (Young) Geng comments

Results: 52 comments of Xinyang (Young) Geng

Antmaze results

Use the following hyperparameters for Antmaze:

```
python -m SimpleSAC.conservative_sac_main \
    --env 'antmaze-medium-diverse-v2' \
    --cql.cql_min_q_weight=5.0 \
    --cql.cql_max_target_backup=True \
    --cql.cql_target_action_gap=0.2 \
    --orthogonal_init=True \
    --cql.cql_lagrange=True \
    --cql.cql_temp=1.0 \
    --cql.policy_lr=1e-4 \
    --cql.qf_lr=3e-4 \
    ...
```

NoisyNets implementation issues

I also realized that this might be an issue. If we want to resample noise, we should either explicitly pass in a new rng every time or use self.make_rng...
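
As a rough illustration of the first option (explicitly passing in a new rng every time), here is a minimal pure-JAX sketch. The `noisy_linear` function and the `w_mu` / `w_sigma` parameter names are hypothetical and are not taken from the implementation under discussion; the sketch only shows that splitting the key before each call yields new noise, while reusing a key reproduces the old noise.

```python
import jax
import jax.numpy as jnp


def noisy_linear(params, x, noise_key):
    """Hypothetical noisy linear layer: weights are perturbed by Gaussian
    noise drawn from `noise_key` on every call."""
    w_eps = jax.random.normal(noise_key, params["w_mu"].shape)
    w = params["w_mu"] + params["w_sigma"] * w_eps
    return x @ w


key = jax.random.PRNGKey(0)
params = {
    "w_mu": jnp.ones((3, 2)),
    "w_sigma": 0.5 * jnp.ones((3, 2)),
}
x = jnp.ones((1, 3))

# Split the key before every forward pass so each call sees fresh noise.
key, k1 = jax.random.split(key)
out1 = noisy_linear(params, x, k1)
key, k2 = jax.random.split(key)
out2 = noisy_linear(params, x, k2)

# Reusing the same key reproduces the same noise, which is the failure
# mode to avoid when the noise is supposed to be resampled.
out1_again = noisy_linear(params, x, k1)
```

In a Flax module, `self.make_rng` with a named RNG stream plays the same role, provided the caller supplies a new key for that stream on each `apply` call.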

NoisyNets implementation issues

Now I see that it passes in a new RNG key every time, so I believe I was wrong about the noise not being resampled, and the implementation should be...

[Doc error]: Outdated doc for LLAMA

Yeah, the doc is a bit outdated due to the many changes I made recently. I will try to update it more frequently.

LLaMA

I can contribute to the JAX implementation as well! Also, I'm not sure if we can just use their PyTorch code, since it is released under GPLv3 instead of the...

LLaMA 2 support for pre-training

Indeed this would be useful. Let me look into that.

Thanks! Does it fit in Free Colab?

The model architecture and size are the same as LLaMA 7B, so if the original LLaMA 7B fits, our model should fit as well. If not, we are also working...

Thanks! Does it fit in Free Colab?

We've just released a checkpoint for our 3B model, and that should definitely fit in Colab.
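
For a back-of-the-envelope check of whether a model fits, the sketch below estimates the memory needed for the weights alone. It assumes nominal round parameter counts (7e9 and 3e9) and 2 bytes per parameter for 16-bit weights; activations, the KV cache, and framework overhead are not counted, and the memory actually available in a given Colab session is not something this comment thread states, so compare the numbers against whatever your runtime reports.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights only, in GiB.

    bytes_per_param: 2 for 16-bit weights, 4 for 32-bit weights.
    """
    return num_params * bytes_per_param / 2**30


# Nominal parameter counts (assumed round numbers, not exact model sizes).
for name, n in [("7B", 7e9), ("3B", 3e9)]:
    print(
        f"{name}: ~{weight_memory_gib(n):.1f} GiB in 16-bit, "
        f"~{weight_memory_gib(n, 4):.1f} GiB in 32-bit"
    )
```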

Add larger-than-character-level subword vocab for non-latin languages?

Thanks for the suggestions! At the moment, we want to stick to the original LLaMA configurations as much as possible and don't have the resources to retrain our model with...

How to track Training / Launch timeline and other information

Thanks for your interest in our project. Currently, we don't have the time to support a community, as we are only two students doing this project part-time. For information...