Seungyoun, Shin issues

Results: 12 issues of Seungyoun, Shin

CartPole PPO training: reward drop

If you train PPO long enough (3000 episodes or more), the reward drops sharply (for example, from 500 to 30).
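A collapse like this is easier to report and reproduce when it is located in the reward log. The following is a minimal sketch, not part of the original issue: the helper name `find_reward_collapse`, the 100-episode window, and the 50% threshold are all assumptions chosen for illustration. It returns the first episode at which the moving-average reward falls below a fraction of the best moving average seen so far.

```python
from collections import deque


def find_reward_collapse(rewards, window=100, drop_ratio=0.5):
    """Return the first episode index where the moving-average reward
    falls below drop_ratio * (best moving average so far), or None.

    Hypothetical helper: window and drop_ratio are illustrative defaults.
    """
    recent = deque(maxlen=window)
    running_sum = 0.0
    best_avg = float("-inf")
    for episode, reward in enumerate(rewards):
        if len(recent) == window:
            # The oldest reward is about to be evicted by append().
            running_sum -= recent[0]
        recent.append(reward)
        running_sum += reward
        if len(recent) < window:
            continue  # wait until the window is full
        avg = running_sum / window
        best_avg = max(best_avg, avg)
        if best_avg > 0 and avg < drop_ratio * best_avg:
            return episode
    return None


# Synthetic history mimicking the report: stable at 500, then stuck at 30.
history = [500.0] * 200 + [30.0] * 200
collapse_at = find_reward_collapse(history)
```

Saving a checkpoint whenever the moving average reaches a new best would make it possible to roll back to the policy from before the drop.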

DALLE generating ugly image

This is the result from `vae = DiscreteVAE( image_size = 128, num_layers = 2, # number of downsamples - ex. 256 / (2 ** 3) = (32 x 32 feature map)...
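The arithmetic in the quoted comment can be checked directly. This is a small sketch that assumes, as the comment implies, that each downsampling layer halves the spatial resolution; `feature_map_size` is a hypothetical helper, not a function of the library.

```python
def feature_map_size(image_size, num_layers):
    """Side length of the feature map after num_layers halvings.

    Assumes each downsampling layer halves height and width.
    """
    side, remainder = divmod(image_size, 2 ** num_layers)
    if remainder:
        raise ValueError("image_size must be divisible by 2 ** num_layers")
    return side


# The example from the comment: 256 / (2 ** 3) = 32
comment_example = feature_map_size(256, 3)

# The configuration in the snippet: image_size = 128, num_layers = 2
snippet_config = feature_map_size(128, 2)
```

Under that assumption, both the example in the comment and the configuration shown (`image_size = 128`, `num_layers = 2`) yield a 32 x 32 feature map.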

image data distribution

Thanks for sharing your code. I am trying to run prediction on a low-resolution image with a pretrained ERFNet. Sadly, it doesn't work. I think my preprocessing of the image data is different...

Sharing training log of 7B model on A6000 x 4

[Mar20_05-17-08_0c56f6779a08.csv](https://github.com/tatsu-lab/stanford_alpaca/files/11024692/Mar20_05-17-08_0c56f6779a08.csv)

Training command

```
torchrun --nproc_per_node=4 --master_port=34322 train.py \
    --model_name_or_path {your-hf-lamma-path} \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir {your-output-dir} \
    --num_train_epochs 3 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
...
```

Multiagent is not valid in ProcThor Scene

```
import prior
from ai2thor.controller import Controller

dataset = prior.load_dataset("procthor-10k")
house = dataset['train'][0]
c = Controller(scene=house, height=640, width=640, fieldOfView=90, agentCount=2)
#c.reset(agentCount=5) => not work
#e = c.step(action="Initialize",raise_for_failure=True,**c.initialization_parameters,) => not work...
```

Mug filled with coffee disappears when teleporting

Teleporting with next node on the scene with (rotation, horizon)
43    True
Name: isFilledWithLiquid, dtype: bool
-45.0 0
43    True
Name: isFilledWithLiquid, dtype: bool
135.0 0
43    True
Name: isFilledWithLiquid,...

Sharing answer from the question "How many iterations do I need?"

**[Training Code]**

```
model = Unet(
    dim = 64,
    dim_mults = (1, 2, 4, 8)
).cuda()

diffusion = GaussianDiffusion(
    model,
    image_size = 128,
    timesteps = 1000,   # number of steps...
```

Language Processing (LP) component

We were reviewing the `main.py` script that loads `traj_data` from `alfred_data_all`, a subset of the data, and generates evaluation results for multiple modules, excluding the `BERT` module. However, upon submitting...

Integration of Self-Play Fine-Tuning (SPIN) Method for Enhancing Large Language Models

### 🚀 The feature, motivation, and pitch

The recent paper, **"Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models,"** presents a novel method called **Self-Play fIne-tuNing** (`SPIN`). This method...

feature request