
Config for adaptive binding

Open lim142857 opened this issue 2 years ago • 5 comments

Hey, this is quite solid and promising work! I am wondering if you could release the template config file for adaptive binding, maybe for 9 rooms? Besides, I am trying to reproduce the bottleneck discovery experiment (like Figure 6 in the paper); do we have a template script for it? Thanks!

lim142857 avatar Aug 19 '22 16:08 lim142857

Hey, the configs are actually released! You can change the base config for this one: https://github.com/orybkin/video-gcp/blob/4608a543fe60c550363de864be7a38c4f663836a/experiments/prediction/base_configs/gcp_adaptive.py

I unfortunately don't have the specific config used in the bottleneck experiment. I think just adding gcp_adaptive should work, but you might need to tune beta or other hyperparameters.
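
For example, a minimal conf.py along these lines should be a reasonable starting point (a sketch, not a tested config: the dataset name below is a placeholder, and the exact key for beta should be looked up in the model code before tuning):

import os

from blox import AttrDict

current_dir = os.path.dirname(os.path.realpath(__file__))
from experiments.prediction.base_configs import gcp_adaptive as base_conf

configuration = AttrDict(base_conf.configuration)
configuration.update({
    'dataset_name': 'nav_9rooms',  # placeholder: put your 9-room dataset name here
})

model_config = AttrDict(base_conf.model_config)
# tune beta (the KL weight) here if needed; check the model code for the key name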

orybkin avatar Aug 19 '22 17:08 orybkin

Thank you for the instructions! 👍 I am trying to run the adaptive-GCP model on a new dataset called Color Changing MovingMNIST. Like MovingMNIST, in each video sequence two digits frequently intersect and bounce off the edges of the frame, but when the digits hit the edges they randomly change to a new colour. We are thinking of using adaptive GCP to find bottleneck frames (ideally the frames where the collisions happen). But I guess the off-the-shelf repository is missing some supporting code. I made some modifications but didn't get a reasonable result. Would you mind taking a look at the modifications I made, and is there anything else I need to set up (especially the model config)? I sincerely appreciate your help!

  1. For the dataset.

I follow the h36m data format from the link provided and move the data under /video-gcp/gcp/data/moving_colored_mnist/hdf5/. Each video is a 30 x 64 x 64 x 3 tensor on a scale of 0 to 255 and is stored in a traj.h5 file. The dataset has 6000 trajectories in total.
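
For concreteness, here is a minimal sketch of how one trajectory is written in this format with h5py (the 'traj_per_file' / 'traj0/images' / 'traj0/pad_mask' keys follow the h36m-style layout; double-check them against the repo's data loader):

import h5py
import numpy as np

# one 30-frame 64x64 RGB trajectory, uint8 in [0, 255]
video = np.random.randint(0, 256, size=(30, 64, 64, 3), dtype=np.uint8)

with h5py.File('traj.h5', 'w') as f:
    f['traj_per_file'] = 1                                # one trajectory per file
    f.create_dataset('traj0/images', data=video)          # the video frames
    f.create_dataset('traj0/pad_mask', data=np.ones(30))  # all 30 frames are valid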

I create /video-gcp/gcp/data/moving_colored_mnist/dataset_spec.py:

dataset_spec = {
    'max_seq_len': 30,  # maximum sequence in dataset is 30 frames
    'n_actions': 0,  # no actions in dataset
    'state_dim': 0,  # no states in dataset
}

I create /video-gcp/gcp/datasets/configs/moving_colored_mnist.py:

from blox import AttrDict
from gcp.datasets.data_loader import FolderSplitVarLenVideoDataset

class moving_colored_mnist(FolderSplitVarLenVideoDataset):
    config = AttrDict(
        dataset_spec=AttrDict(
            max_seq_len=30,
        ),
    )

# the class name is not bound while its own body executes, so attach the
# self-reference after the class definition to avoid a NameError
moving_colored_mnist.config.dataset_spec.dataset_class = moving_colored_mnist
  2. For the experiment configuration: I add the config file /video-gcp/experiments/prediction/moving_colored_mnist/gcp_adaptive/conf.py:
import os

from blox import AttrDict

current_dir = os.path.dirname(os.path.realpath(__file__))
from experiments.prediction.base_configs import gcp_adaptive as base_conf

configuration = AttrDict(base_conf.configuration)
configuration.update({
    'dataset_name': 'moving_colored_mnist',
    'num_epochs': 600,
    'lr': 2e-4,
    'epoch_cycles_train': 30,
})

model_config = AttrDict(base_conf.model_config)
  3. The training script I used:
export GCP_DATA_DIR=./data
export GCP_EXP_DIR=./experiment_logs

python prediction/train.py \
--path=../experiments/prediction/moving_colored_mnist/gcp_adaptive/ \
--skip_first_val=1 \
--imepoch=1
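
(Since GCP_DATA_DIR and GCP_EXP_DIR are set to relative paths here, they resolve against whatever directory the script is launched from; absolute paths are safer if you run the script from elsewhere.)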
  4. Other modifications: I changed https://github.com/orybkin/video-gcp/blob/4608a543fe60c550363de864be7a38c4f663836a/gcp/prediction/models/adaptive_binding/adaptive.py#L123 to losses['distance_predictor'] = BCELogitsLoss()(outputs.distance_predictor.distances, targets.float()), since otherwise a KeyError is raised.
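
In context, the modification could look like this (a sketch; the BCELogitsLoss import path is an assumption, check where the repo imports its other losses from):

from blox.torch.losses import BCELogitsLoss  # import path is an assumption

# modified adaptive.py#L123: BCE-with-logits loss on the predicted frame distances
losses['distance_predictor'] = BCELogitsLoss()(
    outputs.distance_predictor.distances, targets.float())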

I trained the model on this colored MovingMNIST dataset (6000 trajectories) for 600 epochs. The following are the results I got from TensorBoard:

The first row is the ground truth and the second row is the GCP reconstruction (target_gif_train): it looks like GCP is trying to reconstruct the mean color and mean digit shape. We're wondering if you have any ideas :) Thank you!

lim142857 avatar Sep 07 '22 19:09 lim142857

Your results indeed don't seem very good. You might want to first fit the data with a non-adaptive model to figure out which hyperparameters to use. It might be that you are using a very small network, or that you need to tune the beta hyperparameter. It also seems like you are not using the larger networks from the paper results; you should try that. You can base your config on e.g. https://github.com/orybkin/video-gcp/blob/4608a543fe60c550363de864be7a38c4f663836a/experiments/prediction/25room/gcp_tree/conf.py, but add your dataset and the adaptive config.
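
A rough sketch of that combination (the gcp_tree base module name and the merge order are assumptions; check how 25room/gcp_tree/conf.py actually builds its config):

from blox import AttrDict
from experiments.prediction.base_configs import gcp_adaptive as adaptive_conf
from experiments.prediction.base_configs import gcp_tree as tree_conf  # module name is a guess

# start from the large 25-room tree settings, then overlay the adaptive binding keys
configuration = AttrDict(tree_conf.configuration)
configuration.update({'dataset_name': 'moving_colored_mnist'})

model_config = AttrDict(tree_conf.model_config)
model_config.update(AttrDict(adaptive_conf.model_config))  # adaptive settings take precedence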

orybkin avatar Sep 07 '22 19:09 orybkin

Many thanks. I am going to try a larger network. I am wondering why "add_weighted_pixel_copy" is removed in the 25room base config; the PixelCopyDecoder looks like a residual connection? Btw, this is the config I am going to use; do you think this setting makes sense? :) I removed the cost-function hyperparameters:

configuration = AttrDict(base_conf.configuration)
configuration.update({
    'dataset_name': 'moving_colored_mnist',
    'batch_size': 16, # 64 is the default value
    'num_epochs': 300,
    'lr': 2e-4,
    'epoch_cycles_train': 30,
    # 'metric_pruning_scheme': 'dtw', # dtw is the default value.
})

model_config = AttrDict(base_conf.model_config)
model_config.update({
    'untied_layers': True,
    'hierarchy_levels': 8,
    'ngf': 16,
    'nz_mid_lstm': 512,
    'n_lstm_layers': 3,
    'nz_mid': 128,
    'nz_enc': 128,
    'nz_vae': 256,
    'regress_length': True,
    'run_cost_mdl': False,
    #'decoder_distribution': 'gaussian', # gaussian is the default value.
})
model_config.pop("add_weighted_pixel_copy")
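
One small note on the last line: dict.pop with a single argument raises a KeyError if the key is absent, so model_config.pop("add_weighted_pixel_copy", None) is a safer variant in case the base config doesn't define that key.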

lim142857 avatar Sep 07 '22 22:09 lim142857

You could try a residual connection but I'm guessing it won't help for mnist.
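
For illustration, a generic residual-connection sketch in PyTorch (this is not the repo's PixelCopyDecoder, just the general idea of predicting a delta on top of a context frame):

import torch.nn as nn

class ResidualDecoder(nn.Module):
    """Wraps a decoder so it predicts a residual on top of a context frame."""

    def __init__(self, decoder: nn.Module):
        super().__init__()
        self.decoder = decoder

    def forward(self, z, context_frame):
        # the wrapped decoder only has to model the change relative to the context
        return context_frame + self.decoder(z)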

orybkin avatar Sep 07 '22 22:09 orybkin