Kinetics Self-supervised Checkpoint

Open fmthoker opened this issue 1 year ago • 3 comments

Dear Authors, we are conducting a study to evaluate video self-supervised models holistically and would like to include your EVEREST model too. Could you please share the Kinetics-400 pretrained ViT-B checkpoint for our evaluation?

fmthoker, Jun 18 '24 12:06

Thanks for your interest in our work. You can download pre-trained and fine-tuned checkpoints from the link below. We will update the code and release the other checkpoints soon! :)

https://drive.google.com/drive/folders/1Lltf4m4YjUZwEVYfAVhRRTGoqJHd1Lpp?usp=drive_link

sunilhoho, Jun 19 '24 15:06

@sunilhoho Thanks for sharing the models on such short notice. Do you have any Kinetics-400 ViT-B checkpoints that were trained for more than 200 epochs? We want to use the best possible EVEREST model.

Also, I started a Kinetics-400 pretraining run using the code you have already shared; I want to train your model for 800 epochs.

Here are the script and hyperparameters I am using. Can you confirm that you used the same settings? I am using 2 nodes with 4 GPUs each to match your 8-GPU setup.

JOB_NAME=$1
GPUS=${GPUS:-8}
GPUS_PER_NODE=${GPUS_PER_NODE:-4}
CPUS_PER_TASK=${CPUS_PER_TASK:-8}
SRUN_ARGS=${SRUN_ARGS:-""}
PY_ARGS=${@:2}

srun -p --job-name=${JOB_NAME} \
    --gres=gpu:${GPUS_PER_NODE} \
    --ntasks=${GPUS} \
    --ntasks-per-node=${GPUS_PER_NODE} \
    --cpus-per-task=${CPUS_PER_TASK} \
    --kill-on-bad-exit=1 \
    ${SRUN_ARGS} \
    python -u run_ms_pretraining.py \
    --data_path ${DATA_PATH} \
    --mask_type motion-centric \
    --motion_centric_masking_ratio 0.7 \
    --mask_ratio 0.9 \
    --model pretrain_videoms_base_patch16_224 \
    --decoder_depth 4 \
    --lr 3e-4 \
    --batch_size 128 \
    --num_frames 16 \
    --sampling_rate 4 \
    --opt adamw \
    --opt_betas 0.9 0.95 \
    --warmup_epochs 40 \
    --epochs 801 \
    --save_ckpt_freq 2 \
    --log_dir ${OUTPUT_DIR} \
    --output_dir ${OUTPUT_DIR}

fmthoker, Jun 19 '24 20:06

We do not have a checkpoint for the Kinetics-400 ViT-B model trained for more than 200 epochs.

I reviewed your script and hyperparameters, and they look correct. Your settings seem aligned with our original configuration except for training epochs and the number of GPUs per node. Feel free to reach out if you encounter any issues or have further questions.
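As a rough sanity check on the GPU layout, the sketch below compares one node with 8 GPUs against two nodes with 4 GPUs each. It assumes that `--batch_size` is a per-GPU value and that one process runs per GPU; neither is confirmed in this thread. The helper names `world_size` and `global_batch` are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch only: compares process count and global batch size for two Slurm layouts.
# Assumption (not confirmed in this thread): --batch_size is per GPU process.

# Total number of processes: NODES * GPUS_PER_NODE
world_size() {
    echo $(( $1 * $2 ))
}

# Global batch size: PER_GPU_BATCH * world size
global_batch() {
    echo $(( $1 * $(world_size "$2" "$3") ))
}

echo "1 node  x 8 GPUs: world_size=$(world_size 1 8) global_batch=$(global_batch 128 1 8)"
echo "2 nodes x 4 GPUs: world_size=$(world_size 2 4) global_batch=$(global_batch 128 2 4)"
```

Under that assumption both layouts run 8 processes, so the global batch size is the same and the difference in GPUs per node should not by itself change the optimization settings.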

sunilhoho, Jun 21 '24 09:06