i3d-tensorflow
Request for fine-tuning details
Hi, yoosan! I have recently been doing research on I3D and am trying to reproduce the results of this architecture. However, I am stuck on fine-tuning the pre-trained model on UCF-101: my result only reaches 90.8% when the input is RGB only. Could you share the details of how you fine-tuned this model? Thanks in advance!
Hi Alex, for some reason I can't release the training code right now. But I am providing the training and testing scripts here to help you understand how to reproduce the results.
# Script for training videos in TensorFlow
now=$(date +"%Y%m%d_%H%M%S")
python=/mnt/lustre/DATAshare/zhouyao/anaconda3/bin/python
export LD_LIBRARY_PATH=/mnt/lustre/share/cuda-8.0-cudnn6/lib64/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/mnt/lustre/share/mvapich2-2.2b-cuda8.0/lib/:$LD_LIBRARY_PATH
if [ ! -d "../log" ]; then
mkdir "../log"
fi
num_gpus=4
machines=GTX1080
# Dataset setting
subset=train
data_dir=/mnt/lustre/DATAshare/ucf101_tfrecord/split1/train
data_set=UCF101Data_1
# Input setting
input_type=RGB
crop_fn=TSNCrop
num_segments=2
num_length=16
max_distort=1
more_crop=True
frame_size=224
data_format="NHWC"
# Model setting
# SenseTime_I3D model
net=i3d_v1
model=C3DModel
# Training setting
batch_size=8
optimizer=momentum
learning_rate=0.001
learning_rate_decay_factor=0.1
max_steps=25000
num_steps_per_decay=10000
weight_decay=0.0005
label_loss=CrossEntropyLoss #CrossEntropyLoss, FocalCELoss
dropout_keep=0.5
# Special training
pretrain_dir=/mnt/lustre/DATAshare/model-zoo/zhouyao/i3dkinetics50000
checkpoint_exclude_scopes=v/SenseTime_I3D/Logits
special_layer=Logits
layerwise_lr_ratio=10
# Logging
tf_summary_image=True
jobname=I3D-UCF
train_dir=../log/${data_set}-${input_type}-$model-${net}-train_dir
# NOTE: raw_video_short_side is never set in this script as posted, so it expands to an empty string unless it is defined in the environment.
srun -p ${machines} --job-name=${jobname} --gres=gpu:${num_gpus} \
${python} -u ../train_val.py --local_parameter_device=cpu --num_gpus=${num_gpus} \
--data_set=${data_set} --subset=$subset --data_dir=${data_dir} --num_steps_per_decay=$num_steps_per_decay \
--num_segments=${num_segments} --data_format=${data_format} --optimizer=${optimizer} \
--pretrain_dir=${pretrain_dir} --checkpoint_exclude_scopes=${checkpoint_exclude_scopes} \
--model=$model --variable_update=parameter_server --special_layer=${special_layer} \
--base_model_name=${net} --weight_decay=${weight_decay} --layerwise_lr_ratio=${layerwise_lr_ratio} \
--batch_size=${batch_size} --input_type=${input_type} --label_loss=$label_loss \
--learning_rate=${learning_rate} --learning_rate_decay_factor=${learning_rate_decay_factor} \
--num_length=${num_length} --max_distort=${max_distort} --more_crop=${more_crop} --max_steps=${max_steps} \
--frame_size=${frame_size} --raw_video_short_side=${raw_video_short_side} --train_dir=${train_dir} \
--num_preprocess_threads=16 --num_readers=10 --dropout_keep=${dropout_keep} \
2>&1 | tee ../log/tf-train-$data_set-${input_type}-$model-${net}.log &
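For readers trying to match the training settings, here is a minimal sketch of the learning-rate schedule those values would produce. It assumes a staircase step decay (the rate is multiplied by learning_rate_decay_factor every num_steps_per_decay steps) and that layerwise_lr_ratio multiplies the rate of the layer named by special_layer. Both are assumptions, since train_val.py is not released, and lr_at_step is a hypothetical helper, not part of the original code.

```shell
#!/bin/sh
# Sketch only: assumes lr = base * factor^floor(step / steps_per_decay).
# The real schedule lives in the unreleased train_val.py and may differ.
learning_rate=0.001
learning_rate_decay_factor=0.1
num_steps_per_decay=10000
layerwise_lr_ratio=10   # assumed to scale the rate of the Logits layer

# lr_at_step STEP: print the base learning rate in effect at STEP.
lr_at_step() {
  awk -v lr="$learning_rate" -v f="$learning_rate_decay_factor" \
      -v n="$num_steps_per_decay" -v s="$1" \
      'BEGIN { printf "%g\n", lr * f ^ int(s / n) }'
}

# Print the base rate and the assumed Logits rate at each decay boundary.
for step in 0 10000 20000; do
  base=$(lr_at_step "$step")
  logits=$(awk -v b="$base" -v r="$layerwise_lr_ratio" 'BEGIN { printf "%g\n", b * r }')
  echo "step ${step}: base lr ${base}, Logits lr ${logits}"
done
```

Under these assumptions, max_steps=25000 means the base rate is decayed twice during the run, at steps 10000 and 20000, and the freshly initialized Logits layer always trains ten times faster than the pre-trained layers.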
The testing settings are also provided here:
# Script for evaluating videos in TensorFlow
now=$(date +"%Y%m%d_%H%M%S")
python=/mnt/lustre/DATAshare/zhouyao/anaconda3/bin/python
export LD_LIBRARY_PATH=/mnt/lustre/share/cuda-8.0-cudnn6/lib64/:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/mnt/lustre/share/mvapich2-2.2b-cuda8.0/lib/:$LD_LIBRARY_PATH
if [ ! -d "../log" ]; then
mkdir "../log"
fi
num_gpus=4
machines=GTX1080
# Dataset setting
subset=validation
data_dir=/mnt/lustre/DATAshare/ucf101_tfrecord/split1/validation
data_set=UCF101Data_1
# Input setting
input_type=RGB
crop_fn=TSNCrop
num_length=32
max_distort=1
more_crop=True
frame_size=224
num_segments=3
data_format="NHWC"
# Model setting
net=i3d_v1
model=C3DModel
# Eval setting
batch_size=2
max_steps=25000
# Logging
tf_summary_image=True
jobname=eval-${data_set}
train_dir=../log/${data_set}-${input_type}-$model-${net}-train_dir
eval_dir=../log/${data_set}-${input_type}-$model-${net}-eval_dir
# Slurm running
# NOTE: pretrain_dir and checkpoint_exclude_scopes are never set in this script as posted, so they expand to empty strings unless they are defined in the environment.
srun -p ${machines} --job-name=${jobname} --gres=gpu:${num_gpus} \
${python} -u ../train_val.py --local_parameter_device=cpu --num_gpus=${num_gpus} \
--data_set=${data_set} --subset=$subset --data_dir=${data_dir} --top_k=5 \
--num_segments=${num_segments} --data_format=${data_format} \
--pretrain_dir=${pretrain_dir} --checkpoint_exclude_scopes=${checkpoint_exclude_scopes} \
--model=$model --variable_update=parameter_server --eval=True \
--base_model_name=${net} \
--batch_size=${batch_size} --input_type=${input_type} --max_steps=${max_steps} \
--num_length=${num_length} --max_distort=${max_distort} --more_crop=${more_crop} \
--frame_size=${frame_size} --train_dir=${train_dir} --eval_dir=${eval_dir} \
2>&1 | tee ../log/tf-eval-$data_set-${input_type}-$model-${net}.log &
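Because these launch commands pass many shell variables as flags, an unset variable silently becomes an empty flag value. The sketch below shows one way to catch that before calling srun. check_vars is a hypothetical helper, not part of the original scripts, and the variable names are simply taken from the evaluation command.

```shell
#!/bin/sh
# Sketch only: list the variables a launch command expects but that are unset or empty.

# check_vars NAME...: print each unset-or-empty name to stderr, return 1 if any were found.
check_vars() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    # eval is acceptable here only because the names are fixed literals.
    eval "value=\${$name-}"
    if [ -z "$value" ]; then
      echo "unset or empty: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example values, as set by the evaluation script.
data_set=UCF101Data_1
input_type=RGB
model=C3DModel
net=i3d_v1

if ! check_vars data_set input_type model net pretrain_dir checkpoint_exclude_scopes; then
  echo "define the variables listed above before calling srun" >&2
fi
```

Adding `set -u` at the top of a script is a stricter alternative: the shell then aborts on the first reference to an unset variable instead of reporting all of them.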
Thanks for your reply! I will try your method and give you feedback~