GraVi-T

Pretrained weights

Open BStephen99 opened this issue 1 year ago • 3 comments

Hi there,

Thank you for this excellent work!!!

Would it be possible for you to share the pre-trained weights for the AVA active speaker model?

It would be much appreciated. :-)

BStephen99 · May 16 '24 14:05

Due to Intel's policy, we do not have an immediate plan to share the pre-trained weights. We ask you to train the model on your end since the whole training process takes less than a few hours.

Thank you, Kyle

kylemin · May 19 '24 18:05

Thanks for the response. Seems a bit silly, since they can be recreated. In any case, thanks again for making this repository available. 😁

BStephen99 · May 21 '24 07:05

Sorry to bother you again, but I'm having trouble recreating the training features in RESNET18-TSM-AUG. Using the active-speakers-context repository, I swapped the existing model for your models_stage1_tsm.py model, loaded the pretrained weights, and ran STE_forward.py with the number of frames set to 11. However, the resulting features are not the same as yours. Did you use all default parameters? The only other change I made was to reshape video_data so that it would be compatible with your model: the input video data has shape (1, 11, 3, 144, 144) and the audio has shape (1, 1, 13, 40). Does that seem correct?
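For anyone checking the same thing, here is a minimal NumPy sketch of the two checks described above: getting a clip into the (1, 11, 3, 144, 144) layout and comparing extracted features against a reference. The source layout (batch, channels, frames, height, width) is only an assumption, since the layout produced by STE_forward.py is not stated in this thread, and the helper names `to_frames_first` and `compare_features` are hypothetical. One thing it illustrates is that a plain reshape and an axis permutation give the same shape but different element order, which could be one reason features differ.

```python
import numpy as np


def to_frames_first(clip: np.ndarray) -> np.ndarray:
    """Permute (N, C, T, H, W) -> (N, T, C, H, W).

    ASSUMPTION: the source layout is (batch, channels, frames, height, width).
    Adjust the permutation if the real pipeline uses a different order.
    """
    if clip.ndim != 5:
        raise ValueError(f"expected a 5-D array, got {clip.ndim} dimensions")
    return np.transpose(clip, (0, 2, 1, 3, 4))


def compare_features(mine, reference, atol=1e-4):
    """Return (close_enough, max_abs_diff, cosine_similarity) for two feature arrays."""
    a = np.asarray(mine, dtype=np.float64).ravel()
    b = np.asarray(reference, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError(f"size mismatch: {a.shape} vs {b.shape}")
    max_diff = float(np.max(np.abs(a - b)))
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    cosine = float(a @ b / denom) if denom > 0 else 0.0
    return bool(np.allclose(a, b, atol=atol)), max_diff, cosine


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in data with the shapes mentioned in the comment above.
    clip = rng.random((1, 3, 11, 144, 144), dtype=np.float32)
    audio = rng.random((1, 1, 13, 40), dtype=np.float32)

    video = to_frames_first(clip)
    assert video.shape == (1, 11, 3, 144, 144)
    assert audio.shape == (1, 1, 13, 40)

    # Same target shape, but reshape keeps memory order while transpose moves axes.
    reshaped = clip.reshape(1, 11, 3, 144, 144)
    print("reshape matches transpose:", np.array_equal(video, reshaped))
    print("self-comparison:", compare_features(video, video))
```

If the shapes already match but `compare_features` reports a low cosine similarity, the difference more likely comes from preprocessing or weights than from numerical noise.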

BStephen99 avatar May 22 '24 10:05 BStephen99