Shakeel Ahmad
Hello, I am using your script to extract audio-visual features. After extracting log filterbank features using `python_speech_features` with shape (96,26) and frames of shape (96,88,88), it throws the following error...
It works, I just now saw the same issue. You just need to add a dummy argument, as in `python visual_features.py dummy`, run from the `av-hubert/avhubert` directory.
Hi, after running `python lrs3_prepare.py --lrs3 lrs3/ --ffmpeg /path/to/ffmpeg --rank 10 --nshard 20 --step 1` I am getting an error at `if word_intervals[-1][-1] < max_duration:`, namely `IndexError: list index out of range`. Can...
@chevalierNoir Thank you for your answer. I am using your script to extract audio-visual features. After extracting log filterbank features using `python_speech_features` with shape (96,26) and frames of shape (96,88,88)...
1. If I want to get audio-only features, what should the format of the audio input be? Should it be simple raw speech of shape (L,)? 2. wav_file shape...
I have figured it out by extracting features separately for audio and video. Visual frames: `torch.Size([1, 1, L, X, Y])`; audio_feats should be `(batch, 104, L)` when using `stack_order_audio=4`...
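For anyone wondering where the 104 comes from, here is a minimal sketch of the shape arithmetic, assuming 26 filterbank coefficients per audio frame and that `stack_order_audio=4` means concatenating every 4 consecutive frames along the feature axis (26 × 4 = 104). The helper name `stack_audio_frames` and the zero padding are my own illustration, not necessarily what the repository does, and the input here is random placeholder data.

```python
import numpy as np


def stack_audio_frames(feats: np.ndarray, stack_order: int) -> np.ndarray:
    """Concatenate every `stack_order` consecutive frames along the feature axis.

    feats: array of shape (T, F). Returns an array of shape (ceil(T / stack_order), F * stack_order).
    Hypothetical helper for illustration only.
    """
    feat_dim = feats.shape[1]
    remainder = len(feats) % stack_order
    if remainder:
        # Assumption: pad with zeros so the frame count is a multiple of stack_order.
        pad = np.zeros((stack_order - remainder, feat_dim), dtype=feats.dtype)
        feats = np.concatenate([feats, pad], axis=0)
    return feats.reshape(-1, stack_order * feat_dim)


# Placeholder log filterbank features: 96 audio frames, 26 coefficients each.
fbank = np.random.randn(96, 26).astype(np.float32)

stacked = stack_audio_frames(fbank, stack_order=4)  # shape (24, 104)
audio_feats = stacked.T[None]                       # shape (1, 104, 24), i.e. (batch, 104, L)
print(stacked.shape, audio_feats.shape)
```

Under this assumption, 96 audio frames collapse to L = 24 after stacking, so roughly 4 × L audio frames would be needed to line up with L video frames.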
Thank you so much @david-gimeno, it works. The dataset I am currently working on also has a 25 fps frame rate. Regarding the other question: how to specify a particular...
Thank you so much, I will open a new issue.
@david-gimeno Just a quick question: what do you mean by `Audio is typically extracted at 100fps`?
In case your script already parses arguments, passing `dummy` will throw the error "error: unrecognized arguments: dummy". Just adding `parser.add_argument('--dummy', type=str, default=None, help='A dummy argument for future use.')` will...
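A small standalone sketch of the `argparse` behaviour involved, not taken from the repository: the `--input` option and the `prog` name are hypothetical stand-ins for whatever arguments a script already defines. It only shows that a parser with a `--dummy` option accepts `--dummy <value>`, while a bare positional `dummy` is still reported as unrecognized unless `parse_known_args` is used.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example")
    # Hypothetical pre-existing option standing in for a script's own arguments.
    parser.add_argument("--input", type=str, default=None)
    # The option quoted in the comment above.
    parser.add_argument("--dummy", type=str, default=None,
                        help="A dummy argument for future use.")
    return parser


parser = build_parser()

# Passing the value through the named option parses cleanly.
args = parser.parse_args(["--dummy", "dummy"])
print(args.dummy)

# A bare positional "dummy" is not defined on this parser; parse_args would exit
# with "unrecognized arguments: dummy", whereas parse_known_args returns it as a leftover.
known, unknown = parser.parse_known_args(["dummy"])
print(known.dummy, unknown)
```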