kinetics-i3d Video preprocessing code

Is there a way you could share the code you use to preprocess the videos? I mean the application of the TV-L1 optical flow algorithm that produces output like the example gif you show us. I've been trying to replicate the preprocessing myself, but it doesn't look exactly like yours, so when I run inference it always gives me wrong results.

Thank you. Greetings.

Nov 19 '19 09:11 javithe7

This is a .ipynb file to convert images sampled from the video to optical flow images. Hope this helps: convert_to_flow.ipynb.zip

Nov 21 '19 12:11 jahab

Thank you very much @jahab, I've tested your code with a few changes and the result looks pretty good. However, the background of the video changes color, unlike the example gif, which keeps a static gray background. This is a gif with the result: Video. Maybe they applied some extra filter to the video to achieve that kind of background?

Nov 22 '19 10:11 javithe7

@javithe7 The color is, I believe, only for visualization for the user. The output we take from the optical flow is just the first two channels. These two channels are then passed to the video classification file. I have further modified the code to add the [-20, 20] truncation. Have a look at that as well. Attaching the updated file here: convert_to_flow.ipynb.zip
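A possible explanation for the static gray background, sketched below under assumptions: if the flow is truncated to [-20, 20] and mapped to pixel values with that fixed range, zero motion always lands on mid-gray, whereas normalizing each frame by its own min/max moves the zero point from frame to frame. The helper names `flow_to_gray_image` and `minmax_to_image`, the constant third channel, and the toy frames are hypothetical and are not taken from the notebook or from the original authors' pipeline.

```python
import numpy as np

FLOW_BOUND = 20.0  # truncation bound mentioned in this thread


def flow_to_gray_image(flow, bound=FLOW_BOUND):
    """Map an (H, W, 2) flow field to an (H, W, 3) uint8 image, fixed range.

    Assumption: the third channel is filled with a constant 128 purely so
    the result can be viewed as an RGB image.
    """
    flow = np.asarray(flow, dtype=np.float32)
    clipped = np.clip(flow, -bound, bound)
    # Linear map [-bound, bound] -> [0, 255]; zero flow lands on mid-gray.
    scaled = (clipped + bound) * (255.0 / (2.0 * bound))
    img = np.full(flow.shape[:2] + (3,), 128, dtype=np.uint8)
    img[..., :2] = np.round(scaled).astype(np.uint8)
    return img


def minmax_to_image(flow):
    """Same mapping, but scaled by this frame's own min/max (for contrast)."""
    flow = np.asarray(flow, dtype=np.float32)
    lo, hi = float(flow.min()), float(flow.max())
    scaled = (flow - lo) / max(hi - lo, 1e-6) * 255.0
    scaled = np.clip(scaled, 0.0, 255.0)
    img = np.full(flow.shape[:2] + (3,), 128, dtype=np.uint8)
    img[..., :2] = np.round(scaled).astype(np.uint8)
    return img


# Two toy frames: static background, small patch moving in opposite directions.
frame_a = np.zeros((4, 4, 2), dtype=np.float32)
frame_a[:2, :2] = 5.0
frame_b = np.zeros((4, 4, 2), dtype=np.float32)
frame_b[:2, :2] = -5.0

# Background pixel (bottom-right corner) under each scheme.
print(flow_to_gray_image(frame_a)[3, 3], flow_to_gray_image(frame_b)[3, 3])
print(minmax_to_image(frame_a)[3, 3], minmax_to_image(frame_b)[3, 3])
```

With the fixed range, the background pixel is the same gray in both frames; with per-frame min/max it changes, which would look like a flickering background in a gif.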

Nov 22 '19 11:11 jahab

@javithe7 The color is, I believe, only for visualization for the user. The output we take from the optical flow is just the first two channels. These two channels are then passed to the video classification file. I have further modified the code to add the [-20, 20] truncation. Have a look at that as well. Attaching the updated file here: convert_to_flow.ipynb.zip

Hello, I found the norm_flow function in your code and used it to normalize my optical flow. But the predictions from the flow model still differ from the sample predictions.
I also noticed that the provided v_CricketShot_g04_c01_flow.npy has min_value = -0.46 and max_value = 0.328, not -1.0 to 1.0. Maybe they use other preprocessing methods?

Top 5 classes and associated probabilities: (RGB)
[playing cricket]: 9.999976E-01
[catching or throwing baseball]: 8.820192E-07
[playing kickball]: 5.095237E-07
[hitting baseball]: 2.219674E-07
[catching or throwing softball]: 1.566631E-07

Top 5 classes and associated probabilities: (Flow)
[rock climbing]: 4.467638E-01
[grinding meat]: 3.675056E-01
[driving car]: 6.612947E-02
[pushing car]: 1.817184E-02
[water sliding]: 1.372696E-02

===== Final predictions ====
logits proba class
2.343790e+01 9.997378e-01 playing cricket
1.332426e+01 4.051264e-05 faceplanting
1.314688e+01 3.392776e-05 kicking soccer ball
1.278005e+01 2.350939e-05 skateboarding
1.273522e+01 2.247873e-05 pushing car

May 13 '20 08:05 Jockey721

@Jockey721

I also noticed that the provided v_CricketShot_g04_c01_flow.npy has min_value = -0.46 and max_value = 0.328, not -1.0 to 1.0. Maybe they use other preprocessing methods?

I was also confused about this, but after further investigation into the mediapipe repository to find the flow preprocessing, it looks like they rescale the flow data using -20 and 20 as the min/max, instead of the min/max of the actual flow data.
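To make the difference concrete, here is a minimal NumPy sketch of the two rescaling schemes, assuming the flow is an array of per-pixel displacements. The function names and the toy values are hypothetical; the raw values are chosen only so that the fixed-range result has roughly the same extremes as the reported .npy, and they are not read from the real sample.

```python
import numpy as np


def rescale_flow_fixed(flow, bound=20.0):
    """Clip to [-bound, bound], then divide by bound so values lie in [-1, 1]."""
    flow = np.asarray(flow, dtype=np.float32)
    return np.clip(flow, -bound, bound) / bound


def rescale_flow_minmax(flow):
    """Normalize by the data's own min/max to [-1, 1] (shown for contrast)."""
    flow = np.asarray(flow, dtype=np.float32)
    lo, hi = float(flow.min()), float(flow.max())
    return 2.0 * (flow - lo) / max(hi - lo, 1e-6) - 1.0


# Hypothetical raw displacements in pixels, NOT taken from the real sample.
raw = np.array([[-9.2, 0.0], [3.0, 6.56]], dtype=np.float32)

fixed = rescale_flow_fixed(raw)
minmax = rescale_flow_minmax(raw)

print("fixed range :", fixed.min(), fixed.max())
print("data min/max:", minmax.min(), minmax.max())
```

With the fixed range, a clip whose largest displacements are well under 20 pixels never reaches -1 or 1, which would be consistent with the extremes of about -0.46 and 0.328 seen in the sample file. Data min/max normalization always stretches the values to the full interval, so the two inputs differ in scale.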

Jan 27 '21 19:01 mrdaly
