
WIP: classes/waveform.py: don't process every frame

Open srwalter opened this issue 5 years ago • 13 comments

Based on the use of "seconds" in the local variable names, I gather that the intention of this function is to take a few audio samples from each second of data. However, what it was actually doing was taking that many samples from each frame of data, which is far more data and made the process take an unreasonably long time.
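The mismatch can be sketched with some quick arithmetic. This is purely illustrative; the function names are hypothetical, not taken from classes/waveform.py:

```python
# Illustrative sketch of the per-second vs per-frame mismatch described above.
def samples_intended(duration_s, samples_per_second=20):
    """What the code meant to read: a few samples per SECOND of audio."""
    return duration_s * samples_per_second

def samples_actually_read(duration_s, fps, samples_per_second=20):
    """What it effectively did: that many samples per FRAME instead."""
    return int(duration_s * fps) * samples_per_second

# A 1-hour clip at 30 fps: 72,000 intended reads balloon to 2,160,000.
print(samples_intended(3600))            # 72000
print(samples_actually_read(3600, 30))   # 2160000
```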

#3568

srwalter avatar Jun 16 '20 21:06 srwalter

This simplification will process every N-th frame instead of every N-th sample.

For example, for 20 samples per second and a 20 fps source, we need to take 1 sample from each frame. But this algorithm will take all 20 samples from only every 20th frame.
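A toy model of the two strategies makes the difference concrete. Each pick is a (frame_index, sample_within_frame) pair; this is illustrative only, not the actual waveform.py code:

```python
# 20 fps source, 20 samples per second needed.
FPS = 20
SAMPLES_PER_SEC = 20

def spread_picks():
    """One sample from every frame: picks evenly spread over the second."""
    return [(f, 0) for f in range(FPS)]

def clustered_picks():
    """All 20 samples from every 20th frame: same count, but clustered."""
    return [(f, s) for f in range(0, FPS, FPS) for s in range(SAMPLES_PER_SEC)]

# Both read 20 samples, but they cover very different sets of frames.
assert len(spread_picks()) == len(clustered_picks()) == 20
assert len({f for f, _ in spread_picks()}) == 20    # 20 distinct frames
assert len({f for f, _ in clustered_picks()}) == 1  # only frame 0
```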

Yeah, the feature itself is slow, and since a 32767 px width is the upper limit for displaying anything on the Timeline (https://github.com/OpenShot/openshot-qt/issues/652), there is no sense in using it at all for large files, where generating the waveform image takes too much time.

SuslikV avatar Jun 17 '20 05:06 SuslikV

Correct, with this change it takes the samples from one frame each second, rather than distributed evenly throughout the second. From my experimentation, GetFrame appears to be quite expensive, especially on a video (versus WAV) clip, while getting multiple samples within a single frame is comparatively cheap.
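A rough cost model captures the observation: frame decoding dominates, while reading extra samples from an already-decoded frame is nearly free. The unit costs below are invented for illustration:

```python
# Assumed, illustrative costs: decoding a frame (GetFrame) is expensive,
# reading one sample from a decoded frame is cheap.
GETFRAME_COST = 1000  # arbitrary units per decoded frame
SAMPLE_COST = 1       # arbitrary units per sample read

def total_cost(frames_decoded, samples_read):
    return frames_decoded * GETFRAME_COST + samples_read * SAMPLE_COST

spread = total_cost(frames_decoded=20, samples_read=20)    # one sample per frame
clustered = total_cost(frames_decoded=1, samples_read=20)  # all from one frame
assert clustered < spread  # 1020 vs 20020 units per second of audio
```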

I'm not sure I follow your statement about 32767 width.

srwalter avatar Jun 17 '20 14:06 srwalter

I'm not sure I follow your statement about 32767 width...

For short clips it doesn't take much time to create a new waveform. But for long clips it is unusable, because the resulting image is not shown.

SuslikV avatar Jun 17 '20 16:06 SuslikV

What do you consider a long clip? Locally I'm using a 2.5 hour long clip and it does display. Are you saying that the fix for #652 will further break that? That sounds like a bug in and of itself, so why bring it up in this discussion?

srwalter avatar Jun 17 '20 16:06 srwalter

I'm using a 2.5 hour long clip and it does display...

At a scale of less than 15 sec? That's hard to believe; have you never seen that bug before? Usually, to cut video in OpenShot you need at least 5 sec (per 100 px), or you may miss the spot.

that sounds like a bug in and of itself, so why bring that up in this discussion?..

What I mean is that this feature doesn't deserve attention. Yes, it is slow, but improving it will only give users false hope that it can work and that they can rely on it. Right now it is so slow that everyone who has tried it at least once will never do it again (or will think twice and choose Audacity for audio edits).

SuslikV avatar Jun 17 '20 17:06 SuslikV

Okay but why wouldn't you want this feature to be good?

srwalter avatar Jun 17 '20 17:06 srwalter

The scale doesn't seem to have anything to do with it. I clicked "Display -> Waveform" at 15s scale. When it finished rendering, I could zoom in all the way to 1s and it was perfectly responsive. Similarly I could zoom all the way out to see the entire 2h40min clip at once and it had no problem with this.

srwalter avatar Jun 17 '20 18:06 srwalter

I clicked "Display -> Waveform" at 15s scale. When it finished rendering, I could zoom in all the way to 1s

@srwalter after zooming in, scroll the Timeline to the end of the clip to find the point where the image (waveform) actually ends.

SuslikV avatar Jun 18 '20 12:06 SuslikV

Let's look at the algorithm.

For example, we have audio at 48000 samples per second. The stream should be read straight through, without skipping, until it ends. The maximum number of samples needed to build any waveform is about 100 per second (the minimum zoom of 1 sec = 100 px is a hardcoded value in OpenShot). Thus, only every 48000/100 = 480th sample of the uncompressed audio needs to be cached. For, say, 72 hours of video that is about 25,920,000 cached samples (roughly 25 MB at one byte each). On screen, only 1024...4096 bytes of this array need to be displayed at any time. And not all videos are 72 h long...
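The proposed caching scheme can be sketched in a few lines, using the numbers from the comment (48000 Hz audio, 100 px per second at maximum zoom); this is a sketch of the idea, not code from the project:

```python
# Keep only enough samples to draw the waveform at maximum zoom.
SAMPLE_RATE = 48000
MAX_ZOOM_PX_PER_SEC = 100  # hardcoded 1 sec = 100 px limit in OpenShot

stride = SAMPLE_RATE // MAX_ZOOM_PX_PER_SEC  # cache every 480th sample
seconds = 72 * 3600                          # a 72-hour recording
cached_samples = seconds * MAX_ZOOM_PX_PER_SEC

print(stride)          # 480
print(cached_samples)  # 25920000 (~25 MB at one byte per sample)
```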

SuslikV avatar Jun 18 '20 12:06 SuslikV

Okay, I see what you're saying about the timeline being truncated at high zoom. It appears that's true not just for the waveform, but even for the timeline scale!

Perhaps it would be helpful to explain how I'm using this feature, and why I didn't notice this in my actual use. I'm just using the waveform to find areas of silence to cut out of the video. I never zoomed in that close because I would just drop the playhead just before (or after) silence, and then watch/listen until I found a good cut point.

srwalter avatar Jun 18 '20 14:06 srwalter

```python
for t in range(1, int(clip.Reader().info.duration)):
```

duration == time in seconds
video_length == number of frames

I'm not sure this does what you think.
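For context: the quoted loop iterates over seconds, but per the comment GetFrame() indexes frames. A hypothetical corrected mapping multiplies by fps; the variable names below are illustrative, not the real OpenShot call sites:

```python
# Convert each whole second of the clip into a frame number.
fps = 30.0
duration = 10.0                     # info.duration is in seconds
video_length = int(duration * fps)  # number of frames

# One frame per second of the clip, expressed as frame numbers:
frames = [int(t * fps) for t in range(1, int(duration))]
assert frames == [30, 60, 90, 120, 150, 180, 210, 240, 270]
assert all(f <= video_length for f in frames)
```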

jonoomph avatar Oct 16 '20 07:10 jonoomph

Merge conflicts have been detected on this PR, please resolve.

github-actions[bot] avatar Jun 29 '22 23:06 github-actions[bot]

Merge conflicts have been detected on this PR, please resolve.

github-actions[bot] avatar Jul 20 '22 07:07 github-actions[bot]

Closing, as waveform extraction has been improved since this PR was created

jonoomph avatar Jan 10 '23 22:01 jonoomph