producer_avformat performs useless work when source is intra-only and consumer framerate is lower than the source's framerate
This may or may not be considered a bug, but with ffmpeg one can set -thread_type frame -threads 16 to speed up intra-only decoding, especially for slow codecs like JPEG2000. producer_avformat.c happily passes thread_type along, but here it is purely detrimental: libavcodec decodes multiple frames at once as instructed, yet producer_avformat only ever uses the first one. Playing more nicely with libavcodec's threading would enable better handling of heavy-to-decode formats like JPEG2000. I see that some work has recently been done to make melt work better with intra-only sources, and supporting frame threading would be one way to make it work even better.
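For reference, and purely as a minimal sketch of the libavcodec API rather than actual producer_avformat code, -thread_type frame -threads 16 amounts to roughly this on the decoder context:

```c
#include <libavcodec/avcodec.h>

/* Minimal sketch (not producer_avformat code): what "-thread_type frame
 * -threads 16" boils down to on the decoder context. With FF_THREAD_FRAME the
 * decoder keeps many frames in flight, which is exactly the work that gets
 * wasted when most of the decoded frames are then thrown away. */
static int open_decoder_with_frame_threads(AVCodecContext *ctx, const AVCodec *codec)
{
    ctx->thread_type  = FF_THREAD_FRAME; /* decode whole frames in parallel */
    ctx->thread_count = 16;              /* 0 would let libavcodec pick */
    return avcodec_open2(ctx, codec, NULL);
}
```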
Update: this only happens when the output framerate is lower than the input framerate, for example when putting a 50 fps source into a 25 fps project, and it gets worse the lower the output framerate is. I added a bit of code tracking how many packets are sent to the decoder, and for a 50 Hz intra-only input rendered at 10 Hz with in=0 out=199, a grand total of 2990 (!) packets are sent to the decoder: almost 15x overhead compared to the 200 frames actually needed.
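The counting itself was nothing fancy; roughly this kind of thing (the names are mine, not anything in producer_avformat.c), wrapped around the call that hands packets to the decoder:

```c
#include <libavcodec/avcodec.h>

/* Rough sketch of the instrumentation (names are mine): count every packet
 * handed to the decoder so the overhead versus the number of frames the
 * consumer actually asked for becomes visible. */
static long packets_sent_to_decoder = 0;

static int send_packet_counted(AVCodecContext *ctx, const AVPacket *pkt)
{
    packets_sent_to_decoder++;
    return avcodec_send_packet(ctx, pkt);
}
```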
Initially I had accidentally run @sirf's fork when reporting this, but a very similar issue exists here upstream. producer_avformat actually does the right thing when the output framerate is greater than or equal to the input framerate.
Example command lines:
melt V75UPPSNACK_LORDAG_P01_195335.mov in=0 out=199 -consumer avformat:/tmp/foo.avi -> 209 packets sent to decoder. Nine extra packets decoded = no biggie
melt V75UPPSNACK_LORDAG_P01_195335.mov in=0 out=199 -consumer avformat:/tmp/foo.avi frame_rate_num=10 frame_rate_den=1 -> 2990 packets sent to decoder. Definitely an issue.
Versions: current melt master (49fcfd398dae1407cab62c04d4d9e8b950c6712d) and current ffmpeg master (249c66bb225b0671434b3ce9cc3f7935a229f428).
Oh, and I have an idea for fixing this: don't bother sending packets to the decoder whose results we know won't be used. This can be done by inspecting the packet's pts. That obviously only works for intra-only streams, but it would be a huge gain in many cases.
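A rough sketch of what I mean (the helper and names are mine, not actual producer_avformat code, and the timestamp handling is simplified): map each packet's pts to a frame index at the consumer's frame rate, and only send the packet when that index advances past the last frame we produced.

```c
#include <stdbool.h>
#include <stdint.h>
#include <libavcodec/avcodec.h>
#include <libavutil/mathematics.h>
#include <libavutil/rational.h>

/* Sketch only: decide whether a packet can be skipped for an intra-only
 * stream. last_out_index starts at -1 and remembers the last output frame
 * index we sent a packet for. */
static bool packet_is_needed(const AVPacket *pkt, AVRational stream_time_base,
                             AVRational out_frame_rate, int64_t *last_out_index)
{
    if (pkt->pts == AV_NOPTS_VALUE)
        return true; /* can't tell, so decode it to be safe */

    /* pts (stream time base) -> frame index at the consumer frame rate */
    int64_t out_index = av_rescale_q(pkt->pts, stream_time_base,
                                     av_inv_q(out_frame_rate));

    if (out_index <= *last_out_index)
        return false; /* lands on a frame we've already produced: skip it */

    *last_out_index = out_index;
    return true;
}
```

A packet that fails the check would just be unreffed instead of going through avcodec_send_packet; the 4/5 hack below is basically this with hard-coded counting instead of timestamps.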
I wrote an extremely ugly hack that just throws away 4 out of 5 packets and gave it a go on a lossless 4K JPEG2000 sample: 200 frames -> 205 packets sent, and the output looks as expected. So the idea of throwing away intra-only packets between the desired output frames has some legs; it just needs proper timestamp logic. It will still read the packets, which isn't great on high-bitrate files like some of the JPEG2000 samples we have, but it's better than performing useless decode work.
edit: Oh and it manages to make full use of the CPU while doing this.