Luma Histogram Detector

Open ash2703 opened this issue 2 months ago • 9 comments

This PR introduces several key enhancements to the histogram-based scene change detection feature from #295, based on the discussion in #53. It leverages OpenCV to improve performance and accuracy significantly. The changes aim to make the detection algorithm less sensitive to non-essential variations, such as lighting and scale changes, so that it focuses on meaningful differences in scene content.

    Command: 
        detect-hist
    
    Arguments:
        --threshold, -t
            Threshold (float) that must be exceeded to trigger a cut.
        
        --min-scene-len, -m
            Same as other detectors

Key Changes:

  • Luma Channel Utilization: Switched the histogram calculation to operate solely on the luma (Y) channel of the YCbCr color space. This focus on luma enhances the algorithm's robustness against false positives triggered by lighting variations, making the detection more reliable under different shooting conditions.

  • OpenCV Integration: Replaced the previous NumPy-based methods with OpenCV functions, which are specifically optimized for image processing tasks. This simplified the code and made the new approach at least 10 times faster than the prior implementation.

  • Histogram Normalization: Introduced normalization of histograms to address potential issues related to changes in intensity and image scaling. Normalizing histograms ensures that comparisons are based on the distribution of pixel intensities rather than their absolute counts, which is crucial for consistent scene change detection across varying frame conditions.

  • Comparison Methodology: Modified the histogram comparison strategy from a simple intensity difference check to a distribution-based comparison. This method uses the correlation coefficient to evaluate the similarity between histograms.

Benefits

  • Performance: The move to OpenCV makes this at least 10x faster (compared on a few videos so far, pending a benchmarking dataset if one is available).
  • Accuracy: By focusing on the luma component and normalizing histograms, the detection algorithm is less likely to be sensitive to minor lighting changes or camera adjustments.

Suggestions: Incorporate an EMA (exponential moving average) of the histogram and compare the current frame's histogram with the EMA. This would smooth out short-term fluctuations in videos, filtering out minor variations due to noise, camera adjustments, or temporary shifts in lighting conditions.
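One way the EMA suggestion could look is sketched below. The class name `HistogramEMA`, the smoothing factor `alpha`, and the use of 1 minus the Pearson correlation as the difference are assumptions for illustration, not part of the PR.

```python
import numpy as np


class HistogramEMA:
    """Compare each histogram against an exponential moving average (EMA)."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha    # weight given to the newest histogram (assumed value)
        self.average = None   # running average, initialized on the first frame

    def step(self, hist) -> float:
        """Return 1 - correlation with the EMA, then fold `hist` into the EMA."""
        hist = np.asarray(hist, dtype=np.float64).ravel()
        if self.average is None:
            self.average = hist.copy()
            return 0.0
        # Note: np.corrcoef yields NaN if either input is perfectly flat.
        diff = 1.0 - float(np.corrcoef(hist, self.average)[0, 1])
        self.average = self.alpha * hist + (1.0 - self.alpha) * self.average
        return diff
```

A smaller `alpha` makes the average slower to react, which suppresses brief flicker but also delays adaptation after a genuine cut. Resetting the average when a cut is detected is one possible way to handle that.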

ash2703 avatar Apr 17 '24 05:04 ash2703

These timings are on albanie/shot-detection-benchmarks:

gameshow.mp4: 1.51 s ± 19.3 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

football.mp4: 2.15 s ± 22 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)

movie.mp4: 984 ms ± 343 µs per loop (mean ± std. dev. of 3 runs, 1 loop each)

The benchmarking was run on an Apple M3 Pro. I will run the other methods to produce a comparable benchmark.

ash2703 avatar Apr 17 '24 12:04 ash2703

I need advice on the thresholding logic. I'm thinking of going beyond a hardcoded value: is there any way to observe the histogram comparison and do adaptive thresholding based on past values?

I tried something like observing the mean and std. dev. of the historical diff values and triggering a scene change if the current difference deviates from the mean by more than k standard deviations. If you have already worked on anything similar, please suggest an approach.
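The mean and standard deviation idea could be sketched as below. The class name, window size, `k`, and the `min_sigma` floor are all hypothetical choices for illustration, not values taken from the PR.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveThreshold:
    """Flag a diff that exceeds the recent mean by more than k std. devs."""

    def __init__(self, window=30, k=3.0, min_history=5, min_sigma=1e-3):
        self.history = deque(maxlen=window)  # recent non-cut diff values
        self.k = k
        self.min_history = min_history       # frames needed before triggering
        self.min_sigma = min_sigma           # floor so a flat history is not hypersensitive

    def is_cut(self, diff: float) -> bool:
        triggered = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            sigma = max(pstdev(self.history), self.min_sigma)
            triggered = diff > mu + self.k * sigma
        if triggered:
            # Start a fresh baseline for the new scene.
            self.history.clear()
        else:
            self.history.append(diff)
        return triggered
```

Clearing the history on a cut means no new cut can fire until `min_history` frames have passed, which overlaps with what `--min-scene-len` already does and may or may not be desirable.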

ash2703 avatar Apr 22 '24 14:04 ash2703

Re: thresholding, this comment here may be of interest: https://github.com/Breakthrough/PySceneDetect/issues/35#issuecomment-1208573501

Hopefully that is of some value to you. In the meantime, as long as the algorithm gives comparable performance to the existing ones, we can work on that in a follow-up. I've added a slightly more robust filter in the develop branch I plan on integrating with this detector once your PR lands, but it's nothing fancy nor does it use any kind of statistical methods.

If you need assistance with that, I would be glad to help out. It might be worth hacking into the detector a way to export the raw data into a statsfile so it can be plotted for later analysis. Other detectors do this using a StatsManager.
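As a stopgap for plotting, the raw per-frame values could be dumped with the standard `csv` module. This sketch is a stand-in, not the StatsManager API, and the column names `frame` and `hist_diff` are hypothetical.

```python
import csv
import io


def stats_to_csv(rows, header=("frame", "hist_diff")) -> str:
    """Serialize (frame_number, metric) pairs to CSV text for later plotting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()
```

The returned text can be written to a file and loaded into any plotting tool to see how the histogram difference behaves around known cuts.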

Breakthrough avatar Apr 23 '24 00:04 Breakthrough

Got a chance to look through the code, but I can't actually test it for another day or two. Couple thoughts:

  • I really like the use of a correlation between sequential histograms as a way to normalize resolution differences.
  • The CLI needs to be updated to reflect the new options (I can work on this if needed).
  • Some other updates to work on would be docs and unit tests.

Something to note is that this requires OpenCV. So, if the user is intending on using a different backend like PyAV, then they would still need OpenCV installed if they intended on using this detector. Not sure if we want to put in a check to make sure OpenCV is available as part of this or even have a fallback method that doesn't rely on OpenCV (but is much slower). Also, this would need to be called out in the docs.
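If such a check were added, a minimal sketch might look like this. The function names and the error message are assumptions, not existing PySceneDetect code.

```python
import importlib.util


def opencv_available() -> bool:
    """Return True if the `cv2` module can be found in this environment."""
    return importlib.util.find_spec("cv2") is not None


def require_opencv(feature: str = "detect-hist") -> None:
    """Raise a descriptive error if OpenCV is missing (hypothetical helper)."""
    if not opencv_available():
        raise ImportError(f"{feature} requires OpenCV (the cv2 module) to be installed.")
```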

wjs018 avatar Apr 24 '24 04:04 wjs018

PySceneDetect now requires you to have OpenCV installed. Plenty of other detectors and functionality already require it (e.g. detect-content, save-images), so this shouldn't be too much of an issue.

I'm curious how this detector will handle more difficult material like over/underexposures. I suspect it will handle fast camera movement quite well though... Wanted to mention this just to start thinking about some ideas about if and how the histogram differences should be used/filtered (e.g. like detect-adaptive).

Breakthrough avatar Apr 27 '24 00:04 Breakthrough

Unrelated to this PR, but something that may be of interest to you both: I've gotten different results on ARM64 (Apple M-chips) and x64 for some OpenCV and NumPy operations. They don't differ much, but can cause some test cases to fail under different targets.

I'm not sure if this has to do with differences in how the video is decoded (e.g. if there are small differences in the decoded frames), floating point rounding errors (unlikely but possible), bugs in the binary distributions, or something else. I've made a note to see if I can reproduce this under Linux on an arm64 emulator, but thought it was interesting nonetheless. I am not really familiar with the M1/M3 chips, but would love to hear any ideas about why it's happening, and if it can be compensated for.
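One possible way to compensate in tests, assuming the differences really are small numeric deviations in the per-frame metrics, is to compare against expected values with a tolerance instead of exact equality. The helper name and the tolerance values below are assumptions.

```python
import math


def metrics_match(expected, actual, rel_tol=1e-5, abs_tol=1e-8) -> bool:
    """Return True if two sequences of per-frame metrics agree within tolerance."""
    return len(expected) == len(actual) and all(
        math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, a in zip(expected, actual)
    )
```

This would not help if the decoded frames differ enough to move a value across the threshold, since the detected cut itself would then change.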

Breakthrough avatar Apr 28 '24 01:04 Breakthrough

Sorry, I did not get time to look into this. I will try to add some more commits by this weekend.

"it will handle fast camera movement quite well though"

Yes, this works really well on fast movements, but I also see some sensitivity to shaky camera footage. I'm not sure if this is of concern in professional videos, but it was concerning for my use case around UGC videos.

ash2703 avatar Apr 29 '24 18:04 ash2703

I was unaware that the Y channel isn't as affected during changes in lighting (Edit: how does that compare with HSL?).

In both YCbCr and HSL, the Y and L channels are designed to isolate lighting, but in practical use cases YCbCr more closely reflects the human visual system's varying sensitivity to different colors, emphasizing brightness details over color details.

To verify, you can load an image, convert it to each color space, and vary the Y and L channels. In HSL you will observe that not just the brightness but also the color saturation varies.

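import cv2
import numpy as np

def adjust_channel(data, channel_index, factor):
    """Scale one channel of an 8-bit image by `factor`, clipped to [0, 255]."""
    adjusted = data.copy()
    adjusted[:, :, channel_index] = np.clip(adjusted[:, :, channel_index] * factor, 0, 255)
    return adjusted

# Factors for adjustments
factors = [0.2, 0.5, 1, 1.5, 1.7]

# "example.jpg" is a placeholder path; substitute any image on disk.
image_ycrcb = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2YCrCb)

# Vary the Y channel (index 0); for HSL, convert accordingly and pass the index of L.
adjusted_images = [adjust_channel(image_ycrcb, 0, f) for f in factors]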

ash2703 avatar Apr 29 '24 19:04 ash2703

Awesome, good to know, thanks for sharing! Do you think you'll be able to make the requested changes for this PR this weekend? I'd like to get this landed as soon as possible so we can start integrating and testing.

If not no worries - I'm happy to accept this PR as-is and fix the build in a follow-up, just let me and @wjs018 know how you would like to proceed. Really appreciate your help on this.

Breakthrough avatar May 10 '24 00:05 Breakthrough