PySceneDetect
v1.0 Planned API Changes & Feedback
v1.0 Upcoming Changes & Migration Plan
The next major release of PySceneDetect will introduce significant breaking changes. Furthermore, the minimum supported version of Python will be increased to 3.5, and OpenCV 2.x support will be deprecated. Support for the v0.5.x branch will still be provided with occasional bugfixes if required; however, all new development will proceed on v1.0 once it is released.
These API changes are intended to support further development, simplify integration, and cover many more use cases by emphasizing modularity. Targeting a newer version of Python can also simplify parts of the existing codebase. Most proposed changes are to the internal PySceneDetect API (i.e. there will be only minor modification of those that appear at a high level in the quickstart example). Most expected breaking changes will occur in the SceneDetector base class, the SceneManager class, and the high-level usages thereof.
Users of the high-level API should have a relatively smooth transition; developers using the internal APIs to implement detection algorithms will require more significant changes, although a migration guide will be provided for both use cases.
All users of PySceneDetect, especially those who use the Python API, are encouraged to provide feedback on the items listed below, especially those marked as TODO. Backwards-compatibility wrappers will be provided wherever possible to ensure minimal disruption to existing programs utilizing the Python API.
New Quickstart Example
```python
from scenedetect import SceneManager
from scenedetect.detectors import ContentDetector  # Content-aware detection (detect-content via CLI)

def find_scenes(video_path, threshold=30.0):
    # Create our scene manager, then add the detector.
    scene_manager = SceneManager(video_path)
    scene_manager.add_detector(
        ContentDetector, threshold=threshold)
    # Improve processing speed by downscaling before processing.
    scene_manager.get_video_input().set_downscale_factor()
    scene_manager.detect_scenes()  # No longer required to start & reset the video manager manually.
    # Each returned scene is a tuple of the (start, end) timecode.
    return scene_manager.get_scene_list()
```
Breaking API Changes
Important new enumeration used by the changes below:
EventType
- `EventType.IN`: Fade In / Start of scene
- `EventType.OUT`: Fade Out / End of scene
- `EventType.CUT`: Change of Scene / Shot / Event
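A minimal sketch of how this enumeration could look in Python (the member values are illustrative only, not part of the proposal):

```python
from enum import Enum

class EventType(Enum):
    """Type of event produced by a detector for a given frame."""
    IN = 1    # Fade In / Start of scene
    OUT = 2   # Fade Out / End of scene
    CUT = 3   # Change of Scene / Shot / Event
```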
SceneDetector
- The `SceneDetector` method `process_frame()` shall return a list of events in the form `[(frame #, EventType), ...]` (previously it was a list of cuts in the form `[frame #, ...]`)
- Should the `SceneDetector` base class `post_process()` function be moved to a separate post-processing filter-type object? Should it remain in both?
- `SceneManager` will call the new `attach_scene_manager` method when a detector is added, to allow access to the `VideoStream` and `StatsManager` being used
- A detector may now assume, as an invariant, that both a `VideoStream` and a `StatsManager` are available in the parent/owning `SceneManager` class, if any
- `stats_manager_required()` will be removed as it is no longer needed
- TODO: How should pure-online versus offline algorithms be distinguished? By the lack of a `post_process()` function? If so, the base detector may have to be split again using multiple inheritance.
- TODO: Does `post_process()` need the final frame number/timecode?
- TODO: Write a short migration guide for existing `SceneDetector`s showing how to obtain the previous arguments.
- `process_frame()` and `post_process()` shall now return a list of tuples of the form `(frame_num (int), event_type (EventType), confidence (float))`
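A rough illustration of the new return contract. Only the tuple shape `(frame_num, EventType, confidence)` follows the proposal; the detector class, its scoring logic, and the threshold below are hypothetical:

```python
from enum import Enum

class EventType(Enum):
    IN = 1
    OUT = 2
    CUT = 3

class ExampleDetector:
    """Toy detector emitting CUT events when a per-frame score jumps."""

    def __init__(self, threshold=30.0):
        self._threshold = threshold
        self._last_score = None

    def process_frame(self, frame_num, frame_score):
        """Return a list of (frame_num, EventType, confidence) tuples."""
        events = []
        if self._last_score is not None:
            delta = abs(frame_score - self._last_score)
            if delta >= self._threshold:
                # Scale the raw delta into a [0.0, 1.0] confidence score.
                confidence = min(delta / 255.0, 1.0)
                events.append((frame_num, EventType.CUT, confidence))
        self._last_score = frame_score
        return events
```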
SparseSceneDetector
- The `SparseSceneDetector` class will be removed from the `scenedetect.scene_detector` module in favor of having the existing `SceneDetector` return an `EventType` (rather than just cuts) along with its frame number. There appears to be no use of this class outside of the currently undocumented `MotionDetector` algorithm, so this is not expected to affect any users.
MetricProvider
- Will provide one or more frame metrics stored in the StatsManager for either online or offline processing
- Detectors should instantiate the required metrics through the parent SceneManager, which will ensure no duplication of any metric providers across multiple detector instances (this also removes the requirement for any kind of global metric registry, instead allowing better code reuse within each SceneDetector)
- If only online algorithms are used, there does not need to be any cache of metrics, reducing memory consumption
- TODO: Should the metrics be retrieved through the MetricProvider instead of the StatsManager? See the previous point. This is how the design worked before offline algorithms were added in v0.5.x.
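One way the de-duplication could work is for the SceneManager to key providers by type, so each concrete provider is instantiated at most once. The class and method names below are hypothetical sketches of the proposal, not the final API:

```python
class MetricProvider:
    """Base class for objects that compute per-frame metrics."""

    def metric_keys(self):
        raise NotImplementedError

class SceneManagerSketch:
    """Sketch of provider de-duplication inside a SceneManager."""

    def __init__(self):
        self._providers = {}

    def get_metric_provider(self, provider_type):
        # Detectors request providers here; each concrete type is
        # instantiated at most once, so no global registry is needed.
        if provider_type not in self._providers:
            self._providers[provider_type] = provider_type()
        return self._providers[provider_type]
```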
SceneManager
- The event list shall now include all types of events (in/out/cut), and the call to get the scene list will turn the events into a sequence of scenes
- `get_cut_list()` will be removed, as the information it provides can now be retrieved from `get_event_list()` (by looking only at `EventType.CUT` events)
- `get_event_list()` will return a list of tuples of an `EventType` and a `FrameTimecode`, rather than a pair of IN/OUT events, to allow for greater flexibility
- `get_scene_list()` may require additional arguments to allow some kind of post-processing/filtering when generating the output scene list based on the list of detected events; this is distinct from the detection algorithm `post_process()` function, but highly related, so any feedback in that regard would be helpful
- `get_scene_list()` should no longer require passing an explicit base timecode (i.e. the argument is now optional)
- `SceneManager` will now require a `VideoManager` or other frame source upon construction, rather than delegating to `detect_scenes()`, so that detectors can access information from the `VideoManager` itself
- New `get_video_manager()` method to return a reference to the `VideoManager` the object was created with
- The constructor will now create a `StatsManager` automatically (this can be overridden with an explicit named parameter for backwards compatibility)
- New `get_stats_manager()` method to return the implicitly created `StatsManager` object
- TODO: Add example usage to documentation. Update API documentation accordingly.
StatsManager
- Add public `get_metric()` / `set_metric()` methods to allow more idiomatic calls to set/retrieve frame statistics
- Consider refactoring `get_metrics()` to return all metrics for the frame as a dict, as this object already exists in memory (i.e. make the metrics argument optional, and just return all available metrics for the frame)
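A sketch of how the proposed accessors might behave; the internal storage layout shown here is an assumption, not part of the proposal:

```python
class StatsManagerSketch:
    """Stores per-frame metrics keyed by frame number, then metric name."""

    def __init__(self):
        self._frames = {}

    def set_metric(self, frame_num, key, value):
        self._frames.setdefault(frame_num, {})[key] = value

    def get_metric(self, frame_num, key):
        return self._frames.get(frame_num, {}).get(key)

    def get_metrics(self, frame_num, keys=None):
        # With `keys` omitted, return the whole dict for the frame,
        # as suggested above.
        metrics = self._frames.get(frame_num, {})
        if keys is None:
            return dict(metrics)
        return {k: metrics[k] for k in keys if k in metrics}
```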
FrameTimecode
- ~~No changes are planned for v1.0 at this time~~ The timecode representation will be reworked so that frames start from 1 but times start from 0 (i.e. frame 1 has presentation time 0.0 seconds), which also helps with supporting variable frame rate videos (#168)
- It may be worth adding a method to VideoStream to get the current time as a float, rather than just the frame number, to support this future effort
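Under the reworked representation (frame 1 at presentation time 0.0), the conversions for constant frame rate video would look like the following sketch:

```python
def frame_to_seconds(frame_num, framerate):
    """Frame 1 has presentation time 0.0 under the new scheme."""
    return (frame_num - 1) / framerate

def seconds_to_frame(seconds, framerate):
    """Inverse conversion: presentation time 0.0 maps to frame 1."""
    return int(round(seconds * framerate)) + 1
```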
Hi all, my feedback comes from a newbie, so take it for what it's worth...
In big terms, I wouldn't change the behavior of the software, but add a complementary way of detecting stuff. Some people may want to use the timecodes as-is, and some may want to use the in/out events.
So my 1-cent answer is: if you ask either/or, aim for both (easy to say, harder to code).
> SceneManager: Add new callback argument to detect_scenes() which will be invoked whenever a new scene has been detected (#5)
> TODO: Add example usage to documentation. Update API documentation accordingly.

The more people use it, the more help you'll get. The easier it is to use, the more people will use it.

> TODO: Determine required changes to support event-based detectors. Instead of events representing a pair of timecodes, they shall instead be represented as in and out events.

Both.

> TODO: Should get_event_list() be changed to return a list of sorted in/out events, or should its return type be kept consistent as pairs of frames (thus dropping the last in event until a corresponding out event is available)?

Both.
> SparseSceneDetector shall be renamed to EventDetector to better reflect functionality

What if you split it into two functions so people choose the one they prefer to use? One as-is, one with the change you propose.
> Instead of returning a pair of frames, EventDetectors shall instead return a pair of (frame number, event type), where event type is either begin or end (these shall be made integer constants, e.g. scenedetect.event_type.begin). These changes are planned to better support live mode as well as "generator" mode, where invoking detect_scenes() on a SceneManager will return as soon as a new scene cut, or any other type of event, is detected.
> If multiple EventDetectors are combined, their in events must be AND-ed (i.e. they must both detect an in-event), and their out events should be configurable as AND or OR (default OR). TODO: Should the callback be invoked on each type of event, or only on a pair of begin/end events? My thought is on each event type to allow for greater flexibility; however, if a way to accomplish both cleanly can be developed, that would be preferred.

Yes, the more options you give users, the more likely they'll think of ways to build on top of it.
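The AND/OR combination rule being discussed could be sketched as follows (a hypothetical helper, not part of the API; each detector's events for one frame are given as a set):

```python
from enum import Enum

class EventType(Enum):
    IN = 1
    OUT = 2

def combine_frame_events(per_detector_events, out_mode="or"):
    """Combine one frame's events from several detectors.

    IN events are always AND-ed; OUT events are AND-ed or OR-ed
    depending on `out_mode` (default "or", as proposed above).
    """
    ins = [EventType.IN in events for events in per_detector_events]
    outs = [EventType.OUT in events for events in per_detector_events]
    combined = []
    if per_detector_events and all(ins):
        combined.append(EventType.IN)
    out_hit = all(outs) if out_mode == "and" else any(outs)
    if per_detector_events and out_hit:
        combined.append(EventType.OUT)
    return combined
```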
> ThresholdDetector shall be modified to be an EventDetector rather than a regular SceneDetector. Combined with the above changes, this would allow a callback to be invoked when the threshold is crossed above/below (rather than the current design, which only triggers the callback after a transition from below -> above -> below the threshold, i.e. latch on rising edge, trigger on falling edge).

Awesome! More ways of defining what the change was: region-based, color-based, sound-volume-based, ...
I hope I haven't wasted your time... I did want to give my 2 cents.
Hi @santiagodemierre;
In big terms, I wouldn't change the behavior of the software, but add a complementary way of detecting stuff. Some people may want to use the timecodes as and some may want to use the in, out events.
The end goal of these changes is definitely to support both of those use cases. Just to clarify, when you say some may want to use the timecodes and some may want to use in/out events, are you referring to the calls to get_event_list() and get_scene_list() in SceneManager?
Currently, scenes and events are represented differently internally. The goal of these changes is to represent everything as an event, and to move the logic for actually creating scenes out of cuts from the detectors to the SceneManager's get_scene_list() method (rather than having individual detectors do that logic).
There won't be any removal of functionality from the SceneManager - the existing API will be modified to support both use cases. These breaking changes just move some of the post-processing stages from the individual detection algorithms to the SceneManager, applied when you obtain the actual events/scene list.
The end goal is that SceneManager will return you a list of scenes (pairs of FrameTimecodes or frame numbers) or a list of events, as it does today - the only changes will be to some of the arguments of the existing methods.
> What if you split it into two functions so people choose the one they prefer to use? One as-is, one with the change you propose.
Sorry, could you expand a bit on this point? My idea was that the arguments you pass would dictate how the scenes get generated based on events - does this align with your thoughts?
Thanks for the feedback!
Sorry @santiagodemierre;
I also realize the changes I wrote above don't reflect the actual direction - I've revised them accordingly, my apologies! Any new feedback would be so useful, thank you!
In essence, I want to provide both - I want to give a list of events (in, out, and cut), as well as a list of timecodes. Hopefully this aligns with what you were proposing (since now the callback will be invoked in all cases - on rising edge, falling edge, and on fast cut, and there is a single base class that all detectors can now share).
Thank you!
As somebody that uses the Python API, I have a couple questions about these changes.
First, if I understand your proposed changes correctly, the change to detect_scenes won't currently have any impact on analyzing standalone videos. It will just return the total number of frames processed. The callback function, however, could be invoked even in a non-livestream use case. Similarly, will the get_scene_list function still return a list of (start_time, end_time) tuples?
I think removing the SparseSceneDetector class and post_process function makes sense. One question about how this functionality will be brought into the SceneManager class: will a post_process-like function be added and called whenever the get_scene_list, get_cut_list, or get_event_list functions are invoked? Or will it be called at the end of detect_scenes? I have never really used the ThresholdDetector much, so I am not too familiar with the need for post-processing, but I think it makes more sense to invoke the new post_process at the end of detect_scenes so that it would be possible to access the internal class variables like _cutting_list or _event_list and have them be complete even if the corresponding get function is never invoked.
One final question about these changes is the impact they will have on the StatsManager. I am guessing that the detected events for each frame will be included as a metric in the StatsManager. This would enable fast re-analysis using the new API, and means making sure that events are output in the stats file CSV. However, would it be possible for multiple events to occur on the same frame, and how would that be handled? For example, if you were using more than one detector, say a threshold and a content detector, and a fade-in was detected on the same frame as an HSV jump triggering the content detector, would two different events get written to the stats manager? Off-topic, but this is a scenario I have been thinking about for a while, because min_scene_len is defined on a per-detector basis. So, if you want to use multiple detectors, they can detect scenes independently inside that specified min_scene_len parameter, because they don't see what the other detector is doing. EDIT: I see this has already been added to the v0.6 milestone in #131.
Thanks for the feedback @wjs018 - as per our discussion in #153, I definitely need to revisit my plans for post_process. See my recent comment there; looking forward to any feedback you might have on the matter. I don't think this blocks getting in PR #198, but it definitely makes me want to revisit these API changes to support that best.
As for outputting the events to the statsfile, I don't think this is actually necessary since the determination of an event should be from the given metrics stored in the file, plus the parameters passed to the detector (i.e. the event type for a given frame should be able to be inferred from the metrics in the statsfile). If you think this isn't a good assumption to make though, then I'm definitely open to considering adding it. As you mentioned though, there are some edge cases to think through with that option, so I'd like to try and avoid it unless absolutely necessary.
Is the confidence score of the split available somewhere when using detect_scenes()?
Hi @segalinc;
While not provided by the API directly, assuming you're using the ContentDetector algorithm, you could derive this information by using a StatsManager to obtain and normalize the content_val metric (delta HSV from the previous frame). The value is a floating point difference from 0.0-255.0, so you can divide it by 255.0 to obtain a normalized score (this corresponds with the --threshold argument on the command line).
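For example, given a content_val reading already pulled from a StatsManager, the normalization described above is just a clamped division (a sketch of the arithmetic only):

```python
def normalized_confidence(content_val):
    """Normalize ContentDetector's delta-HSV metric (0.0-255.0) to [0, 1]."""
    return min(max(content_val / 255.0, 0.0), 1.0)
```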
Are you just looking for a confidence score of the scene cut itself, or the confidence of there being a split for every frame in the video? As mentioned, the latter can be obtained using a StatsManager; however, I can definitely see the use case for returning it with each split (thus accessible through SceneManager after calling detect_scenes()). This would require that process_frame() for each SceneDetector returns a list of events in the form [(frame #, EventType, confidence score), ...], but I'm open to including this change in the upcoming v0.6 release.
This implies that a kind of "detection result" class should be created, with named fields to encapsulate all of the data associated with events that detection algorithms produce, rather than just returning a tuple. Then get_event_list() could return these objects directly, and get_scene_list() would just return a pair of confidence scores for the beginning and end of each scene (resolving the ambiguity of calculating a confidence score for the scene as a whole, leaving that to the end user). Does that sound reasonable at least?
Thanks for the question/suggestion, any feedback is most welcome.
Hi,
Thanks for the detailed answer. I think a confidence score similar to what you get using ffprobe should work, so either a score for both the start and end of the cut, as you mentioned, or of the full shot. As a possible return list you could do (start, end, score, fps) so that it's also easy to convert to the timecoded type.
Keep me posted!! Thank you!
Cristina
Closing this issue out, as it's preferable to slowly push towards a more stable API rather than make a rapid breaking change like this. Each subsequent release should bring us closer to what we desire here.