
Use PySceneDetect with GPUs

Open sam09 opened this issue 5 years ago • 15 comments

Add instructions to compile pyscenedetect to use GPUs.

sam09 avatar Sep 07 '18 21:09 sam09

Hi @sam09;

I don't believe the current implementation of PySceneDetect can be GPU accelerated just yet. There is a pull request in the works, however, that may allow this by using native OpenCV methods instead of numpy. I haven't looked into using CUDA/OpenCL with the OpenCV Python module, but will definitely look into it (unless there's something for numpy I'm unaware of?).
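As a point of reference, the OpenCV Python module does expose an OpenCL path through its transparent API (UMat) without needing any custom kernels - a minimal sketch, assuming an OpenCV build with OpenCL enabled and a hypothetical `video.mp4`:

```python
# Minimal sketch: OpenCV's "transparent API" (T-API) runs standard calls on
# an OpenCL device when frames are wrapped in cv2.UMat. Assumes an OpenCV
# build with OpenCL enabled; it silently falls back to the CPU otherwise.
import cv2

print("OpenCL available:", cv2.ocl.haveOpenCL())
cv2.ocl.setUseOpenCL(True)

cap = cv2.VideoCapture("video.mp4")  # hypothetical input file
ret, frame = cap.read()
if ret:
    u_frame = cv2.UMat(frame)                           # upload to the device
    u_gray = cv2.cvtColor(u_frame, cv2.COLOR_BGR2GRAY)  # runs via OpenCL
    gray = u_gray.get()                                 # download as numpy
cap.release()
```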

Ideally the core of PySceneDetect would be written in C++, which would allow for integration with GPGPU constructs, but as the application stands that would have to be planned for a future version. That being said, I am definitely interested in pursuing this as an option.

Thanks for the submission!

Breakthrough avatar Sep 08 '18 01:09 Breakthrough

Hi @Breakthrough. First of all, apologies for the vague issue I created - it was late at night and I was really sleepy. And thanks for the detailed response.

I tried compiling OpenCV to use CUDA (some errors there). At the moment I think that is the best we can do - I don't think numpy has any option to use GPUs at all.
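In case it helps anyone debugging the build, here's a quick sketch to verify whether an installed cv2 was actually built with CUDA:

```python
# Quick check that the installed cv2 was built with CUDA. Both calls are
# standard OpenCV; getCudaEnabledDeviceCount() returns 0 on non-CUDA builds.
import cv2

print("CUDA" in cv2.getBuildInformation())       # build-time flag summary
try:
    print(cv2.cuda.getCudaEnabledDeviceCount())  # usable CUDA devices
except AttributeError:
    print("cv2.cuda module not present in this build")
```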

All things said, really great tool.

sam09 avatar Sep 08 '18 05:09 sam09

Haha no worries @sam09, it happens - thanks for the feedback.

In retrospect I would rather use OpenCL, because I would like the software to run faster on all GPUs, including Intel/AMD, not just Nvidia. That being said, I did remember PyOpenCL, which seems a lot more mature than the last time I looked at it.

This is definitely a route worth investigating, I would say, as my current plans for performance improvement hinged on rewriting parts of the core in C++ to eventually achieve multithreaded/GPGPU support. If I can go straight to GPGPU support, however, I may be able to keep everything written in Python via PyOpenCL.
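To make that concrete, here is roughly what a per-pixel frame-delta kernel looks like in PyOpenCL - a toy sketch with made-up frame data, not anything from the PySceneDetect codebase:

```python
# Toy PyOpenCL sketch: per-pixel absolute difference between two grayscale
# frames, which is the kind of work content-aware detection does per frame
# pair. Frame contents here are random placeholders.
import numpy as np
import pyopencl as cl
import pyopencl.array as cl_array

ctx = cl.create_some_context()  # picks any available OpenCL device
queue = cl.CommandQueue(ctx)

kernel_src = """
__kernel void absdiff(__global const uchar *a,
                      __global const uchar *b,
                      __global uchar *out)
{
    int i = get_global_id(0);
    out[i] = abs_diff(a[i], b[i]);  // OpenCL built-in for integer types
}
"""
program = cl.Program(ctx, kernel_src).build()

frame_a = np.random.randint(0, 256, (720, 1280), dtype=np.uint8)
frame_b = np.random.randint(0, 256, (720, 1280), dtype=np.uint8)

a_dev = cl_array.to_device(queue, frame_a.ravel())
b_dev = cl_array.to_device(queue, frame_b.ravel())
out_dev = cl_array.empty_like(a_dev)

program.absdiff(queue, (frame_a.size,), None,
                a_dev.data, b_dev.data, out_dev.data)
mean_delta = out_dev.get().mean()  # transfer back and reduce on the CPU
print(f"mean frame delta: {mean_delta:.2f}")
```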

I'll leave this issue open for discussion, and will definitely be further investigating this option. If you or anyone else has any suggestions on the matter, please feel free to share!

Breakthrough avatar Sep 15 '18 02:09 Breakthrough

@Breakthrough

> unless there's something for numpy I'm unaware of?

PyTorch can work as a GPU replacement for numpy. Their syntax is similar, and converting PyTorch tensors to and from numpy arrays works really well too. For now, though, this will only work on Nvidia GPUs.
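For instance, the sort of per-frame delta one would compute in numpy carries over almost verbatim - a sketch with made-up frames, not PySceneDetect code:

```python
# Sketch: PyTorch as a GPU drop-in for numpy-style frame math.
# The frame data is a random placeholder for illustration.
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

frame_a = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)
frame_b = np.random.randint(0, 256, (720, 1280, 3), dtype=np.uint8)

# numpy -> torch, moved to the GPU if one is present
t_a = torch.from_numpy(frame_a).to(device, dtype=torch.float32)
t_b = torch.from_numpy(frame_b).to(device, dtype=torch.float32)

delta = (t_a - t_b).abs().mean()  # same expression you'd write in numpy
print(delta.item())               # .item()/.cpu().numpy() to get back out
```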

rsomani95 avatar Jun 06 '19 09:06 rsomani95

@rsomani95 gotcha, will make a note of that, thanks.

My updated plan is to create a new tool called SceneStats, written in C++, that does all the heavy lifting PySceneDetect currently does for frame-by-frame analysis.

The idea is that PySceneDetect will invoke SceneStats to create a statsfile, with the final scene detection still being done in Python - now it just has to load the statsfile and perform some simple data analysis to look for scene cuts, instead of having to analyze the video in Python as well. I'm hoping this architecture will allow more flexibility and avoid imposing dependencies on end users who don't require them.
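To illustrate the split, the Python side would reduce to something like the following (the statsfile columns and threshold are hypothetical, since SceneStats doesn't exist yet):

```python
# Hypothetical consumer side of the SceneStats design: load a statsfile
# produced by the C++ tool and flag a cut wherever the per-frame metric
# jumps over a threshold. Column names and threshold are made up.
import csv

THRESHOLD = 30.0  # illustrative value only

def find_cuts(statsfile_path):
    cuts = []
    with open(statsfile_path, newline="") as f:
        for row in csv.DictReader(f):
            if float(row["content_val"]) >= THRESHOLD:
                cuts.append(int(row["frame_number"]))
    return cuts

print(find_cuts("video.stats.csv"))
```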

If you have any comments/suggestions in this regard, please feel free to leave them here. I'll leave this issue open until integration between PySceneDetect and SceneStats is complete, at which point this issue can be reopened in the SceneStats issue tracker instead. (I'm not sure if it makes more sense to follow a strictly GPGPU implementation, or if a pipelined multicore approach has the best performance gains - I will need to write some benchmarks once SceneStats is up and running.)

Breakthrough avatar Jun 06 '19 22:06 Breakthrough

@Breakthrough Using a C++ library to do all the heavy lifting is probably a great idea as that allows anyone to write an extension in any frontend language. I had some ad-hoc code that did something similar in CUDA and C++. Only in retrospect did I realise that most of the performance gain I achieved was from using hardware decoders to decode the video.

I think what you are proposing is pretty cool!

sam09 avatar Jun 07 '19 07:06 sam09

@Breakthrough, that sounds like a great way to go forward, both w.r.t dependencies and efficiency.

rsomani95 avatar Jun 07 '19 13:06 rsomani95

@sam09 interesting, thanks for the response - I assume by hardware decoder you mean one on the GPU? Was most of the performance gain due to not having to transfer the decoded frames from the CPU to the GPU?

@rsomani95 thanks as well for the reply!

Breakthrough avatar Jun 10 '19 00:06 Breakthrough

@Breakthrough Yes. Decoding the frames on the GPU and then processing them on the GPU itself gets rid of the memory transfer between CPU and GPU. It also frees up the CPU.

https://en.wikipedia.org/wiki/Nvidia_NVDEC
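From Python, OpenCV exposes NVDEC through its cudacodec module when built with CUDA and the NVIDIA Video Codec SDK - a rough sketch (file name made up):

```python
# Sketch of decode-on-GPU via OpenCV's cudacodec module (only present when
# OpenCV is built with CUDA and the NVIDIA Video Codec SDK). Frames come back
# as GpuMats, so later processing can stay on the device with no
# host<->device copies.
import cv2

reader = cv2.cudacodec.createVideoReader("video.mp4")
while True:
    ok, gpu_frame = reader.nextFrame()  # cv2.cuda_GpuMat, still on the GPU
    if not ok:
        break
    # cudacodec decodes to BGRA, so convert on-device before any metric
    gray = cv2.cuda.cvtColor(gpu_frame, cv2.COLOR_BGRA2GRAY)
    # ...per-frame metric computed on-device here...
```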

sam09 avatar Jun 10 '19 05:06 sam09

Closing this issue as won't-fix, since the current plan for performance improvements and optimization is the SceneStats project.

If you have any comments/suggestions in this regard, please feel free to leave them here, or create a new issue in the SceneStats repository referencing this one.

Breakthrough avatar Aug 04 '19 04:08 Breakthrough

CuPy, a NumPy-compatible array library that runs on the GPU, might help you @Breakthrough!
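For example, the usual numpy-style frame delta needs little more than an import swap - a sketch with made-up frame data:

```python
# Sketch: CuPy as a near drop-in for numpy. The same array expression runs
# on the GPU; frame contents are random placeholders.
import numpy as np
import cupy as cp

frame_a = cp.asarray(np.random.randint(0, 256, (720, 1280), dtype=np.uint8))
frame_b = cp.asarray(np.random.randint(0, 256, (720, 1280), dtype=np.uint8))

delta = cp.abs(frame_a.astype(cp.float32) - frame_b.astype(cp.float32)).mean()
print(float(delta))  # float()/cp.asnumpy() to bring the result back
```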

flavienbwk avatar Jul 03 '20 23:07 flavienbwk

@flavienbwk that most definitely is a game changer, thanks for letting me know! I'll keep this issue reopened and in the backlog for the time being. I'd still like to pursue the SceneStats project by writing the core in C++, but I don't want to discount CuPy as an option either (it definitely seems valuable if it can achieve the same goal!).

Breakthrough avatar Jul 04 '20 19:07 Breakthrough

https://github.com/sam09/shot-detector - something I wrote for my use case some time back. It works with CUDA <= 7.0.

sam09 avatar Jul 08 '20 10:07 sam09

@sam09 Though it's not usable from Python, it may inspire @Breakthrough.

flavienbwk avatar Jul 08 '20 10:07 flavienbwk

@flavienbwk Ah yes. It's in C++. In case anybody was looking for a GPU solution. :smiley:

sam09 avatar Jul 08 '20 11:07 sam09

As an update on this, there are no plans to do scene detection on the GPU. There have been plenty of optimizations since this issue was first opened, so most use cases should be covered now. I pursued GPU support for a similar project of mine (DVR-Scan, which supports CUDA), but I don't think it's necessary for PySceneDetect.

Breakthrough avatar Jul 17 '23 00:07 Breakthrough