[Bug] memory leak

Open HuaZheLei opened this issue 1 year ago • 7 comments

Description: When I use a `for` loop to process a sequence of videos, memory usage keeps rising.

Command:

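    from scenedetect import open_video, SceneManager, ContentDetector

    # video_lists: list of video file paths to process
    for video_path in video_lists:
        video = open_video(video_path)
        scene_manager = SceneManager()
        scene_manager.add_detector(ContentDetector(threshold=27))
        scene_manager.detect_scenes(video, show_progress=False)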

Output:

Environment:

boto3 1.34.31 1.29.1
botocore 1.34.31 1.32.1
bzip2 1.0.8 1.0.8
ca-certificates 2023.12.12 2023.12.12
click 8.1.7 8.1.7
jmespath 1.0.1 1.0.1
libffi 3.4.4 3.4.4
ncurses 6.4 6.4
numpy 1.26.3 1.26.3
objgraph 3.6.0  
opencv-python 4.9.0.80  
openssl 3.0.12 3.0.12
pip 23.3.1 23.3.1
platformdirs 4.1.0 3.10.0
python 3.10.13 3.12.1
python-dateutil 2.8.2 2.8.2
readline 8.2 8.2
s3transfer 0.10.0 0.7.0
scenedetect 0.6.2  
setuptools 68.2.2 68.2.2
six 1.16.0 1.16.0
sqlite 3.41.2 3.41.2
tk 8.6.12 8.6.12
tqdm 4.66.1 4.65.0
tzdata 2023d 2023d
urllib3 2.0.7 2.1.0
wheel 0.41.2 0.41.2
xz 5.4.5 5.4.5
zlib 1.2.13 1.2.13
**Media/Files:**

HuaZheLei avatar Feb 01 '24 09:02 HuaZheLei

How many videos are you processing roughly? Does it occur if all of the paths in video_lists are the same video? What OS/environment are you running this under, and how are you running the script?

Thank you.

Edit: I was able to verify that, on Windows x64 on my system at least, there is no memory leak with the code pattern you outlined above. This may be something specific to the environment.

Breakthrough avatar Feb 02 '24 02:02 Breakthrough

> How many videos are you processing roughly? Does it occur if all of the paths in video_lists are the same video? What OS/environment are you running this under, and how are you running the script?
>
> Thank you

Thanks for your reply.

  1. I use the script to process 30k videos.
  2. If all of the paths in video_lists are the same video, it still occurs.
  3. I see this on both Ubuntu and macOS.
  4. I just run `python3 xxx.py` in a single thread.

HuaZheLei avatar Feb 02 '24 02:02 HuaZheLei

Could you install and try another backend like 'pyav' or 'moviepy'? For example, run `pip install av` and open the video with `video = open_video(video_path, backend='pyav')`.

PySceneDetect is a pure Python library and offloads all video processing to either OpenCV, PyAV, or MoviePy. The next step is to narrow down the source of the memory leak, so trying a different backend will help greatly with that. In the meantime I will set up a VM to test this locally on Ubuntu.

Can you create and upload a small script that causes the issue? A video clip would also be helpful, so that we are testing with the exact same script and video. Thanks.
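A reproduction script along these lines might look like the sketch below. It wraps the loop body from the report, takes the backend name as a parameter, and samples the process's peak resident memory as it goes. The helper names (`peak_rss_mib`, `run_and_track`, `detect`) and the file name `repro.py` are hypothetical, and the `resource` module is only available on Unix-like systems such as Ubuntu and macOS.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_and_track(work, items, report_every=100):
    """Call work(item) for each item and sample peak RSS periodically."""
    samples = []
    for i, item in enumerate(items, start=1):
        work(item)
        if i % report_every == 0:
            samples.append((i, peak_rss_mib()))
    return samples


def detect(video_path, backend="opencv"):
    """Loop body from the report; assumes scenedetect 0.6.x is installed."""
    # Imported lazily so the helpers above work without scenedetect.
    from scenedetect import ContentDetector, SceneManager, open_video

    video = open_video(video_path, backend=backend)
    scene_manager = SceneManager()
    scene_manager.add_detector(ContentDetector(threshold=27))
    scene_manager.detect_scenes(video, show_progress=False)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 repro.py clip.mp4 [opencv|pyav|moviepy]
    path = sys.argv[1]
    backend = sys.argv[2] if len(sys.argv) > 2 else "opencv"
    for count, mib in run_and_track(lambda p: detect(p, backend), [path] * 1000):
        print(f"{count:6d} videos  peak RSS {mib:8.1f} MiB")
```

Because `ru_maxrss` is a high-water mark, it never decreases. A value that keeps climbing across thousands of iterations of the same clip suggests growth, while a value that plateaus early suggests steady usage.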

Breakthrough avatar Feb 03 '24 01:02 Breakthrough

Using the same package versions as you outlined above, on Ubuntu 22.04, I'm running all 3 different backends in a loop the same way you described above. From what I can see, memory usage for PySceneDetect+OpenCV and PySceneDetect+MoviePy is steady. As for PyAV, the memory does seem to climb at first, but it is eventually reclaimed and drops again. I don't think PySceneDetect and most of the backends in use have any memory leaks, but am happy to look further into this if you can provide a reproduction.

Breakthrough avatar Feb 04 '24 03:02 Breakthrough

I have previously had this issue with moviepy, but that was a couple of years ago and I haven't run things in a loop like this since then. I fixed it by writing results out to file occasionally and using `del` on a bunch of variables every X iterations.
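A rough sketch of that kind of workaround is shown below, assuming the per-video results are JSON-serializable. The function names, the JSON-lines output format, and the `flush_every` parameter are illustrative choices, not the code actually used back then.

```python
import gc
import json


def _flush(results, out_path):
    """Append results as JSON lines and return how many were written."""
    with open(out_path, "a", encoding="utf-8") as f:
        for r in results:
            f.write(json.dumps(r) + "\n")
    return len(results)


def process_in_batches(items, work, out_path, flush_every=500):
    """Run work(item) per item, writing results to disk every few iterations."""
    pending = []
    written = 0
    for i, item in enumerate(items, start=1):
        pending.append(work(item))
        if i % flush_every == 0:
            written += _flush(pending, out_path)
            del pending[:]   # drop references to results already on disk
            gc.collect()     # also reclaim any reference cycles
    written += _flush(pending, out_path)  # write the final partial batch
    return written
```

This bounds how much the loop itself keeps alive, but it cannot help if the growth comes from memory held inside a native library.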

wjs018 avatar Feb 06 '24 18:02 wjs018

> Using the same package versions as you outlined above, on Ubuntu 22.04, I'm running all 3 different backends in a loop the same way you described above. From what I can see, memory usage for PySceneDetect+OpenCV and PySceneDetect+MoviePy is steady. As for PyAV, the memory does seem to climb at first, but it is eventually reclaimed and drops again. I don't think PySceneDetect and most of the backends in use have any memory leaks, but am happy to look further into this if you can provide a reproduction.

Hi, I noticed that the code uses cv2.VideoCapture to open a video but never releases it. When I run it in a for loop, could that be what causes memory usage to rise?

HuaZheLei avatar Feb 07 '24 02:02 HuaZheLei

> Hi, I noticed that the code uses cv2.VideoCapture to open a video but never releases it. When I run it in a for loop, could that be what causes memory usage to rise?

The destructor of a VideoCapture object calls release(), and this applies to the Python bindings as well. That happens automatically once the object is no longer referenced, which in this loop is when `video` is rebound on the next iteration. Are you able to reproduce this with a different version of OpenCV?
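The rebinding behaviour can be illustrated with a small stand-in class. `FakeCapture` below is hypothetical and is not OpenCV's `VideoCapture`; it only mimics a handle that releases itself in its destructor. The result relies on CPython's reference counting, so it would not hold if something else kept a reference to the old object.

```python
class FakeCapture:
    """Hypothetical stand-in for a handle that frees resources when destroyed."""

    open_count = 0  # number of handles currently not released

    def __init__(self):
        FakeCapture.open_count += 1
        self.released = False

    def release(self):
        # Idempotent, so an explicit call followed by __del__ is safe.
        if not self.released:
            self.released = True
            FakeCapture.open_count -= 1

    def __del__(self):
        self.release()


def max_open_handles(iterations):
    """Rebind one name in a loop and record how many handles stay open."""
    cap = None
    most = 0
    for _ in range(iterations):
        # The new object is created first, then the old one loses its last
        # reference and is destroyed as soon as the name is rebound.
        cap = FakeCapture()
        most = max(most, FakeCapture.open_count)
    del cap  # drop the final handle
    return most
```

Under CPython, `max_open_handles(50)` observes only one open handle after each rebinding, and none remain once the function returns.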

I recently came across https://github.com/bloomberg/memray which might be useful to find the cause. A memory flame graph would be really helpful to identify what part of code is using the memory.
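As a lighter-weight first check before reaching for memray, the standard library's `tracemalloc` can compare two snapshots and list the source lines whose allocations grew. This is a different technique from a memray flame graph, and it only sees allocations made through Python's own allocators, so growth inside a native library may not show up. The `top_growth` helper is a hypothetical name for this sketch.

```python
import tracemalloc


def top_growth(work, iterations=100, limit=5):
    """Run work() repeatedly and return the lines whose allocations grew most."""
    tracemalloc.start()
    work()  # warm-up so one-time allocations are not counted as growth
    before = tracemalloc.take_snapshot()
    for _ in range(iterations):
        work()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Entries are sorted with the largest absolute size change first.
    stats = after.compare_to(before, "lineno")
    return [(str(s.traceback), s.size_diff) for s in stats[:limit]]
```

Passing a zero-argument callable that runs one detection pass returns `(location, bytes_grown)` pairs. If nothing stands out here while process memory still climbs, that points toward native allocations, which is where a tool like memray is more useful.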

Breakthrough avatar Feb 12 '24 02:02 Breakthrough

Hello, I recently wrote a synchronous stripping service, and I also have this problem. The free memory keeps decreasing.

babyta avatar Mar 11 '24 05:03 babyta

Following up on my previous comment: with `scene_manager.detect_scenes`, splitting also becomes slower and slower as execution continues.

babyta avatar Mar 12 '24 01:03 babyta