RealBasicVSR
GPU memory issue
Hi,
Thanks for sharing this code.
I tried to use my own sample video (mp4), but I ran into a GPU memory issue. Is there any restriction on the input file format or length?
This is the command I used to test:
python inference_realbasicvsr.py configs/realbasicvsr_x4.py checkpoints/RealBasicVSR_x4.pth data/test.mp4 results/demo_001.mp4 --fps=30 --max_seq_len=20
How long is your video, and what is the resolution of the video?
Hi! The video was 1080p and 1.5 hours long!
Thanks
Same issue here. My first video was 4 seconds at 1080*1920; then I tried data/demo_001.mp4, but it also failed:
RuntimeError: CUDA out of memory. Tried to allocate 4.35 GiB (GPU 0; 8.00 GiB total capacity; 4.37 GiB already allocated; 0 bytes free; 6.06 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Does that mean 8 GB of GPU memory isn't enough to run this? I restarted the computer and checked the available GPU memory, but it still doesn't work.
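Side note: the error message itself points at max_split_size_mb. Setting it via PYTORCH_CUDA_ALLOC_CONF only mitigates fragmentation and will not create memory that isn't there, but it occasionally lets a borderline run fit. A minimal sketch, with 128 MB as an arbitrary example value:

```python
import os

# Must be set before torch initializes its CUDA caching allocator
# (easiest: before importing torch). 128 is just an illustrative value; tune it.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

import torch

print(torch.cuda.get_device_name(0))
print(f"allocated: {torch.cuda.memory_allocated() / 2**20:.1f} MiB, "
      f"reserved: {torch.cuda.memory_reserved() / 2**20:.1f} MiB")
```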
@jaehyunshinML Since it is 1.5 h long, the number of frames is huge, so the network cannot handle such a long video in one go. You can set --max_seq_len to a smaller number and see whether it helps, as in here.
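For anyone wondering what that option actually buys you: it just runs the network on short chunks of the frame sequence instead of the whole video at once. This is a simplified sketch of that pattern, not the actual code in inference_realbasicvsr.py; model and frames are placeholders:

```python
import torch

def chunked_inference(model, frames, max_seq_len=20, device="cuda"):
    """Run a video SR model over `frames` (a T x C x H x W float tensor) in chunks.

    Processing at most `max_seq_len` frames at a time bounds peak GPU memory,
    at the cost of losing temporal propagation across chunk boundaries.
    """
    outputs = []
    with torch.no_grad():
        for start in range(0, frames.size(0), max_seq_len):
            chunk = frames[start:start + max_seq_len].unsqueeze(0).to(device)  # (1, t, C, H, W)
            out = model(chunk)                      # (1, t, C, 4H, 4W) for a x4 model
            outputs.append(out.squeeze(0).cpu())    # move results off the GPU immediately
            del chunk, out
            torch.cuda.empty_cache()                # release cached blocks between chunks
    return torch.cat(outputs, dim=0)
```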
@QiFuChina In this case, I guess 8 GB is not enough for that. Did you try reducing --max_seq_len as mentioned above?
Thanks. After changing --max_seq_len to a small enough value (from 24 down to 1), I can run the program with demo_001.mp4 and it works. Then I wanted to collect results for different video lengths, so I edited another video (5 min, 960*540, 158 MB) and ran it with --max_seq_len=1, but I get this error:
RuntimeError: [enforce fail at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 99532800 bytes.
Then I changed to --max_seq_len=20 and the result is:
DefaultCPUAllocator: not enough memory: you tried to allocate 55931212800 bytes.
My device info:
- OS: Windows 10
- CPU: Intel i9-10900K
- Memory: 64 GB
- GPU: RTX 3070 Ti, 8 GB
This time there is no CUDA error, just a CPU memory error, so I'm confused by these errors.
I think RAM may simply not be enough to store the frames of a 5-minute video, even when max_seq_len=1 is used. You can save the outputs as PNGs separately and convert them to an mp4 later.
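In case it helps anyone hitting the same CPU allocator error, here is one way to follow that suggestion with plain OpenCV: stream the upscaled frames to PNG files instead of holding them all in memory, then assemble the video afterwards. upscale_fn is a placeholder for whatever actually runs the model on a frame or chunk; it is not something the repo provides.

```python
import glob
import os

import cv2

def frames_to_pngs(video_path, out_dir, upscale_fn):
    """Read a video frame by frame, upscale each frame, and write PNGs to disk."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        sr = upscale_fn(frame)  # placeholder: run the SR model on this frame
        cv2.imwrite(os.path.join(out_dir, f"{idx:08d}.png"), sr)
        idx += 1
    cap.release()

def pngs_to_video(png_dir, out_path, fps=30):
    """Re-assemble the PNGs into an mp4 (ffmpeg would work just as well)."""
    files = sorted(glob.glob(os.path.join(png_dir, "*.png")))
    h, w = cv2.imread(files[0]).shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for f in files:
        writer.write(cv2.imread(f))
    writer.release()
```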
@ckkelvinchan Hey, ckkelvinchan.
I've refactored the structure of inference_realbasicvsr.py to address the memory issues and committed it to my fork. I hope you can read and test the changes and consider merging them into your repo.
- In _fixMem, --max_seq_len is used as the limit on how many images are loaded into RAM, and I added a --split parameter that splits each image for processing to work around insufficient GPU memory; I also introduced multi-threading so the stages overlap more tightly.
- _withRich adds a progress bar on top of _fixMem, which is helpful when dealing with large numbers of images.
If you decide to accept my contribution, please reply or email me. Thanks.
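For anyone curious what a spatial --split roughly looks like, here is a hypothetical sketch of the idea (not the code in the fork): cut each frame into horizontal bands, super-resolve the bands independently, then stitch them back along the height axis. Without overlap this can leave visible seams, which is exactly what comes up below.

```python
import torch

def split_inference(model, frames, num_splits=3):
    """Process a (1, T, C, H, W) clip in `num_splits` horizontal bands.

    Each band goes through the model on its own, so peak GPU memory scales
    roughly with H / num_splits instead of H. Bands are concatenated back
    along the height axis; without overlap the seams can become visible.
    """
    _, _, _, h, _ = frames.shape
    band_h = (h + num_splits - 1) // num_splits
    out_bands = []
    with torch.no_grad():
        for i in range(num_splits):
            top, bottom = i * band_h, min((i + 1) * band_h, h)
            band = frames[:, :, :, top:bottom, :].cuda()
            out_bands.append(model(band).cpu())   # (1, T, C, 4*(bottom-top), 4W) for x4
            torch.cuda.empty_cache()
    return torch.cat(out_bands, dim=3)            # stitch along the height axis
```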
@JiayunLi-3E8 Does it output exactly the same frames as the original code with your modifications? I mean, if there is motion and something moves from one part of the image into another, the model will not "see" it. For example, if the image sequence is split into two halves that are processed independently and a license plate moves from one half to the other.
@ckkelvinchan After my tests, the images processed with the --split function do show some stitching traces, but I didn't notice them until I saw your question.
Here is my test file, where the output file was transcoded with ffmpeg for a smaller size: https://wwc.lanzoub.com/b03pacymh password: e8gy
This is an optional parameter and splitting is not done by default, but it may be a chance for low-performance graphics cards.
@JiayunLi-3E8 I see - you have split it into 3 bands. Trying to optimize is a good thing; the 3070 is not low performance, it is low memory. I see another issue, though: in your output at 38.000 s (frame 2280) the nose is deformed. With the original code I run out of RAM (didn't check why), so I'll try to process the video on my card with your code, same settings but with split=1, to see whether the defect still occurs. For now it works, occupying 19758 MB of video RAM and 3.6 GB of CPU RAM, so you really enabled me to run it. The code uses several networks; if, for example, the optical flow were computed outside of the GPU, the video memory usage could be reduced. I am not sure how fast optical flow can be computed on a CPU, though; it might become a bottleneck.
Edit: I got the same defect with the nose. I tried to run BasicVSR++ on the input video, but it consumes even more CUDA memory. I need a 48 GB GPU, LoL!
I see; maybe the problem with the nose is in the model itself. --max_seq_len affects both RAM and GPU RAM, but --split affects GPU RAM only.
Hello, I have the same problem: my video file is 480*320 and 4 seconds long, and CUDA out of memory still appears. I then set max_seq_len to the minimum value and it still runs out of memory, yet the demo video file runs successfully. @ckkelvinchan @jaehyunshinML
The author's demo script is not well suited to longer videos; I'd suggest processing them with single-frame inference.