videocr-PaddleOCR Docker: gpu is not detected

Open jalpianissimo opened this issue 7 months ago • 1 comments

Hi, I wanted to use your program with Docker on Linux, but I ran into the following problems:

Issue 1:

Following the PaddleOCR environment setup instructions, I used the paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7 image, then installed paddlepaddle-gpu, paddleocr and videocr-PaddleOCR. When I tried to run videocr as in the Colab example, I got the following error:

ImportError: cannot import name 'shadow_var_between_sub_programs' from 'paddle.distributed.passes.pass_utils'

I fixed it by downloading the latest pass_utils.py from the PaddleOCR repo; after that, everything worked.

Issue 2:

Afterward I noticed that OCR was running on the CPU: nvidia-smi showed no active GPU usage, and running the following printed False:

import paddle
gpu_available = paddle.device.is_compiled_with_cuda()
print("GPU available:", gpu_available)
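For anyone reproducing this, note that is_compiled_with_cuda() reports whether the installed Paddle build has CUDA support, not whether a GPU is physically visible, which is why it can flip to False after a package install. A minimal sketch of a slightly more informative check follows; the function name paddle_gpu_report is hypothetical, and the import is guarded so the script also runs where Paddle is missing:

```python
def paddle_gpu_report():
    """Summarize the installed Paddle build without crashing if Paddle is absent."""
    try:
        import paddle
    except ImportError:
        return {"installed": False, "compiled_with_cuda": None, "device": None}
    return {
        "installed": True,
        # True only if this Paddle build was compiled with CUDA support.
        "compiled_with_cuda": paddle.device.is_compiled_with_cuda(),
        # Device Paddle currently selects, e.g. "cpu" or "gpu:0".
        "device": paddle.device.get_device(),
    }


if __name__ == "__main__":
    print(paddle_gpu_report())
```

Running it after each install step (fresh container, after paddlepaddle-gpu, after videocr) should show at which step the build changes.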

Since I had already installed the NVIDIA Container Toolkit, I started a new container from the same image to find out whether the problem was in the image or somewhere else:

docker stop ppocr
docker container remove ppocr
docker image ls -a
docker image rm [ID]
sudo docker run --gpus all --name ppocr -v $PWD:/paddle --shm-size=64G --network=host -it paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7 /bin/bash

I immediately ran the Python snippet above and it returned True. Next I installed videocr-PaddleOCR and checked again; this time it returned False.

I then repeated the steps on a newer image pulled from Docker Hub (paddlepaddle/paddle:2.6.1-gpu-cuda12.0-cudnn8.9-trt8.6), checking the GPU after starting the container, after installing paddlepaddle-gpu, and after installing videocr. The results were the same as before (but this time without the ImportError). So paddleocr alone works on the GPU, but after installing videocr it no longer does.

As I understand very little about programming, my solution (to get videocr running on the GPU) is as follows:

Start a Docker container with paddlepaddle/paddle:2.6.1-gpu-cuda12.0-cudnn8.9-trt8.6 as the image.
Clone this repo (git clone https://github.com/devmaxxing/videocr-PaddleOCR) and edit the requirements.txt file so that it only includes:

paddlepaddle-gpu
paddleocr==2.7.0.2
charset-normalizer==3.2.0
colorama==0.4.6
Levenshtein==0.21.1
paddle-bfloat==0.1.7
python-Levenshtein==0.21.1
PyWavelets==1.4.1
thefuzz==0.19.0

Install with python -m pip install . and run! It now uses the GPU (the first snippet returns True even after installing videocr).
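The list above was put together by hand. As a narrower alternative, here is a rough sketch that only drops the CPU-only paddlepaddle line from a requirements file and keeps everything else. It assumes the original requirements.txt contains a plain paddlepaddle entry, and the helper name drop_cpu_paddle is made up for illustration:

```python
import re


def drop_cpu_paddle(requirements_text):
    """Return the requirements text without any line that installs `paddlepaddle`."""
    kept = []
    for line in requirements_text.splitlines():
        # The package name is the leading run of name characters,
        # i.e. everything before a version specifier such as "==".
        match = re.match(r"\s*([A-Za-z0-9._-]+)", line)
        name = match.group(1).lower() if match else ""
        if name == "paddlepaddle":
            continue  # skip the CPU build; "paddlepaddle-gpu" does not match
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    with open("requirements.txt") as f:
        print(drop_cpu_paddle(f.read()))
```

This has not been tested against the repo itself, so the hand-picked list above remains the confirmed working configuration.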

Note:

I reached this conclusion by chance, but if I had to give a reason, it would be the presence of paddlepaddle in the original requirements.txt file: I noticed that when paddlepaddle-gpu and paddlepaddle are installed together, the GPU check returns False regardless of the Docker image. To be safe, I included only the missing dependencies (i.e. on the clean Docker image I installed paddlepaddle-gpu, ran pip freeze, and cross-checked the output to find everything that was missing from the original requirements.txt), and pinned paddleocr==2.7.0.2 because of #16, as I hit the same issue with the latest paddleocr.
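To check whether an environment is in the state described above, something like the following sketch could be used. It lists which of the two Paddle distributions are installed, using the standard importlib.metadata module (Python 3.8+ is assumed); the function names are hypothetical:

```python
from importlib import metadata


def installed_paddle_builds(names=None):
    """Return which of the two Paddle distributions are present.

    `names` may be an iterable of distribution names; by default the
    current environment is scanned.
    """
    if names is None:
        names = [dist.metadata["Name"] for dist in metadata.distributions()]
    # Normalize case and underscores so "PaddlePaddle_GPU" is recognized too.
    normalized = {str(n).lower().replace("_", "-") for n in names if n}
    return sorted(normalized & {"paddlepaddle", "paddlepaddle-gpu"})


def has_conflict(builds):
    """True when both the CPU and the GPU distribution are installed."""
    return len(builds) == 2


if __name__ == "__main__":
    builds = installed_paddle_builds()
    print("Paddle distributions found:", builds)
    if has_conflict(builds):
        print("Both CPU and GPU builds are installed; the GPU check may return False.")
```

If both names show up, that would be consistent with the observation above, though it does not prove the cause.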

Note-bis:

Here is the code I used (taken from the Colab example) to test videocr:

from videocr import save_subtitles_to_file

#@title OCR parameters
input_file_path = "/home/test.mp4" 
output_file_path = "/home/out.srt" 
language_code = "ch" 
use_gpu = True 
start_time = "0:00" 
end_time = "" 
confidence_threshold = 75 
similarity_threshold = 80 
frames_to_skip = 0 
crop_x = None 
crop_y = None 
crop_width = None 
crop_height = None 

save_subtitles_to_file(input_file_path, output_file_path, lang=language_code,
                       time_start=start_time, time_end=end_time,
                       conf_threshold=confidence_threshold, sim_threshold=similarity_threshold,
                       use_gpu=use_gpu,
                       frames_to_skip=frames_to_skip,
                       crop_x=crop_x, crop_y=crop_y, crop_width=crop_width, crop_height=crop_height)

Note-last:

I wanted to thank you for this program, as it helped me a lot. I'm sharing my experience because I lost a good few hours on this, but it now seems to be fixed.

Jul 01 '24 23:07 jalpianissimo
