EasyOCR memory leak
Every time I call the reader.readtext function, memory usage increases on both GPU and CPU. I need some input from anyone who has seen this.
Once in a while you can call gc.collect() for CPU memory cleanup and torch.cuda.empty_cache() for GPU memory cleanup.
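Something along these lines, where the file names are placeholders and reader is a plain easyocr.Reader:

```python
# Minimal sketch of the periodic cleanup suggested above. `image_paths` is a
# placeholder list; `reader` is an ordinary easyocr.Reader instance.
import gc

import easyocr
import torch

reader = easyocr.Reader(['en'])
image_paths = ["page1.png", "page2.png"]  # hypothetical input files

for path in image_paths:
    results = reader.readtext(path)
    # ... consume `results` here ...
    gc.collect()                  # release unreferenced Python objects (CPU)
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # hand cached CUDA blocks back to the driver (GPU)
```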
We tried with both the commands but this is not working...In the GPU also memory leak will happen? Any other methods do you suggest like downgrading versions or any other alternatives
Check to make sure you're not using different-sized images as input. Since the model is fully convolutional, it tries to allocate space for as large an input as possible. Look at the image where the memory leak occurs and make sure it's not too large.
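Something like this, for example; the 2560 px cap just mirrors EasyOCR's default canvas_size and the helper name is made up:

```python
# Sketch of one way to bound the detector's working size: shrink any image
# whose longest side exceeds a cap before handing it to readtext. The 2560 px
# cap mirrors EasyOCR's default canvas_size; the helper name is illustrative.
import cv2
import easyocr

reader = easyocr.Reader(['en'])

def read_capped(path, max_side=2560):
    img = cv2.imread(path)
    h, w = img.shape[:2]
    scale = max_side / max(h, w)
    if scale < 1.0:  # only ever shrink, never upscale
        img = cv2.resize(img, (int(w * scale), int(h * scale)),
                         interpolation=cv2.INTER_AREA)
    return reader.readtext(img)  # readtext accepts numpy arrays directly
```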
The image is not too large. I tried uploading the same image multiple times and memory still increases in that case. I have also checked the memory footprint: the reader.readtext() function takes up more memory on each call.
I observed the same behavior, i.e. CPU memory keeps increasing when calling reader.readtext. Adding gc.collect() does not help. I have not checked the GPU so far.
Any advice on this issue? I already tried gc.collect() with no luck. CPU only, never tried GPU.
Hi, any news on this? This is a very big issue. I observe the memory leak on CPU. Even when calling gc.collect() before every call to reader.readtext, after a while you go OOM. Restarting the script is not an option if you are deploying an API. There must be another way to free the memory @rkcosmos.
Observing this issue on CPU as well. Memory keeps increasing until it reaches the limit in k8s and the service gets killed.
I’m facing the same problem. Any updates on this?
I tried to find the memory leak by using memory_profiler in my Python program:
```
Line # Mem usage Increment Occurrences Line Contents
=============================================================
308 962.8 MiB 962.8 MiB 1 @profile
309 def detect(self, img, min_size = 20, text_threshold = 0.7, low_text = 0.4,\
310 link_threshold = 0.4,canvas_size = 2560, mag_ratio = 1.,\
311 slope_ths = 0.1, ycenter_ths = 0.5, height_ths = 0.5,\
312 width_ths = 0.5, add_margin = 0.1, reformat=True, optimal_num_chars=None,
313 threshold = 0.2, bbox_min_score = 0.2, bbox_min_size = 3, max_candidates = 0,
314 ):
315
316 962.8 MiB 0.0 MiB 1 if reformat:
317 img, img_cv_grey = reformat_input(img)
318
319 2265.1 MiB 1302.3 MiB 2 text_box_list = self.get_textbox(self.detector,
320 962.8 MiB 0.0 MiB 1 img,
321 962.8 MiB 0.0 MiB 1 canvas_size = canvas_size,
322 962.8 MiB 0.0 MiB 1 mag_ratio = mag_ratio,
323 962.8 MiB 0.0 MiB 1 text_threshold = text_threshold,
324 962.8 MiB 0.0 MiB 1 link_threshold = link_threshold,
325 962.8 MiB 0.0 MiB 1 low_text = low_text,
326 962.8 MiB 0.0 MiB 1 poly = False,
327 962.8 MiB 0.0 MiB 1 device = self.device,
328 962.8 MiB 0.0 MiB 1 optimal_num_chars = optimal_num_chars,
329 962.8 MiB 0.0 MiB 1 threshold = threshold,
330 962.8 MiB 0.0 MiB 1 bbox_min_score = bbox_min_score,
331 962.8 MiB 0.0 MiB 1 bbox_min_size = bbox_min_size,
332 962.8 MiB 0.0 MiB 1 max_candidates = max_candidates,
333 )
334
335 2265.1 MiB 0.0 MiB 1 horizontal_list_agg, free_list_agg = [], []
336 2265.1 MiB 0.0 MiB 2 for text_box in text_box_list:
337 2265.1 MiB 0.0 MiB 2 horizontal_list, free_list = group_text_box(text_box, slope_ths,
338 2265.1 MiB 0.0 MiB 1 ycenter_ths, height_ths,
339 2265.1 MiB 0.0 MiB 1 width_ths, add_margin,
340 2265.1 MiB 0.0 MiB 1 (optimal_num_chars is None))
341 2265.1 MiB 0.0 MiB 1 if min_size:
342 2265.1 MiB 0.0 MiB 21 horizontal_list = [i for i in horizontal_list if max(
343 2265.1 MiB 0.0 MiB 12 i[1] - i[0], i[3] - i[2]) > min_size]
344 2265.1 MiB 0.0 MiB 3 free_list = [i for i in free_list if max(
345 diff([c[0] for c in i]), diff([c[1] for c in i])) > min_size]
346 2265.1 MiB 0.0 MiB 1 horizontal_list_agg.append(horizontal_list)
347 2265.1 MiB 0.0 MiB 1 free_list_agg.append(free_list)
348
349 2265.1 MiB 0.0 MiB 1 return horizontal_list_agg, free_list_agg
```
When calling .readtext():
```
Line # Mem usage Increment Occurrences Line Contents
=============================================================
20 949.0 MiB 949.0 MiB 1 @profile
21 def get_image(sqs_body: SQSSuggestionBase):
22 950.1 MiB 1.1 MiB 1 response = requests.get(sqs_body['previewUrl'])
23 950.1 MiB 0.0 MiB 1 if response:
24 950.1 MiB 0.0 MiB 1 if response.status_code == 200:
25 950.2 MiB 0.2 MiB 1 img = read_image(response.content)
26 2273.4 MiB 1323.2 MiB 1 textInImage = tiiModel.readtext(img, detail=1, paragraph=False, blocklist='=-+_()/!"$%&?`^§[]') #0,45,315, , text_threshold=0.60, width_ths=0.7,threshold=0.60,link_threshold=0.60) # , allowlist = '0123456789' , batch_size=5, decoder='greedy' ,min_size=50
27 #imgH, imgW, channels = img.shape
29 2273.4 MiB 0.0 MiB 1 imgW, imgH = img.size
...
```
and it keeps increasing every time the method is called.
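For reference, this kind of measurement is essentially just decorating the function that calls readtext with memory_profiler's @profile and running the script; a minimal sketch with placeholder file names:

```python
# Minimal reproduction of the measurement above: decorate the function that
# calls readtext and run the script normally; memory_profiler prints the
# line-by-line report on exit. File name and loop count are placeholders.
import easyocr
from memory_profiler import profile

reader = easyocr.Reader(['en'], gpu=False)

@profile
def run_once(path="sample.png"):
    return reader.readtext(path, detail=1, paragraph=False)

if __name__ == "__main__":
    for _ in range(10):  # repeated calls make the growth visible
        run_once()
```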
Is there any update on this? Even after extensive profiling, it's hard to pinpoint the exact line where the issue is happening. This is on v1.6.2.
Since tools like tracemalloc and objgraph are not able to show any leaks, it's possible this is happening on the C++ side of things called through the PyTorch or NumPy wrappers.
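For context, the tracemalloc check mentioned here boils down to comparing snapshots taken around repeated readtext calls, roughly like this sketch (the input image is a placeholder):

```python
# Sketch of the kind of tracemalloc check referred to above: compare snapshots
# taken around repeated readtext calls. If process RSS keeps growing while this
# diff stays flat, the allocations are being made outside the Python heap.
import tracemalloc

import easyocr

reader = easyocr.Reader(['en'], gpu=False)

tracemalloc.start()
before = tracemalloc.take_snapshot()
for _ in range(5):
    reader.readtext("sample.png")  # hypothetical input image
after = tracemalloc.take_snapshot()

for stat in after.compare_to(before, 'lineno')[:10]:
    print(stat)
```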
We can confirm this problem. In our case each call to "readtext" leaks 33 MB.
In the end, the only way I found to solve the problem is to do everything in a separate multiprocessing process. It is not a very clean solution and it adds a little overhead, but no other solution was available.
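Roughly like this, with made-up function names; the point is that the OS reclaims everything when the worker process exits:

```python
# Rough sketch of the subprocess workaround: run each OCR call in a short-lived
# worker so whatever memory readtext accumulates is reclaimed by the OS when
# the process exits. Function names and per-call granularity are illustrative;
# in practice you would batch images per worker to amortize the model load time.
import multiprocessing as mp

def _ocr_worker(path, queue):
    import easyocr                      # import inside the child process
    reader = easyocr.Reader(['en'], gpu=False)
    queue.put(reader.readtext(path))

def readtext_in_subprocess(path):
    ctx = mp.get_context("spawn")       # avoid inheriting torch/CUDA state
    queue = ctx.Queue()
    proc = ctx.Process(target=_ocr_worker, args=(path, queue))
    proc.start()
    result = queue.get()                # fetch before join to avoid deadlock
    proc.join()
    return result

if __name__ == "__main__":
    print(readtext_in_subprocess("sample.png"))  # hypothetical input image
```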
For those like me who are only using a CPU and facing this issue, make sure you add the gpu=False flag. The memory leak was fixed for me after adding it.
reader = easyocr.Reader(['en'], gpu=False)