EasyOCR memory leak
Every time I call the reader.readtext function, memory usage increases on both GPU and CPU. I need some input from anyone who has seen this.
Once in a while you can call gc.collect() for CPU memory cleanup and torch.cuda.empty_cache() for GPU memory cleanup.
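Something along these lines, where the file names are placeholders and reader is a plain easyocr.Reader:

```python
# Minimal sketch of the periodic cleanup suggested above. `image_paths` is a
# placeholder list; `reader` is an ordinary easyocr.Reader instance.
import gc

import easyocr
import torch

reader = easyocr.Reader(['en'])
image_paths = ["page1.png", "page2.png"]  # hypothetical input files

for path in image_paths:
    results = reader.readtext(path)
    # ... consume `results` here ...
    gc.collect()                  # release unreferenced Python objects (CPU)
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # hand cached CUDA blocks back to the driver (GPU)
```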
We tried with both the commands but this is not working...In the GPU also memory leak will happen? Any other methods do you suggest like downgrading versions or any other alternatives
Check to make sure you're not using different-sized images as input. Since the model is fully convolutional, it tries to allocate space for as large an input as possible. Look at the image where the memory leak occurs and make sure it's not too large.
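Something like this, for example; the 2560 px cap just mirrors EasyOCR's default canvas_size and the helper name is made up:

```python
# Sketch of one way to bound the detector's working size: shrink any image
# whose longest side exceeds a cap before handing it to readtext. The 2560 px
# cap mirrors EasyOCR's default canvas_size; the helper name is illustrative.
import cv2
import easyocr

reader = easyocr.Reader(['en'])

def read_capped(path, max_side=2560):
    img = cv2.imread(path)
    h, w = img.shape[:2]
    scale = max_side / max(h, w)
    if scale < 1.0:  # only ever shrink, never upscale
        img = cv2.resize(img, (int(w * scale), int(h * scale)),
                         interpolation=cv2.INTER_AREA)
    return reader.readtext(img)  # readtext accepts numpy arrays directly
```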
The image is not too large. I tried uploading the same image multiple times and memory still increases in that case. I have also checked the memory footprint: the reader.readtext() function takes up more memory on each call.
I observed the same behavior, i.e. CPU memory keeps increasing when calling reader.readtext. Adding gc.collect() does not help. I have not checked the GPU so far.
Any advice on this issue? I already tried gc.collect() with no luck. CPU only, never tried GPU.
Hi, any news on this? This is a very big issue. I observe the memory leak on CPU. Even when calling gc.collect() before every call to reader.readtext, after a while you go OOM. Restarting the script is not an option if you are deploying an API. There must be another way to free the memory @rkcosmos.
Observing this issue on CPU as well. Memory keeps increasing until it reaches the limit in k8s and the service gets killed.
I’m facing the same problem. Any updates on this?
I tried to find the memory leak by using memory_profiler in my Python program:
```
Line # Mem usage Increment Occurrences Line Contents
=============================================================
308 962.8 MiB 962.8 MiB 1 @profile
309 def detect(self, img, min_size = 20, text_threshold = 0.7, low_text = 0.4,\
310 link_threshold = 0.4,canvas_size = 2560, mag_ratio = 1.,\
311 slope_ths = 0.1, ycenter_ths = 0.5, height_ths = 0.5,\
312 width_ths = 0.5, add_margin = 0.1, reformat=True, optimal_num_chars=None,
313 threshold = 0.2, bbox_min_score = 0.2, bbox_min_size = 3, max_candidates = 0,
314 ):
315
316 962.8 MiB 0.0 MiB 1 if reformat:
317 img, img_cv_grey = reformat_input(img)
318
319 2265.1 MiB 1302.3 MiB 2 text_box_list = self.get_textbox(self.detector,
320 962.8 MiB 0.0 MiB 1 img,
321 962.8 MiB 0.0 MiB 1 canvas_size = canvas_size,
322 962.8 MiB 0.0 MiB 1 mag_ratio = mag_ratio,
323 962.8 MiB 0.0 MiB 1 text_threshold = text_threshold,
324 962.8 MiB 0.0 MiB 1 link_threshold = link_threshold,
325 962.8 MiB 0.0 MiB 1 low_text = low_text,
326 962.8 MiB 0.0 MiB 1 poly = False,
327 962.8 MiB 0.0 MiB 1 device = self.device,
328 962.8 MiB 0.0 MiB 1 optimal_num_chars = optimal_num_chars,
329 962.8 MiB 0.0 MiB 1 threshold = threshold,
330 962.8 MiB 0.0 MiB 1 bbox_min_score = bbox_min_score,
331 962.8 MiB 0.0 MiB 1 bbox_min_size = bbox_min_size,
332 962.8 MiB 0.0 MiB 1 max_candidates = max_candidates,
333 )
334
335 2265.1 MiB 0.0 MiB 1 horizontal_list_agg, free_list_agg = [], []
336 2265.1 MiB 0.0 MiB 2 for text_box in text_box_list:
337 2265.1 MiB 0.0 MiB 2 horizontal_list, free_list = group_text_box(text_box, slope_ths,
338 2265.1 MiB 0.0 MiB 1 ycenter_ths, height_ths,
339 2265.1 MiB 0.0 MiB 1 width_ths, add_margin,
340 2265.1 MiB 0.0 MiB 1 (optimal_num_chars is None))
341 2265.1 MiB 0.0 MiB 1 if min_size:
342 2265.1 MiB 0.0 MiB 21 horizontal_list = [i for i in horizontal_list if max(
343 2265.1 MiB 0.0 MiB 12 i[1] - i[0], i[3] - i[2]) > min_size]
344 2265.1 MiB 0.0 MiB 3 free_list = [i for i in free_list if max(
345 diff([c[0] for c in i]), diff([c[1] for c in i])) > min_size]
346 2265.1 MiB 0.0 MiB 1 horizontal_list_agg.append(horizontal_list)
347 2265.1 MiB 0.0 MiB 1 free_list_agg.append(free_list)
348
349 2265.1 MiB 0.0 MiB 1 return horizontal_list_agg, free_list_agg
```
When calling .readtext():
```
Line # Mem usage Increment Occurrences Line Contents
=============================================================
20 949.0 MiB 949.0 MiB 1 @profile
21 def get_image(sqs_body: SQSSuggestionBase):
22 950.1 MiB 1.1 MiB 1 response = requests.get(sqs_body['previewUrl'])
23 950.1 MiB 0.0 MiB 1 if response:
24 950.1 MiB 0.0 MiB 1 if response.status_code == 200:
25 950.2 MiB 0.2 MiB 1 img = read_image(response.content)
26 2273.4 MiB 1323.2 MiB 1 textInImage = tiiModel.readtext(img, detail=1, paragraph=False, blocklist='=-+_()/!"$%&?`^§[]') #0,45,315, , text_threshold=0.60, width_ths=0.7,threshold=0.60,link_threshold=0.60) # , allowlist = '0123456789' , batch_size=5, decoder='greedy' ,min_size=50
27 #imgH, imgW, channels = img.shape
29 2273.4 MiB 0.0 MiB 1 imgW, imgH = img.size
...
```
and it keeps increasing every time the method is called.
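For reference, this kind of measurement is essentially just decorating the function that calls readtext with memory_profiler's @profile and running the script; a minimal sketch with placeholder file names:

```python
# Minimal reproduction of the measurement above: decorate the function that
# calls readtext and run the script normally; memory_profiler prints the
# line-by-line report on exit. File name and loop count are placeholders.
import easyocr
from memory_profiler import profile

reader = easyocr.Reader(['en'], gpu=False)

@profile
def run_once(path="sample.png"):
    return reader.readtext(path, detail=1, paragraph=False)

if __name__ == "__main__":
    for _ in range(10):  # repeated calls make the growth visible
        run_once()
```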
Is there any update on this? Even after extensive profiling, it's hard to pinpoint the exact line where the issue is happening. This is on v1.6.2.
Since tools like tracemalloc and objgraph are not able to show any leaks, it's possible this is happening on the C++ side of things called through the PyTorch or NumPy wrappers.
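For context, the tracemalloc check mentioned here boils down to comparing snapshots taken around repeated readtext calls, roughly like this sketch (the input image is a placeholder):

```python
# Sketch of the kind of tracemalloc check referred to above: compare snapshots
# taken around repeated readtext calls. If process RSS keeps growing while this
# diff stays flat, the allocations are being made outside the Python heap.
import tracemalloc

import easyocr

reader = easyocr.Reader(['en'], gpu=False)

tracemalloc.start()
before = tracemalloc.take_snapshot()
for _ in range(5):
    reader.readtext("sample.png")  # hypothetical input image
after = tracemalloc.take_snapshot()

for stat in after.compare_to(before, 'lineno')[:10]:
    print(stat)
```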
We can confirm this problem. In our case each call to "readtext" leaks 33 MB.
In the end, the only way I found to solve the problem is to do everything in a separate multiprocessing process. It is not a very clean solution and it adds a little overhead, but no other solution was available.
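Roughly like this, with made-up function names; the point is that the OS reclaims everything when the worker process exits:

```python
# Rough sketch of the subprocess workaround: run each OCR call in a short-lived
# worker so whatever memory readtext accumulates is reclaimed by the OS when
# the process exits. Function names and per-call granularity are illustrative;
# in practice you would batch images per worker to amortize the model load time.
import multiprocessing as mp

def _ocr_worker(path, queue):
    import easyocr                      # import inside the child process
    reader = easyocr.Reader(['en'], gpu=False)
    queue.put(reader.readtext(path))

def readtext_in_subprocess(path):
    ctx = mp.get_context("spawn")       # avoid inheriting torch/CUDA state
    queue = ctx.Queue()
    proc = ctx.Process(target=_ocr_worker, args=(path, queue))
    proc.start()
    result = queue.get()                # fetch before join to avoid deadlock
    proc.join()
    return result

if __name__ == "__main__":
    print(readtext_in_subprocess("sample.png"))  # hypothetical input image
```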
For those like me who are only using a CPU and facing this issue, make sure you add the gpu=False flag. The memory leak was fixed for me after adding it.
reader = easyocr.Reader(['en'], gpu=False)