CRAFT-pytorch
Text detection takes a long time to return results (around 6-8 seconds), even in GPU mode
Hi, I am using your pre-trained model craft_mlt_25k.pth for text detection. I have modified test.py for my use case so that it processes a single image per call. In CPU mode it takes 12 to 14 seconds on average to process one image (480x640), and on GPU (Google Colab) it takes around 7 seconds.
In particular, the forward pass through the CRAFT network (y, _ = net(x)) takes a long time to return the probability tensor.
Is there any way I can speed it up? Thanks in advance.
I encounter the same issue: the inference pass through the network takes 6 seconds on average (on a CPU in my case).
More alarmingly, with each image the system memory usage grows higher and higher, as if tensors are being added to the graph and never released with each call. Has anyone else encountered this issue?
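For what it's worth, a minimal sketch of the usual fix for exactly this symptom, assuming `net` is the loaded CRAFT model and `x` the preprocessed tensor as in test.py: if the forward pass is not wrapped in torch.no_grad(), autograd keeps building a graph across calls, which shows up as ever-growing memory.

```python
import torch

def run_inference(net, x):
    """Forward pass without gradient tracking.

    `net` is the loaded CRAFT model and `x` the preprocessed image tensor
    (names follow test.py in this repo). torch.no_grad() prevents autograd
    from retaining the computation graph between calls.
    """
    net.eval()
    with torch.no_grad():
        y, feature = net(x)
    # Copy the score maps to CPU right away so no GPU tensors linger.
    score_text = y[0, :, :, 0].cpu().numpy()
    score_link = y[0, :, :, 1].cpu().numpy()
    return score_text, score_link
```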
The model used at the link below seems faster; I think it is different from the one used here. It would be great if the provided pre-trained model craft_mlt_25k.pth were as fast as that one.
https://demo.ocr.clova.ai
I have faced the same issue: memory consumption increases after each image. Any solution yet?
In CPU mode it takes a long time, but I have no idea why the inference speed is slow in GPU mode. There are many possible factors, such as GPU hardware, driver version, installed modules, and other environmental issues. Also, I was not aware of any memory leakage in our model. I'll look into the memory leakage in both CPU and GPU modes. If anyone finds the cause, please let me know.
Did you get a chance to look at why memory consumption keeps increasing during inference when you pass multiple images one by one?
@p9anand @YoungminBaek I have some insight into the memory leakage. I initially installed torch version 1.2 (the newest) and encountered the issue while running inference on CPU. The problem disappeared for me after downgrading to PyTorch 0.4.1 (as listed in this repo's requirements)! It might be a PyTorch problem or incompatibility. If you have this problem, try other versions of the library :p
@piernikowyludek: Thanks, it worked. There is no longer any memory leakage during inference.
However, there is no change in the time taken for the forward pass; it still takes a long time even after downgrading to PyTorch 0.4.1.
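As a small aid, a sketch of a version check one could drop at the top of test.py so the leak does not silently come back with a different PyTorch install (the 0.4.1 pin comes from this repo's requirements and the reports above):

```python
import torch

# The repo's requirements pin torch 0.4.1; newer versions were reported in
# this thread to leak memory during CPU inference. Fail early if another
# version is installed.
EXPECTED = "0.4.1"
if not torch.__version__.startswith(EXPECTED):
    raise RuntimeError(
        "Expected torch %s (see requirements.txt), found %s"
        % (EXPECTED, torch.__version__)
    )
```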
@YoungminBaek Hi, were you able to look into why text detection takes so long in both CPU and GPU modes?
I tested it with an 800x800 image; each image takes about 0.8 s. I traced the time to the line y[0,:,:,1].cpu().data.numpy() in test.py. Is there any way to reduce the long test time? Thank you very much.
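One caveat when profiling on GPU (a sketch, with `net` and `x` named as in test.py): CUDA kernels run asynchronously, so the cost of the whole forward pass is only paid when .cpu() forces the copy back to the host, which makes that single line look like the bottleneck. Timing with explicit synchronization separates the two:

```python
import time
import torch

def timed_forward(net, x, use_cuda=True):
    """Measure the forward pass separately from the device-to-host copy."""
    if use_cuda:
        torch.cuda.synchronize()          # finish any pending GPU work
    start = time.time()
    with torch.no_grad():
        y, _ = net(x)
    if use_cuda:
        torch.cuda.synchronize()          # wait for the forward pass itself
    forward_time = time.time() - start

    start = time.time()
    score_link = y[0, :, :, 1].cpu().data.numpy()
    copy_time = time.time() - start
    return score_link, forward_time, copy_time
```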
@kalai2033 Can you send me the model used in the online demo you linked? Looking forward to your reply, thank you.
@YoungminBaek I downgraded PyTorch to 0.4.1, but I still have the memory leak (in CPU mode). When I process the same image multiple times, memory consumption seems stable, but it increases rapidly when I start running inference on different images. Also, inference takes around 8 seconds. Any progress on solving the memory leak?
I solved the problem by loading the pretrained VGG model up front. If it is not loaded, the parameters are fetched over HTTP, which is too slow for a web request.
@HanelYuki The slow-inference problem or the memory-leak problem? Which PyTorch version?
@hermanius slow inference
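If the HTTP fetch here is torchvision downloading the pretrained VGG-16-BN backbone the first time the model is constructed (an assumption about that setup, not something confirmed in the thread), a minimal sketch is to warm it up once at process startup instead of inside the request handler:

```python
import torchvision.models as models

# Build the VGG-16-BN backbone once at startup. The first call downloads the
# pretrained weights over HTTP and caches them on disk, so later constructions
# (including the one inside the CRAFT model, if it is built with
# pretrained=True) no longer hit the network during a web request.
_ = models.vgg16_bn(pretrained=True)

# Likewise, construct the CRAFT model and load craft_mlt_25k.pth once at
# startup and reuse the instance, rather than rebuilding it per request.
```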
@HanelYuki I've downloaded the None-VGG-BiLSTM-CTC.pth model and used it in place of craft_mlt_25k.pth as --trained_model when running test.py, but I get errors. Obviously I'm missing something. Can you help me out?
Loading weights from checkpoint (None-VGG-BiLSTM-CTC.pth)
Traceback (most recent call last):
File "mmi_craft.py", line 145, in
@hermanius You can run craft.py with 'predicted = true' at line 8, and then you will see some helpful errors.
@piernikowyludek @p9anand I downgraded as you stated and the memory-leak problem disappeared, but now I have a very long testing time of over 30 seconds! I can't figure out the reason. I'm on Python 3.7 with the same environment as in the requirements.txt file.
@zcswdt What are your library versions and your Python version? CPU or GPU? It takes over 30 s for me!
I tried your 'predicted' answer; it's not correct:
(sabra) D:\CRAFT-pytorch-master>py craft.py
Traceback (most recent call last):
File "craft.py", line 83, in
@HanelYuki @hermanius
Sorry, but craft_mlt_25k.pth is for detecting words in a scene, while None-VGG-BiLSTM-CTC.pth is for word recognition. They serve different functions:
https://github.com/clovaai/deep-text-recognition-benchmark
@waflessnet Thanks for the info. I use CRAFT for text detection now and Tesseract for OCR. When I have the time I will try the VGG recognizer again for OCR.
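For anyone wiring the two stages together, a rough sketch of feeding CRAFT's detections into Tesseract (the box format and helper name are assumptions based on what test.py returns; adapt as needed):

```python
import cv2
import numpy as np
import pytesseract

def recognize_boxes(image, boxes):
    """Run Tesseract OCR on each region detected by CRAFT.

    `image` is the original BGR image and `boxes` the list of polygons
    produced by the detector; each polygon is reduced to its axis-aligned
    bounding rectangle before being handed to Tesseract.
    """
    texts = []
    for poly in boxes:
        pts = np.array(poly, dtype=np.int32).reshape(-1, 2)
        x, y, w, h = cv2.boundingRect(pts)
        crop = image[y:y + h, x:x + w]
        texts.append(pytesseract.image_to_string(crop).strip())
    return texts
```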
@hermanius Did you happen to solve the problem? If so, please share any thoughts. I just started using this repo and am running into similar problems as stated above. Thanks!
I have the same issue with timing, and memory consumption keeps increasing. Does anyone have a solution for this problem?
Was anyone able to find a concrete answer to this problem? I have the exact same issue (I have a single GPU and the CRAFT detection model sometimes takes around 6-8 s or more), but based on some tests I'm running, I see that if I resize the image width to 1280 before passing it to the model, detection is fast: inference time drops to about 3 s. I'm happy with that, but unsure how reliable the approach is. Anyone up for discussion? A rough sketch of the resizing is below.
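Roughly what I do before calling the detector (a sketch; test.py exposes a similar canvas-size option, and the exact interpolation is a matter of taste):

```python
import cv2

def cap_width(image, target_width=1280):
    """Downscale so the width is at most `target_width`, keeping aspect ratio.

    Fewer input pixels means less work for the network; the trade-off is that
    very small text may no longer be detected.
    """
    h, w = image.shape[:2]
    if w <= target_width:
        return image
    scale = target_width / float(w)
    return cv2.resize(image, (target_width, int(h * scale)),
                      interpolation=cv2.INTER_LINEAR)
```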
Are you using a laptop?
No, an Amazon EC2 instance (g4dn.xlarge).
Anyway, 6-8 s, or even 3 s, per image is too much; try another (local) system.
I don't have a local machine with a GPU, and it takes 6 s on Colab as well. I really need help. Can you guide me on which machine I should use?
I ran it on these GPUs: NVIDIA GTX 1050, 980 Ti, 780 Ti, and MX130; execution time increases in that order. I suggest using a 980 Ti; the expected execution time is about 40-100 ms.