CRAFT-pytorch
Text detection takes a long time to return results (around 6-8 seconds), even in GPU mode
Hi, I am using your pre-trained model craft_mlt_25k.pth for text detection. I have modified test.py for my use case so that it processes a single image per call. In CPU mode it takes 12 to 14 seconds on average to process one image (480x640), and on GPU (Google Colab) it takes around 7 seconds.
In particular, the forward pass through the CRAFT network (y, _ = net(x)) takes a long time to return the probability tensor.
Is there any way I can speed it up? Thanks in advance.
I encounter the same issue: the inference pass through the network takes 6 seconds on average (on a CPU in my case).
More alarmingly, with each image the system memory usage grows higher and higher, as if tensors are being added to the graph and never released with each call. Has anyone else encountered this issue?
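For what it's worth, a minimal sketch of the usual fix for exactly this symptom, assuming `net` is the loaded CRAFT model and `x` the preprocessed tensor as in test.py: if the forward pass is not wrapped in torch.no_grad(), autograd keeps building a graph across calls, which shows up as ever-growing memory.

```python
import torch

def run_inference(net, x):
    """Forward pass without gradient tracking.

    `net` is the loaded CRAFT model and `x` the preprocessed image tensor
    (names follow test.py in this repo). torch.no_grad() prevents autograd
    from retaining the computation graph between calls.
    """
    net.eval()
    with torch.no_grad():
        y, feature = net(x)
    # Copy the score maps to CPU right away so no GPU tensors linger.
    score_text = y[0, :, :, 0].cpu().numpy()
    score_link = y[0, :, :, 1].cpu().numpy()
    return score_text, score_link
```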
The model used at the link below seems faster; I think it is different from the one used here. It would be great if the provided pre-trained model craft_mlt_25k.pth were as fast as that one.
https://demo.ocr.clova.ai
I have faced the same issue: memory consumption increases after each image. Any solution yet?
In CPU mode it takes a long time, but I have no idea why the inference speed is slow in GPU mode. There are many possible factors, such as GPU hardware, driver version, installed modules, and other environmental issues. Also, I was not aware of any memory leakage in our model. I'll look into the memory leakage in both CPU and GPU modes. If anyone finds the cause, please let me know.
Did you get a chance to look at why memory consumption keeps increasing during inference when you pass multiple images one by one?
@p9anand @YoungminBaek I have some insight into the memory leakage. I initially installed torch version 1.2 (the newest) and encountered the issue while running inference on CPU. The problem disappeared for me after downgrading to PyTorch 0.4.1 (as listed in this repo's requirements)! It might be a PyTorch problem or incompatibility. If you have this problem, try other versions of the library :p
@piernikowyludek: Thanks, it worked. There is no longer any memory leakage during inference.
However, there is no change in the time taken for the forward pass; it still takes a long time even after downgrading to PyTorch 0.4.1.
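As a small aid, a sketch of a version check one could drop at the top of test.py so the leak does not silently come back with a different PyTorch install (the 0.4.1 pin comes from this repo's requirements and the reports above):

```python
import torch

# The repo's requirements pin torch 0.4.1; newer versions were reported in
# this thread to leak memory during CPU inference. Fail early if another
# version is installed.
EXPECTED = "0.4.1"
if not torch.__version__.startswith(EXPECTED):
    raise RuntimeError(
        "Expected torch %s (see requirements.txt), found %s"
        % (EXPECTED, torch.__version__)
    )
```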
@YoungminBaek Hi, were you able to look into why text detection takes so long in both CPU and GPU modes?
I tested it with an 800x800 image; each image takes about 0.8 s. I traced the time to the line y[0,:,:,1].cpu().data.numpy() in test.py. Is there any way to reduce the long test time? Thank you very much.
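One caveat when profiling on GPU (a sketch, with `net` and `x` named as in test.py): CUDA kernels run asynchronously, so the cost of the whole forward pass is only paid when .cpu() forces the copy back to the host, which makes that single line look like the bottleneck. Timing with explicit synchronization separates the two:

```python
import time
import torch

def timed_forward(net, x, use_cuda=True):
    """Measure the forward pass separately from the device-to-host copy."""
    if use_cuda:
        torch.cuda.synchronize()          # finish any pending GPU work
    start = time.time()
    with torch.no_grad():
        y, _ = net(x)
    if use_cuda:
        torch.cuda.synchronize()          # wait for the forward pass itself
    forward_time = time.time() - start

    start = time.time()
    score_link = y[0, :, :, 1].cpu().data.numpy()
    copy_time = time.time() - start
    return score_link, forward_time, copy_time
```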
@kalai2033 Can you send me the model used in the online demo you linked? Looking forward to your reply, thank you.
@YoungminBaek I downgraded PyTorch to 0.4.1, but I still have the memory leak (in CPU mode). When I process the same image multiple times, memory consumption seems stable, but it increases rapidly when I start running inference on different images. Also, inference takes around 8 seconds. Any progress on solving the memory leak?
I solved the problem by loading the pretrained VGG model up front. If it is not loaded, the parameters are fetched over HTTP, which is too slow for a web request.
@HanelYuki The slow-inference problem or the memory-leak problem? Which PyTorch version?
@hermanius slow inference
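If the HTTP fetch here is torchvision downloading the pretrained VGG-16-BN backbone the first time the model is constructed (an assumption about that setup, not something confirmed in the thread), a minimal sketch is to warm it up once at process startup instead of inside the request handler:

```python
import torchvision.models as models

# Build the VGG-16-BN backbone once at startup. The first call downloads the
# pretrained weights over HTTP and caches them on disk, so later constructions
# (including the one inside the CRAFT model, if it is built with
# pretrained=True) no longer hit the network during a web request.
_ = models.vgg16_bn(pretrained=True)

# Likewise, construct the CRAFT model and load craft_mlt_25k.pth once at
# startup and reuse the instance, rather than rebuilding it per request.
```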
@HanelYuki I've downloaded the None-VGG-BiLSTM-CTC.pth model and used it in place of craft_mlt_25k.pth as --trained_model when running test.py, but I get errors. Obviously I'm missing something. Can you help me out?
Loading weights from checkpoint (None-VGG-BiLSTM-CTC.pth)
Traceback (most recent call last):
File "mmi_craft.py", line 145, in
@hermanius You can run craft.py with 'predicted = true' at line 8, and then you will see some helpful errors.
@piernikowyludek @p9anand I downgraded as you stated and the memory-leak problem disappeared, but now I have a very long testing time of over 30 seconds! I can't figure out the reason. I'm on Python 3.7 with the same environment as in the requirements.txt file.
@zcswdt What are your library versions and your Python version? CPU or GPU? It takes over 30 s for me!
I tried your 'predicted' answer; it's not correct:
(sabra) D:\CRAFT-pytorch-master>py craft.py
Traceback (most recent call last):
File "craft.py", line 83, in
@HanelYuki @hermanius
Sorry, but craft_mlt_25k.pth is for detecting words in a scene, while None-VGG-BiLSTM-CTC.pth is for word recognition. They serve different functions:
https://github.com/clovaai/deep-text-recognition-benchmark
@waflessnet Thanks for the info. I use CRAFT for text detection now and Tesseract for OCR. When I have the time I will try the VGG recognizer again for OCR.
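For anyone wiring the two stages together, a rough sketch of feeding CRAFT's detections into Tesseract (the box format and helper name are assumptions based on what test.py returns; adapt as needed):

```python
import cv2
import numpy as np
import pytesseract

def recognize_boxes(image, boxes):
    """Run Tesseract OCR on each region detected by CRAFT.

    `image` is the original BGR image and `boxes` the list of polygons
    produced by the detector; each polygon is reduced to its axis-aligned
    bounding rectangle before being handed to Tesseract.
    """
    texts = []
    for poly in boxes:
        pts = np.array(poly, dtype=np.int32).reshape(-1, 2)
        x, y, w, h = cv2.boundingRect(pts)
        crop = image[y:y + h, x:x + w]
        texts.append(pytesseract.image_to_string(crop).strip())
    return texts
```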
@hermanius Did you happen to solve the problem? If so, please share any thoughts. I just started using this repo and am running into similar problems as stated above. Thanks!
I have the same issue with timing, and memory consumption keeps increasing. Does anyone have a solution for this problem?
Was anyone able to find a concrete answer to this problem? I have the exact same issue (I have a single GPU and the CRAFT detection model sometimes takes around 6-8 s or more), but based on some tests I'm running, I see that if I resize the image width to 1280 before passing it to the model, detection is fast: inference time drops to about 3 s. I'm happy with that, but unsure how reliable the approach is. Anyone up for discussion? A rough sketch of the resizing is below.
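Roughly what I do before calling the detector (a sketch; test.py exposes a similar canvas-size option, and the exact interpolation is a matter of taste):

```python
import cv2

def cap_width(image, target_width=1280):
    """Downscale so the width is at most `target_width`, keeping aspect ratio.

    Fewer input pixels means less work for the network; the trade-off is that
    very small text may no longer be detected.
    """
    h, w = image.shape[:2]
    if w <= target_width:
        return image
    scale = target_width / float(w)
    return cv2.resize(image, (target_width, int(h * scale)),
                      interpolation=cv2.INTER_LINEAR)
```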
Are you using a laptop?
No, an Amazon EC2 instance (g4dn.xlarge).
Anyway, 6-8 s, or even 3 s, per image is too much; try another (local) system.
I don't have a local machine with a GPU, and it takes 6 s on Colab as well. I really need help. Can you guide me on which machine I should use?
I ran it on these GPUs: NVIDIA GTX 1050, 980 Ti, 780 Ti, and MX130; execution time increases in that order. I suggest using a 980 Ti; the expected execution time is about 40-100 ms.