ScanSSD
How can I test my own image to detect formulas?
I am trying to test my own image, but I don't know how to do it. I tried this command:
python3 test.py --dataset_root ./ --trained_model AMATH512_e1GTDB.pth --visual_threshold 0.25 --cuda True --exp_name test_MATH512 --test_data ./ --suffix "_512" --model_type 512 --cfg hboxes512 --padding 0 2 --kernel 1 5 --batch_size 8
If I want to test a single image, what should I put in dataset_root and test_data?
Maybe this could be helpful. It runs on CPU only, though:
https://github.com/jjdredd/ScanSSD/blob/lowmem/detect.py
@jjdredd thanks a lot!!! The test succeeded, but I found that some code needs to be modified:
b[0] = int(o_box[0] * images.shape[0]) ==> b[0] = int(o_box[0] * images.shape[1])
b[1] = int(o_box[1] * images.shape[1]) ==> b[1] = int(o_box[1] * images.shape[0])
b[2] = int(o_box[2] * images.shape[0]) ==> b[2] = int(o_box[2] * images.shape[1])
b[3] = int(o_box[3] * images.shape[1]) ==> b[3] = int(o_box[3] * images.shape[0])
With these changes, draw_box draws the boxes correctly.
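For anyone wondering why the indices get swapped: a minimal sketch, assuming the image array uses the usual (height, width, channels) layout and the network returns boxes as normalized (x_min, y_min, x_max, y_max). Under those assumptions the x values have to be scaled by the width (shape[1]) and the y values by the height (shape[0]). The helper name scale_box is made up for illustration and is not part of detect.py.

```python
def scale_box(o_box, height, width):
    """Convert a normalized (x_min, y_min, x_max, y_max) box to pixel coordinates.

    Hypothetical helper: x values scale with the image width,
    y values scale with the image height.
    """
    return [
        int(o_box[0] * width),   # x_min
        int(o_box[1] * height),  # y_min
        int(o_box[2] * width),   # x_max
        int(o_box[3] * height),  # y_max
    ]


# Example page that is 1000 px tall and 600 px wide, i.e. shape == (1000, 600, 3).
shape = (1000, 600, 3)
box = scale_box([0.5, 0.25, 1.0, 0.5], height=shape[0], width=shape[1])
print(box)  # [300, 250, 600, 500]
```

On a square image both versions give the same result, which is probably why the swap is easy to miss.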
In the function _img_to_tensor, I noticed that the image is being resized to (512, 512).
But the paper states that a rolling window of 512 x 512 is passed over the document image to avoid resizing. Won't resizing the image affect the results? Most document images don't have a 1:1 aspect ratio.
It's just a quick test script. Usually you would divide the image into parts, run detection on each subimage, and stitch the bounding boxes together, IIRC.
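A rough sketch of that idea, not the code from the paper or from detect.py: slide a 512 x 512 window over the page, run a detector on each window, and shift the resulting boxes by the window's origin so they land in full-image coordinates. The names tile_origins, detect_tiled and the detect_fn callback are all hypothetical, and the callback is assumed to return pixel boxes relative to its window.

```python
def tile_origins(length, tile=512, stride=512):
    """Start offsets of windows covering [0, length).

    The last window is shifted back so it ends exactly at the image border.
    """
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)
    return origins


def detect_tiled(height, width, detect_fn, tile=512, stride=512):
    """Run detect_fn on every window and map its boxes to full-image coordinates.

    detect_fn(x0, y0, x1, y1) is a hypothetical callback that returns a list of
    (x_min, y_min, x_max, y_max, score) tuples in pixels relative to the window.
    """
    results = []
    for y0 in tile_origins(height, tile, stride):
        for x0 in tile_origins(width, tile, stride):
            x1, y1 = min(x0 + tile, width), min(y0 + tile, height)
            for bx0, by0, bx1, by1, score in detect_fn(x0, y0, x1, y1):
                # Shift window-relative coordinates by the window origin.
                results.append((bx0 + x0, by0 + y0, bx1 + x0, by1 + y0, score))
    return results


# Stand-in detector that reports the same box in every window.
def fake_detector(x0, y0, x1, y1):
    return [(10, 20, 110, 60, 0.9)]


boxes = detect_tiled(600, 1000, fake_detector)
for b in boxes:
    print(b)
```

Windows that overlap can report the same formula twice, and a formula cut by a window border comes back in pieces, so a real pipeline still needs a merging step after this. Pages smaller than the window would also need padding rather than resizing.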
Could you please explain how you changed detect.py to detect formulas in your own images?
Hello! I've tried to run 'detect.py' in a conda virtual env in which I've installed the CPU version of PyTorch. It shows me the following error:
ScanSSD-master>python detect.py
Traceback (most recent call last):
File "detect.py", line 156, in
Please help me!
Hey @jjdredd, thank you so much for your code! I am trying to use it and got an error.
Python 3.8, CUDA 11.7, Torch 1.10.0+cu111, Torchvision 0.11.0, Torchaudio 0.10.0
Traceback (most recent call last):
File "detect.py", line 163, in <module>
b, s = md.DetectAny(0.2, a)
File "detect.py", line 147, in DetectAny
boxes, scores = self.Detect(thres, t)
File "detect.py", line 114, in Detect
y, debug_boxes, debug_scores = self._net(images) # forward pass
File "/home/$user/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/$user/Repos/ScanSSD/ssd/ssd.py", line 108, in forward
output, boxes, scores = self.detect(
File "/home/$user/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 261, in __call__
raise RuntimeError(
RuntimeError: Legacy autograd function with non-static forward method is deprecated.
Please use new-style autograd function with static forward method.
(Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)
Of course I have checked out the link in the error message, but I didn't know what to change. Can you or anybody else help me out?
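In case it helps, here is what the error is asking for in general terms. The traceback shows that self.detect in ssd/ssd.py is an instance of a torch.autograd.Function subclass being called like a layer, which newer PyTorch versions reject. The new style makes forward a @staticmethod that takes a ctx argument, passes everything that used to be stored on self as arguments, and is invoked through .apply without creating an instance. The class below is a toy stand-in, not ScanSSD's actual Detect class, so its name and arguments are assumptions for illustration only.

```python
import torch
from torch.autograd import Function


class Threshold(Function):
    """Toy stand-in for a post-processing step (hypothetical, not ScanSSD's Detect)."""

    @staticmethod
    def forward(ctx, scores, conf_thresh):
        # Settings that a legacy Function kept on `self` are passed as arguments.
        # Scores at or below the threshold are zeroed out.
        return scores * (scores > conf_thresh)


# Legacy call style that raises the RuntimeError on recent PyTorch:
#   layer = Threshold(0.5)
#   out = layer(scores)
# New call style: never instantiate, call .apply with all the arguments.
scores = torch.tensor([0.1, 0.6, 0.9])
out = Threshold.apply(scores, 0.5)
print(out)
```

Applied to the repo, that would presumably mean making the detection class's forward static and changing the self.detect(...) call in ssd/ssd.py to a .apply(...) call that also passes the settings previously given to the constructor. I have not checked this against the ScanSSD source, so treat it as a direction rather than a patch.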
I made this Colab for custom image detection: https://colab.research.google.com/drive/1cE_Sv4TaFONAHuQxr-_Da2PbAceNniWW
Not very elegant, but it works. It runs a rolling window over the image and maps the bounding boxes back to the complete image, so you can use images of any size.