simple-ocr-opencv
Example program (example.py) crashing at first run under Python 3.9 on Windows
C:\users\david\simple-ocr-opencv\simpleocr\__init__.py:28: SyntaxWarning: "is" with a literal. Did you mean "=="?
elif sys.platform is "win32":
showing after BlurProcessor (waiting for input)
showing ContourSegmenter contours (waiting for input)
showing image after segmentation by RawContourSegmenter (waiting for input)
showing segments filtered by LargeFilter (waiting for input)
showing segments filtered by SmallFilter (waiting for input)
showing segments filtered by LargeAreaFilter (waiting for input)
showing segments filtered by ContainedFilter (waiting for input)
Traceback (most recent call last):
File "example.py", line 15, in <module>
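The SyntaxWarning in the log above is separate from the crash: `is` checks object identity, and Python 3.8 and later warn when it is used with a literal. A minimal sketch of the intended comparison follows. `platform_name` is a hypothetical helper used only for illustration, not a function from simpleocr.

```python
import sys


def platform_name(platform=sys.platform):
    """Return a coarse platform label (hypothetical helper for illustration)."""
    # `==` compares string values; `is` would compare object identity and
    # only works by accident when the interpreter happens to reuse the string.
    if platform.startswith("linux"):
        return "linux"
    elif platform == "win32":
        return "windows"
    return "other"
```

Replacing `is "win32"` with `== "win32"` in the flagged line should silence the warning without changing behavior.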
Same here. Were you able to solve it?
Hello goncallop,
I managed to repair the cv2.line problem with PR #39. Nevertheless, the program runs into more problems under Python 3.9. The next one is in ocr.py, line 30:
TypeError: only integer scalar arrays can be converted to a scalar index
  File "/home/rupert/Software/OpenCV/simple-ocr-opencv/simpleocr/ocr.py", line 30, in reconstruct_chars
    result_string = "".join(map(unichr, classes))
  File "/home/rupert/Software/OpenCV/simple-ocr-opencv/simpleocr/ocr.py", line 76, in ocr
    chars = reconstruct_chars(classes)
  File "/home/rupert/Software/OpenCV/simple-ocr-opencv/example.py", line 15, in <module>
    test_chars, test_classes, test_segments = ocr.ocr(test_image, show_steps=True)
It is pretty similar in example_grounding.py:
TypeError: only integer scalar arrays can be converted to a scalar index
  File "/home/rupert/Software/OpenCV/simple-ocr-opencv/simpleocr/classification.py", line 31, in classes_from_numpy
    classes = list(map(unichr, classes))
  File "/home/rupert/Software/OpenCV/simple-ocr-opencv/simpleocr/grounding.py", line 69, in ground
    classes = classes_from_numpy(imagefile.ground.classes)
  File "/home/rupert/Software/OpenCV/simple-ocr-opencv/example_grounding.py", line 10, in <module>
    grounder.ground(new_image, segments)
Probably every location that calls map(unichr, classes) is affected. Do you see a solution for these?
Best regards, Rupert
It's been a while, so I don't exactly remember what the numpy array types/shapes of "classes" are, but ultimately you need to convert it to a string or list of characters.
The code is still in Python 2. unichr is just chr in Python 3.
The array conversion problem might have several causes. print(type(classes)); print(classes.shape); print(classes.dtype) will give you valuable debug information.
I wonder if doing a classes.reshape(-1) might help.
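The two suggestions above (use chr instead of unichr, and flatten the array) can be combined in a rough sketch like the one below. It assumes that classes holds Unicode code points, possibly stored as floats and possibly shaped (N, 1) rather than (N,). That assumption has not been checked against the actual classifier output, so the debug prints are still worth running first.

```python
import numpy


def reconstruct_chars(classes):
    """Turn an array of character codes into a string (Python 3 sketch)."""
    # Assumption: `classes` contains code points, maybe as floats and
    # maybe with shape (N, 1). Flatten to 1-D and cast to int first.
    codes = numpy.asarray(classes).reshape(-1).astype(int)
    # Python 3 has no unichr(); chr() covers the full Unicode range.
    return "".join(chr(int(code)) for code in codes)
```

Calling chr on a float array element fails with the "only integer scalar arrays" TypeError, which is why the cast to int comes before the conversion.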
Thank you for picking this up. If you have time to contribute back your changes that would definitely be appreciated by a lot of people that are still downloading this despite its age :)
Hi everyone, you can change the code in ocr.py into this:
    def reconstruct_chars(classes):
        classes = numpy.asarray(classes, dtype=int)
        # 48 is ord("0"): this maps ASCII digit codes to digit values (digits only)
        classes = classes - 48
        result_string = ""
        for cls in classes:
            result_string += str(cls)
        return result_string
And this is my output:
showing after BlurProcessor (waiting for input)
showing ContourSegmenter contours (waiting for input)
showing image after segmentation by RawContourSegmenter (waiting for input)
showing segments filtered by LargeFilter (waiting for input)
showing segments filtered by SmallFilter (waiting for input)
showing segments filtered by LargeAreaFilter (waiting for input)
showing segments filtered by ContainedFilter (waiting for input)
showing line starts and ends (waiting for input)
showing segments filtered by NearLineFilter (waiting for input)
accuracy: 1.0
OCRed text:
[3][1][4][1][5][9][2][6][5][3][5][8][9][7][9][3][2][3][8][4][6][2][6][4][3][3][8][3][2][7][9][5][0][2][8][8][4][1][9][7][1][6][9][3][9][9][3][7][5][1][0][5][8][2][0][9][7][4][9][4][4][5][9][2][3][0][7][8][1][6][4][0][6][2][8][6][2][0][8][9][9][8][6][2][8][0][3][4][8][2][5][3][4][2][1][1][7][0][6][7][9][8][2][1][4][8][0][8][6][5][1][3][2][8][2][3][0][6][6][4][7][0][9][3][8][4][4][6][0][9][5][5][0][5][8][2][2][3][1][7][2][5][3][5][9][4][0][8][1][2][8][4][8][1][1][1][7][4][5][0][2][8][4][1][0][2][7][0][1][9][3][8][5][2][1][1][0][5][5][5][9][6][4][4][6][2][2][9][4][8][9][5][4][9][3][0][3][8][1][9][6][4][4][2][8][8][1][0][9][7][5][6][6][5][9][3][3][4][4][6][1][2][8][4][7]
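The brackets around each digit in this output are consistent with classes having shape (N, 1): iterating over such an array yields one-element rows, and str() of a one-element array prints as [3]. The small demonstration below assumes that shape and uses made-up input codes for "314". It is an illustration of the effect, not the library's actual data.

```python
import numpy

# Assumed shape (N, 1): one row per recognized character, ASCII codes for "314".
classes = numpy.array([[51], [49], [52]])

# ord("0") == 48, so subtracting 48 maps digit codes to digit values.
digits = classes - 48

# Iterating a 2-D array yields 1-element rows, so str() keeps the brackets.
bracketed = "".join(str(row) for row in digits)

# Flattening first yields plain scalars, so the brackets disappear.
plain = "".join(str(value) for value in digits.reshape(-1))
```

Note that subtracting 48 only gives sensible results for the digits 0 to 9; letters would need chr() on the original codes instead.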
Hi @zhixuanli,
> you can change the code in ocr.py
Thank you! It works for me too, but I don't know why: my knowledge of Python, CV and OCR is far too limited...
I came to this program because I want to read out photos like these:
As far as I understand the process, I need to train and "ground" the picture. Evidently test_grounding.py calls more or different procedures than example.py, because it throws an error (only the first?!?) at line 31 in classification.py. Can you look into this issue as well and try to fix it?
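Since the traceback for classification.py shows the same map(unichr, classes) pattern, the same kind of change may work there. The sketch below is a guess at a Python 3 replacement for the line that fails. It assumes the same code-point layout as in ocr.py, and the real classes_from_numpy may do more than what is shown here.

```python
import numpy


def classes_from_numpy(classes):
    """Convert an array of character codes to a list of 1-char strings."""
    # Assumption: same layout as in ocr.py, i.e. code points that may be
    # floats and may be shaped (N, 1). Flatten and cast before converting.
    codes = numpy.asarray(classes).reshape(-1).astype(int)
    # chr() replaces Python 2's unichr().
    return [chr(int(code)) for code in codes]
```

Unlike the digit-only workaround with the subtraction of 48, this keeps letters and other characters intact, which matters for grounding non-numeric text.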
Thanks a lot!
Rupert