Robert Sachunsky comments

Results: 735 comments of Robert Sachunsky


output true CER for checkpoints (at least the final one)

> Should I move this issue to the `tesseract` repository?

Please do!

output true CER for checkpoints (at least the final one)

@wollmers > For avoidance of doubt: CER is usually computed based on Levenshtein distance in the narrow sense, which means with allowed edit operations of insert, delete and substitute each...
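As a rough illustration of the metric discussed in that comment, here is a minimal sketch of CER built on plain Levenshtein distance with insert, delete and substitute. Unit cost per operation, normalization by reference length, and the function names `levenshtein` and `cer` are assumptions made for this example; it is not the code used by lstmtraining or lstmeval, and it compares Python code points rather than grapheme clusters.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance with insert, delete and substitute (unit cost assumed)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into the empty string takes i deletions
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r from the reference
                curr[j - 1] + 1,         # insert h into the reference
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length (assumed)."""
    if not ref:
        # Convention chosen for this sketch: empty reference is perfect only
        # if the hypothesis is empty as well.
        return 0.0 if not hyp else float("inf")
    return levenshtein(ref, hyp) / len(ref)
```

For example, `cer("abcd", "abcf")` is one substitution over four reference characters, so 0.25.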

output true CER for checkpoints (at least the final one)

No one seems to care that their trained models are selected from suboptimal checkpoints and reported with way-too-optimistic error rates? As long as lstmtraining and lstmeval report figures the way...

output true CER for checkpoints (at least the final one)

@stweil > the current models are not so bad, they could be even better if an appropriate checkpoint was selected – which a huge waste > it is already possible...

output true CER for checkpoints (at least the final one)

> [Levenshtein language:c++ stars:>50 -license:gpl](https://github.com/search?q=Levenshtein+language%3Ac%2B%2B+stars%3A%3E50+-license%3Agpl) > > Edit: Indeed, RapidFuzz seems to be the best option. Not tagged with that topic, but https://github.com/seqan/seqan3 could be an option, too. @wollmers, agreeing...

output true CER for checkpoints (at least the final one)

> So basically we can split this into two separate tasks: > > 1. Make Tesseract use the eval set to select checkpoints and to determine when to finish training....
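The first task in that quote, using the eval set both to pick a checkpoint and to decide when to stop, might look roughly like the sketch below. It is not Tesseract code: the function `select_checkpoint`, the `patience` parameter, and the checkpoint names and CER values in the usage example are all hypothetical, and the stopping rule (stop after a fixed number of evaluations without improvement) is just one common choice.

```python
from typing import List, Tuple


def select_checkpoint(
    evals: List[Tuple[str, float]], patience: int = 2
) -> Tuple[str, float, int]:
    """Pick the checkpoint with the lowest eval-set CER, with early stopping.

    evals: (checkpoint name, CER on the eval set) pairs in training order.
    patience: number of consecutive non-improving evaluations tolerated.
    Returns (best name, best CER, number of evaluations consumed).
    """
    if not evals:
        raise ValueError("need at least one evaluated checkpoint")
    best_name, best_cer = evals[0]
    since_best = 0  # evaluations seen since the last improvement
    seen = 1
    for name, cer in evals[1:]:
        seen += 1
        if cer < best_cer:
            best_name, best_cer, since_best = name, cer, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # eval CER stopped improving: finish training here
    return best_name, best_cer, seen


# Hypothetical eval-set results, for illustration only.
history = [("ckpt_a", 0.10), ("ckpt_b", 0.07), ("ckpt_c", 0.08), ("ckpt_d", 0.09)]
best, best_cer, used = select_checkpoint(history, patience=2)
```

With these made-up numbers, `ckpt_b` is selected and training would stop after the fourth evaluation.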

output true CER for checkpoints (at least the final one)

@stweil, what you describe are external means, though. But the question raised by @amitdo was whether there might be some script solution to address the CER/BCER calculation problem from _within_...

Arrow color to text color?

Note: you can set the color of the arrows according to their respective labels ex post by: - keeping track of the artists added to `ax.texts` after `adjust_text`, and then...

Arrow color to text color?

@edlemus what do you mean `adjust_text` does not have an attribute? It's just a function on `matplotlib.text.Text` instances. See https://github.com/i008/COCO-dataset-explorer/blob/df82a833163d7acc131eefb5264684cb2cc627b5/vis.py#L116-L119 for example.

Can I load an image from bytes?

That cannot work. Tesseract's image data structure `pix` (from Leptonica) needs to _know_ what format the input is in. Either it's a full byte stream of some standard image format (recognizable...
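To illustrate what a "recognizable" byte stream could mean: encoded image files usually begin with a fixed signature, while bare pixel data carries no such header. The sketch below checks a few well-known signatures. It is not Tesseract or Leptonica code, the function name `sniff_image_format` is hypothetical, and the signature list is deliberately incomplete.

```python
from typing import Optional

# Leading "magic" bytes of a few common image formats.
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
]


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return a format name if data starts with a known signature, else None."""
    for magic, name in _SIGNATURES:
        if data.startswith(magic):
            return name
    # No known header: possibly raw pixels, whose width, height and depth
    # would have to be supplied separately.
    return None
```

A buffer for which this returns `None` may still hold valid pixels, but nothing in the bytes themselves says how to interpret them.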