handwriting-ocr
Rework - IMPORTANT
The project is currently undergoing a big reorganization. The new code is in the rework branch, and once this issue is closed it will be merged into master. What will be new:
- More logical structure
- Incorporating new much larger datasets
- Unifying naming and style of code
- Removing deprecated code
- Dropping support for recognition of Czech accents
This brings some breaking changes. I recommend moving to the new code, because I will no longer fix issues in the old versions.
Model retraining
With the new version, some old models may become incompatible. Also, the old models were trained only on a small dataset, so a large amount of retraining is needed. I would appreciate any help with this task, because I have only limited access to cloud computing resources.
Dropping support of Czech accents
The Czech accents will be removed from the words. Only some text files that allow the accents to be recovered will be kept. This solves some compatibility issues across different operating systems. Also, models trained on this dataset weren't very accurate. However, as a school project, I will be creating software that automatically adds Czech accents to sentences. This is only a partial solution to the problem, but I don't have enough data to recognize the accents successfully anyway.
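For illustration, here is a minimal sketch of how accents could be stripped from word labels. It assumes Python's standard unicodedata module, and strip_accents is a hypothetical helper name, not a function from this repository. The project's own dataset scripts may do this differently.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining diacritical marks removed.

    Hypothetical helper for illustration only; not part of the project.
    """
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep every other character.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose what remains into the usual NFC form.
    return unicodedata.normalize("NFC", stripped)


print(strip_accents("Příliš žluťoučký kůň"))  # prints "Prilis zlutoucky kun"
```

Stripping is one-way: "kun" cannot be mapped back to "kůň" without extra information, which is why the original accented labels would need to be kept somewhere if they are to be recovered later.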
Some updates:
- I updated the ocr package
- I am finishing the dataset section with all the scripts. It should be a big step up for the project, so please let me know if it works.
- I will continue with the rework of the notebooks
- I will try to follow this guide for updating the project: https://guide.esciencecenter.nl/
- I will also try to automate as many tasks as possible.
- Update for TensorFlow 2.0
- Follow the Black code style
Ideas for better promotion of the project (see https://guide.esciencecenter.nl/best_practices/communication.html):
- Web page
- Docker image
- Online demo
- Screencast
I am also thinking about adding tests and setting up continuous integration, for example with Travis CI.
Hi, I'm having trouble understanding the readme files. Is there any YouTube video that explains how to get the datasets and create the envs? Most of the packages are unavailable for installation.
Hi @SRK-returns,
which branch do you use? The update or master branch? I don't have any video instructions. It also depends on your OS.