Roadmap for 2024
This issue documents what I think are the highest priorities in the short to medium term.
Models and training:
- [x] Document how to repeat the model training process (https://github.com/robertknight/ocrs-models/issues/6). A repeatable training process is, IMO, required for an ML project to really be considered open source. It is also needed to enable fine-tuning or training for new languages.
- [ ] Add benchmarks so the accuracy can be tracked over time
- [ ] Expand the training data sets for detection and recognition to improve accuracy
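As a rough illustration of what an accuracy benchmark could measure: a common metric for OCR output is the character error rate (CER), the edit distance between the recognized text and the expected text divided by the length of the expected text. The roadmap does not say which metric will be used, so this is only a sketch under that assumption, and the function names `edit_distance` and `char_error_rate` are hypothetical rather than part of ocrs.

```rust
/// Levenshtein edit distance between two strings, counted in Unicode
/// scalar values (`char`s). Hypothetical helper, not part of ocrs.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // `prev[j]` holds the distance between the prefix of `a` processed so
    // far and the first `j` chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];

    for i in 1..=a.len() {
        curr[0] = i;
        for j in 1..=b.len() {
            let substitution_cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + substitution_cost); // match or substitution
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Character error rate: edit distance divided by the length of the
/// expected text. Assumed metric; lower is better, 0.0 is a perfect match.
fn char_error_rate(expected: &str, actual: &str) -> f64 {
    let expected_len = expected.chars().count();
    if expected_len == 0 {
        // Avoid dividing by zero: any output for empty input counts as an error.
        return if actual.is_empty() { 0.0 } else { 1.0 };
    }
    edit_distance(expected, actual) as f64 / expected_len as f64
}
```

A benchmark along these lines would run the OCR pipeline over a fixed set of images with known ground-truth text, compute `char_error_rate` for each, and record the average so regressions show up between releases.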
ocrs library and CLI tool:
- [ ] Add the infrastructure to support multiple languages and model updates (https://github.com/robertknight/ocrs/issues/8, https://github.com/robertknight/ocrs/issues/4)
- [x] Add end-to-end tests that actually check the output. There is a simple end-to-end test but it only verifies that the CLI tool can be built and runs, not the actual output (https://github.com/robertknight/ocrs/pull/25)
- [x] Improve runtime performance and efficiency
Beyond the short-term list, here are some themes for subsequent work:
- Continue expanding the datasets and test cases to improve accuracy
- Use machine learning for layout analysis
- Quantize the models to 8-bit to make the downloads smaller and execution faster
- Improve WebAssembly execution performance
- Add bindings for other languages (e.g. C, Python, Node)
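For context on the 8-bit quantization item, here is a minimal sketch of affine (scale and zero-point) quantization of `f32` weights to `i8`. It is not how ocrs or its inference engine implements or plans to implement quantization; the `QuantParams` type and the function names are hypothetical. It only shows why each weight shrinks from four bytes to one, at the cost of a small rounding error.

```rust
/// Parameters mapping f32 values onto the i8 range. Hypothetical type.
struct QuantParams {
    scale: f32,
    zero_point: i32,
}

/// Choose a scale and zero point covering the value range. The range is
/// widened to include 0.0 so that zero is represented exactly.
fn quant_params(values: &[f32]) -> QuantParams {
    let min = values.iter().copied().fold(0.0f32, f32::min);
    let max = values.iter().copied().fold(0.0f32, f32::max);
    let range = max - min;
    // Fall back to a scale of 1.0 when all values are zero.
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let zero_point = ((-128.0 - min / scale).round() as i32).clamp(-128, 127);
    QuantParams { scale, zero_point }
}

/// Quantize: q = clamp(round(v / scale) + zero_point, -128, 127).
fn quantize(values: &[f32], p: &QuantParams) -> Vec<i8> {
    values
        .iter()
        .map(|&v| ((v / p.scale).round() as i32 + p.zero_point).clamp(-128, 127) as i8)
        .collect()
}

/// Dequantize: v is approximately (q - zero_point) * scale.
fn dequantize(quantized: &[i8], p: &QuantParams) -> Vec<f32> {
    quantized
        .iter()
        .map(|&q| (q as i32 - p.zero_point) as f32 * p.scale)
        .collect()
}
```

Round-tripping a weight through `quantize` and `dequantize` recovers it to within about one quantization step (`scale`), which is the accuracy trade-off that would need to be checked against the benchmarks before shipping quantized models.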
And some longer-term things:
- Support GPU inference. This will probably involve making the execution engine pluggable.