transformers
Add support for GOT-OCR2.0
Model description
As an OCR-2.0 model, GOT can handle all artificial optical signals (e.g., plain text, math/molecular formulas, tables, charts, sheet music, and even geometric shapes) across various OCR tasks. On the input side, the model supports commonly used scene- and document-style images, either as slices or as whole pages. On the output side, GOT can generate plain or formatted results (markdown/tikz/smiles/kern) via a simple prompt. The model also offers interactive OCR features, i.e., region-level recognition guided by coordinates or colors.
Open source status
- [X] The model implementation is available
- [X] The model weights are available
Provide useful links for the implementation
Implementation: https://github.com/Ucas-HaoranWei/GOT-OCR2.0/
Paper: https://arxiv.org/abs/2409.01704