GOT-OCR 2.0: Transformers 🤗 implementation demo
This demo utilizes the Transformers implementation of GOT-OCR 2.0 to extract text from images. The GOT-OCR 2.0 model was introduced in the paper: General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model by Haoran Wei, Chenglong Liu, Jinyue Chen, Jia Wang, Lingyu Kong, Yanming Xu, Zheng Ge, Liang Zhao, Jianjian Sun, Yuang Peng, Chunrui Han, and Xiangyu Zhang.
Key Features
GOT-OCR 2.0 is a state-of-the-art OCR model designed to handle a wide variety of tasks, including:
- Plain Text OCR
- Formatted Text OCR
- Fine-grained OCR
- Multi-crop OCR
- Multi-page OCR
Beyond Text
GOT-OCR 2.0 has also been fine-tuned to work with non-textual data, such as:
- Charts and Tables
- Math and Molecular Formulas
- Geometric Shapes
- Sheet Music
Explore the capabilities of this cutting-edge model through this interactive demo!
Upload an image
or
select the draw tool to start
Select Task
OCR Type
OCR Color
Examples
| Input Image | Select Task | OCR Type |
|---|
Fine-grained example
| Image Editor | Select Task | OCR Type | OCR Color |
|---|
Space based on Tonic's GOT-OCR