GOT-OCR 2.0: Transformers 🤗 implementation demo

This demo utilizes the Transformers implementation of GOT-OCR 2.0 to extract text from images. The GOT-OCR 2.0 model was introduced in the paper: General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model by Haoran Wei, Chenglong Liu, Jinyue Chen, Jia Wang, Lingyu Kong, Yanming Xu, Zheng Ge, Liang Zhao, Jianjian Sun, Yuang Peng, Chunrui Han, and Xiangyu Zhang.

Key Features

GOT-OCR 2.0 is a state-of-the-art OCR model designed to handle a wide variety of tasks, including:

  • Plain Text OCR
  • Formatted Text OCR
  • Fine-grained OCR
  • Multi-crop OCR
  • Multi-page OCR

Beyond Text

GOT-OCR 2.0 has also been fine-tuned to work with non-textual data, such as:

  • Charts and Tables
  • Math and Molecular Formulas
  • Geometric Shapes
  • Sheet Music

Explore the capabilities of this cutting-edge model through this interactive demo!

Select Task
Examples
Input Image Select Task OCR Type
Fine-grained example
Image Editor Select Task OCR Type OCR Color

Space based on Tonic's GOT-OCR