# Model Card for outputs

This model is a fine-tuned version of unsloth/qwen2-vl-7b-instruct-unsloth-bnb-4bit. It has been trained using TRL.

## A transformer-based OCR model fine-tuned for recognizing Urdu text from images

This repository contains a fine-tuned VisionEncoderDecoderModel built on top of TrOCR for Urdu Optical Character Recognition (OCR). The model is trained to extract Urdu text from scanned documents, printed pages, and image-based text inputs.



## Highlights

- Fine-tuned specifically for Urdu script recognition.
- Works on scanned pages, screenshots, and cropped text regions.
- Built using Hugging Face Transformers and TrOCR.
- Easy inference pipeline with minimal code.

### Quick Start

Install dependencies

Load the model

Run inference
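
The three steps above can be sketched as follows. This is a minimal example, not the author's exact pipeline: it assumes the checkpoint loads as a `VisionEncoderDecoderModel` with a `TrOCRProcessor`, as this card describes. `MODEL_ID` and the image file name are hypothetical placeholders, and the helper names `load_ocr` and `recognize` are introduced here for illustration only.

```python
# Install dependencies first (shell):
#   pip install transformers torch pillow

# Hypothetical placeholder: replace with this repository's actual Hub id.
MODEL_ID = "your-username/your-urdu-ocr-model"


def load_ocr(model_id=MODEL_ID):
    """Load the processor and model (assumes a TrOCR-style checkpoint)."""
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return processor, model


def recognize(image_path, processor, model, max_new_tokens=128):
    """Run OCR on one image file and return the decoded text."""
    import torch
    from PIL import Image

    # The processor expects an RGB image and returns normalized pixel values.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    with torch.no_grad():
        generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)

    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


# Example usage (file name is a placeholder):
#   processor, model = load_ocr()
#   print(recognize("sample_urdu_line.png", processor, model))
```

TrOCR-style models are usually applied to single text lines, so full pages may need to be cropped into line images before calling `recognize`.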

### Training Procedure

This model was fine-tuned using supervised learning on paired image–text data for Urdu OCR.

### Training details

| Parameter    | Value                              |
|--------------|------------------------------------|
| Base model   | `microsoft/trocr-base-handwritten` |
| Task         | Sequence-to-sequence OCR           |
| Framework    | Transformers Trainer API           |
| Optimization | Cross-entropy loss                 |
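
A fine-tuning setup matching the details above might look like the sketch below. It is an illustration under stated assumptions, not the actual training script: the use of `Seq2SeqTrainer`, the hyperparameter values, and the helper names `mask_padding`, `encode_example`, and `build_trainer` are all assumptions. Only the base model, the sequence-to-sequence task, and the cross-entropy objective come from this card.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips


def mask_padding(label_ids, pad_token_id):
    """Replace padding ids with IGNORE_INDEX so padding adds nothing to the loss."""
    return [IGNORE_INDEX if t == pad_token_id else t for t in label_ids]


def encode_example(image, text, processor, max_target_length=128):
    """Convert one (PIL image, Urdu string) pair into model inputs."""
    pixel_values = processor(images=image, return_tensors="pt").pixel_values[0]
    label_ids = processor.tokenizer(
        text,
        padding="max_length",
        truncation=True,
        max_length=max_target_length,
    ).input_ids
    return {
        "pixel_values": pixel_values,
        "labels": mask_padding(label_ids, processor.tokenizer.pad_token_id),
    }


def build_trainer(train_dataset, output_dir="outputs"):
    """Assemble a trainer; train_dataset yields dicts from encode_example."""
    from transformers import (
        Seq2SeqTrainer,
        Seq2SeqTrainingArguments,
        TrOCRProcessor,
        VisionEncoderDecoderModel,
        default_data_collator,
    )

    base = "microsoft/trocr-base-handwritten"  # base model named in this card
    processor = TrOCRProcessor.from_pretrained(base)
    model = VisionEncoderDecoderModel.from_pretrained(base)

    # The decoder needs explicit start and padding ids for training.
    model.config.decoder_start_token_id = processor.tokenizer.cls_token_id
    model.config.pad_token_id = processor.tokenizer.pad_token_id

    # Hyperparameters below are illustrative assumptions, not reported values.
    args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=5e-5,
        predict_with_generate=True,
    )
    return Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        data_collator=default_data_collator,
    )
```

When `labels` are passed, the model computes token-level cross-entropy against the target text internally, which is why no explicit loss function appears in the sketch.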

### Intended Use

### Suitable for

- Digitizing Urdu books and documents.
- Extracting text from scanned PDFs.
- OCR preprocessing for NLP pipelines.
- Research and educational projects involving Urdu script.

## Citations



Cite TRL as:
    
```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```
