---
title: Tunisian License Plate Detection OCR
emoji: 😻
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
license: mit
---

πŸš— Tunisian License Plate Detection & OCR

A complete pipeline for detecting and extracting text from Tunisian vehicle license plates, combining a custom CNN, YOLOv8 detectors, and transformer-based OCR.

🎯 Overview

This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles to extract license plate numbers. The pipeline consists of four main stages:

  1. Car Detection: Uses a custom CNN trained from scratch to detect the vehicle region
  2. License Plate Detection: Uses YOLOv8n to detect and localize license plates within the car region
  3. Word Detection: Uses YOLOv8s to detect the Arabic word "ΨͺΩˆΩ†Ψ³" (Tunis) on the plate
  4. Text Extraction: Uses TrOCR (Microsoft's Transformer-based OCR) to extract the alphanumeric license plate text

πŸ—οΈ Architecture

Input Image β†’ Car Detection (Custom CNN) β†’ Crop Car β†’
Plate Detection (YOLOv8n) β†’ Crop Plate β†’ 
Word Detection (YOLOv8s) β†’ Mask Word β†’ OCR (TrOCR) β†’ Output Text
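
As a rough sketch of how the crops and the word mask hand off between stages (the detect_car, detect_plate, detect_word, and ocr callables below are hypothetical placeholders, not the project's actual function names):

import cv2

def run_stages(image, detect_car, detect_plate, detect_word, ocr):
    # Stage 1: detect the car and crop to its bounding box
    x1, y1, x2, y2 = detect_car(image)
    car_crop = image[y1:y2, x1:x2]

    # Stage 2: detect the plate inside the car crop and crop again
    px1, py1, px2, py2 = detect_plate(car_crop)
    plate_crop = car_crop[py1:py2, px1:px2]

    # Stage 3: detect the "ΨͺΩˆΩ†Ψ³" word and mask it out before OCR
    wx1, wy1, wx2, wy2 = detect_word(plate_crop)
    masked = plate_crop.copy()
    cv2.rectangle(masked, (wx1, wy1), (wx2, wy2), (255, 255, 255), thickness=-1)

    # Stage 4: run OCR on the masked plate crop
    return ocr(masked)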

Models Used

  • Car Detection: Safe-Drive-TN/Car-detection-from-scratch (custom CNN)
  • Plate Detection: Safe-Drive-TN/Tunisian-Licence-plate-Detection (YOLOv8n)
  • Word Detection: Safe-Drive-TN/tunis-word-detection-yolov8s (YOLOv8s)
  • OCR: microsoft/trocr-base-printed (TrOCR)

All models are hosted on HuggingFace Hub and loaded automatically at runtime.
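
A minimal sketch of what that loading can look like for the YOLO and TrOCR stages (the best.pt filenames are assumptions about the Hub repos, and the custom CNN car detector is omitted because its weight format is not shown here):

import os
from huggingface_hub import hf_hub_download
from ultralytics import YOLO
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

token = os.environ.get("HUGGINGFACE_TOKEN")

# YOLO weights pulled from the Hub (filename "best.pt" is an assumption)
plate_weights = hf_hub_download("Safe-Drive-TN/Tunisian-Licence-plate-Detection", "best.pt", token=token)
word_weights = hf_hub_download("Safe-Drive-TN/tunis-word-detection-yolov8s", "best.pt", token=token)
plate_model = YOLO(plate_weights)
word_model = YOLO(word_weights)

# TrOCR processor and model load directly from the Hub
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
ocr_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")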

πŸš€ Quick Start

Using Docker (Recommended)

# Build the Docker image
docker build -t tunisian-license-plate-ocr .

# Run the container
docker run -p 7860:7860 -p 8000:8000 tunisian-license-plate-ocr

Then access:

  • Gradio interface: http://localhost:7860
  • FastAPI server: http://localhost:8000 (interactive docs at /docs)

Local Installation

# Clone the repository
git clone https://github.com/yourusername/Tunisian-License-Plate-Detection-OCR.git
cd Tunisian-License-Plate-Detection-OCR

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo "HUGGINGFACE_TOKEN=your_token_here" > .env

# Run the Gradio interface
python -m app.gradio_app

# Or run the FastAPI server
python -m app.main

πŸ“‘ API Endpoints

1. Detect Car

POST /detect-car

Detect and localize the vehicle region in an image.

Response:

{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.87
}

2. Complete Pipeline

POST /process

Run the complete pipeline on an uploaded image and return the extracted license plate text.

Request:

  • Content-Type: multipart/form-data
  • Body: Image file

Response:

{
  "success": true,
  "text": "12345TU6789",
  "confidence": {
    "plate_detection": 0.95,
    "word_detection": 0.88,
    "ocr": 0.92,
    "overall": 0.92
  }
}
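
A minimal client call against a locally running server (the base URL http://localhost:8000 and the "file" form-field name are assumptions about the deployment); the other endpoints below follow the same multipart pattern:

import requests

with open("samples/0.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/process",
        files={"file": ("0.jpg", f, "image/jpeg")},
    )

result = response.json()
if result["success"]:
    print("Plate text:", result["text"])
    print("Overall confidence:", result["confidence"]["overall"])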

3. Detect License Plate

POST /detect-plate

Detect and localize a license plate in an image.

Response:

{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.95,
  "class_id": 0
}

4. Detect Word

POST /detect-word

Detect "ΨͺΩˆΩ†Ψ³" word in a license plate image.

Response:

{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.88,
  "class_id": 0
}

5. Extract Text

POST /extract-text

Extract text from a license plate image using OCR.

Response:

{
  "success": true,
  "text": "12345TU6789",
  "confidence": 0.92
}

6. Health Check

GET /health

Check API health status.

🎨 Gradio Interface

The Gradio interface provides two viewing modes:

Simple Mode (Default)

  • Upload an image
  • View the extracted license plate text
  • See overall confidence scores

Detailed Mode

  • View all intermediate processing steps:
    1. Original image with detected car bounding box
    2. Cropped car region
    3. Car crop with detected license plate
    4. Cropped license plate
    5. Plate with detected word highlighted
    6. Final masked plate used for OCR
  • See confidence scores for each step
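
A stripped-down sketch of how such an interface can be wired with Gradio, far simpler than the actual app/gradio_app.py (the keys read from the result dictionary are assumptions):

import gradio as gr
from app.services.pipeline import get_pipeline

pipeline = get_pipeline()

def recognize(image):
    # Gradio hands the upload over as a NumPy array (RGB); the pipeline may expect
    # BGR as produced by cv2.imread, in which case the channels need swapping first
    result = pipeline.process_full_pipeline(image)
    return result.get("text", ""), result.get("confidence", {})

demo = gr.Interface(
    fn=recognize,
    inputs=gr.Image(type="numpy"),
    outputs=[gr.Textbox(label="License plate text"), gr.JSON(label="Confidence")],
    title="Tunisian License Plate OCR",
)

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)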

πŸ“Š Dataset

The project uses three datasets:

  • datasets/text/: License plate images with ground truth labels

    • train/: 566 training images
    • val/: 141 validation images
    • CSV files with image paths and labels
  • datasets/word/: YOLO format dataset for word detection

    • Training, validation, and test sets
    • Annotations in YOLO format
  • datasets/tunisian-license-plate/: Combined dataset of 706 images

Sample images are included in the samples/ directory for testing.

πŸ”§ Configuration

Configuration is managed in app/utils/config.py:

# Model IDs
CAR_DETECTION_MODEL = "Safe-Drive-TN/Car-detection-from-scratch"
PLATE_DETECTION_MODEL = "Safe-Drive-TN/Tunisian-Licence-plate-Detection"
WORD_DETECTION_MODEL = "Safe-Drive-TN/tunis-word-detection-yolov8s"
OCR_MODEL = "microsoft/trocr-base-printed"

# Confidence Thresholds
CAR_DETECTION_CONFIDENCE = 0.6
PLATE_DETECTION_CONFIDENCE = 0.25
WORD_DETECTION_CONFIDENCE = 0.25
OCR_CONFIDENCE_THRESHOLD = 0.5
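
These thresholds are intended to be passed straight to the detectors; for the YOLO stages they map onto Ultralytics' conf argument, as in this sketch (the weights path is a placeholder):

import cv2
from ultralytics import YOLO

from app.utils.config import PLATE_DETECTION_CONFIDENCE

plate_model = YOLO("plate_weights.pt")   # placeholder path for the downloaded weights
car_crop = cv2.imread("samples/0.jpg")   # sample image shipped with the repository

# keep only plate detections at or above the configured threshold
results = plate_model.predict(car_crop, conf=PLATE_DETECTION_CONFIDENCE)
for box in results[0].boxes:
    print(box.xyxy[0].tolist(), float(box.conf[0]))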

πŸ“ Project Structure

Tunisian-License-Plate-Detection-OCR/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ plate_detector.py    # YOLOv8n plate detection
β”‚   β”‚   β”œβ”€β”€ word_detector.py     # YOLOv8s word detection
β”‚   β”‚   β”œβ”€β”€ ocr_model.py         # TrOCR text extraction
β”‚   β”‚   └── car_detector.py      # Custom CNN car detection
β”‚   β”œβ”€β”€ services/
β”‚   β”‚   └── pipeline.py          # Main pipeline orchestration
β”‚   β”œβ”€β”€ utils/
β”‚   β”‚   β”œβ”€β”€ config.py            # Configuration
β”‚   β”‚   └── image_processing.py # Image utilities
β”‚   β”œβ”€β”€ main.py                  # FastAPI application
β”‚   └── gradio_app.py           # Gradio interface
β”œβ”€β”€ datasets/                    # Training/validation datasets
β”œβ”€β”€ samples/                     # Sample images for testing
β”œβ”€β”€ requirements.txt             # Python dependencies
β”œβ”€β”€ Dockerfile                   # Docker configuration
β”œβ”€β”€ .env                        # Environment variables
└── README.md                   # This file

πŸ› οΈ Development

Adding New Features

  1. New Model: Add to app/models/ and update config.py
  2. New Endpoint: Add to app/main.py (see the sketch after this list)
  3. Pipeline Modification: Update app/services/pipeline.py
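
As a sketch for step 2, a new FastAPI endpoint that accepts an uploaded image might look like the following (the route name and handler are hypothetical):

from fastapi import FastAPI, File, UploadFile
import numpy as np
import cv2

app = FastAPI()  # in the real project this instance lives in app/main.py

@app.post("/my-new-endpoint")
async def my_new_endpoint(file: UploadFile = File(...)):
    # decode the uploaded bytes into an OpenCV image
    data = np.frombuffer(await file.read(), dtype=np.uint8)
    image = cv2.imdecode(data, cv2.IMREAD_COLOR)
    # ... call into app/services/pipeline.py or a model from app/models/ here ...
    return {"success": True}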

Testing

# Test the complete pipeline
python -c "
from app.services.pipeline import get_pipeline
import cv2

pipeline = get_pipeline()
image = cv2.imread('samples/0.jpg')
result = pipeline.process_full_pipeline(image)
print(result)
"

🚒 Deployment

HuggingFace Spaces

This repository is configured for deployment on HuggingFace Spaces:

  1. Push to HuggingFace Space repository
  2. Spaces will automatically build and deploy using the Dockerfile
  3. Add your HUGGINGFACE_TOKEN as a Space secret

Other Platforms

The Docker image can be deployed on any platform supporting Docker:

  • AWS ECS/Fargate
  • Google Cloud Run
  • Azure Container Instances
  • Kubernetes

πŸ“ Requirements

  • Python 3.10+
  • CUDA (optional, for GPU acceleration)
  • 4GB+ RAM
  • HuggingFace account and token

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Safe-Drive-TN for the car, plate, and word detection models
  • Microsoft for TrOCR
  • HuggingFace for model hosting and transformers library
  • Ultralytics for YOLOv8 implementation

πŸ“§ Contact

For questions or issues, please open an issue on GitHub.


Made with ❀️ for Tunisian License Plate Recognition