---
title: Tunisian License Plate Detection OCR
emoji: 😻
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
license: mit
---

πŸš— Tunisian License Plate Detection & OCR

A complete pipeline for detecting and extracting text from Tunisian vehicle license plates, combining a custom CNN, YOLOv8 detectors, and transformer-based OCR.

🎯 Overview

This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles to extract license plate numbers. The pipeline consists of four main stages:

  1. Car Detection: Uses a custom CNN trained from scratch to detect the vehicle region
  2. License Plate Detection: Uses YOLOv8n to detect and localize license plates within the car region
  3. Word Detection: Uses YOLOv8s to detect the Arabic word "ΨͺΩˆΩ†Ψ³" (Tunis) on the plate
  4. Text Extraction: Uses TrOCR (Microsoft's Transformer-based OCR) to extract the alphanumeric license plate text

πŸ—οΈ Architecture

Input Image β†’ Car Detection (Custom CNN) β†’ Crop Car β†’
Plate Detection (YOLOv8n) β†’ Crop Plate β†’ 
Word Detection (YOLOv8s) β†’ Mask Word β†’ OCR (TrOCR) β†’ Output Text
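
As a rough sketch of how the crops and the word mask hand off between stages (the detect_car, detect_plate, detect_word, and ocr callables below are hypothetical placeholders, not the project's actual function names):

import cv2

def run_stages(image, detect_car, detect_plate, detect_word, ocr):
    # Stage 1: detect the car and crop to its bounding box
    x1, y1, x2, y2 = detect_car(image)
    car_crop = image[y1:y2, x1:x2]

    # Stage 2: detect the plate inside the car crop and crop again
    px1, py1, px2, py2 = detect_plate(car_crop)
    plate_crop = car_crop[py1:py2, px1:px2]

    # Stage 3: detect the "ΨͺΩˆΩ†Ψ³" word and mask it out before OCR
    wx1, wy1, wx2, wy2 = detect_word(plate_crop)
    masked = plate_crop.copy()
    cv2.rectangle(masked, (wx1, wy1), (wx2, wy2), (255, 255, 255), thickness=-1)

    # Stage 4: run OCR on the masked plate crop
    return ocr(masked)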

Models Used

  • Car Detection: Safe-Drive-TN/Car-detection-from-scratch (custom CNN)
  • Plate Detection: Safe-Drive-TN/Tunisian-Licence-plate-Detection (YOLOv8n)
  • Word Detection: Safe-Drive-TN/tunis-word-detection-yolov8s (YOLOv8s)
  • OCR: microsoft/trocr-base-printed (TrOCR)

All models are hosted on HuggingFace Hub and loaded automatically at runtime.
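
A minimal sketch of what that loading can look like for the YOLO and TrOCR stages (the best.pt filenames are assumptions about the Hub repos, and the custom CNN car detector is omitted because its weight format is not shown here):

import os
from huggingface_hub import hf_hub_download
from ultralytics import YOLO
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

token = os.environ.get("HUGGINGFACE_TOKEN")

# YOLO weights pulled from the Hub (filename "best.pt" is an assumption)
plate_weights = hf_hub_download("Safe-Drive-TN/Tunisian-Licence-plate-Detection", "best.pt", token=token)
word_weights = hf_hub_download("Safe-Drive-TN/tunis-word-detection-yolov8s", "best.pt", token=token)
plate_model = YOLO(plate_weights)
word_model = YOLO(word_weights)

# TrOCR processor and model load directly from the Hub
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
ocr_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")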

πŸš€ Quick Start

Using Docker (Recommended)

# Build the Docker image
docker build -t tunisian-license-plate-ocr .

# Run the container
docker run -p 7860:7860 -p 8000:8000 tunisian-license-plate-ocr

Then access:

  • Gradio interface: http://localhost:7860
  • FastAPI server: http://localhost:8000 (interactive docs at /docs)

Local Installation

# Clone the repository
git clone https://github.com/yourusername/Tunisian-License-Plate-Detection-OCR.git
cd Tunisian-License-Plate-Detection-OCR

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo "HUGGINGFACE_TOKEN=your_token_here" > .env

# Run the Gradio interface
python -m app.gradio_app

# Or run the FastAPI server
python -m app.main

πŸ“‘ API Endpoints

1. Detect Car

POST /detect-car

Detect and localize the vehicle region in an image.

Response:

{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.87
}

2. Complete Pipeline

POST /process

Run the complete pipeline on an uploaded image and return the extracted license plate text.

Request:

  • Content-Type: multipart/form-data
  • Body: Image file

Response:

{
  "success": true,
  "text": "12345TU6789",
  "confidence": {
    "plate_detection": 0.95,
    "word_detection": 0.88,
    "ocr": 0.92,
    "overall": 0.92
  }
}
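
A minimal client call against a locally running server (the base URL http://localhost:8000 and the "file" form-field name are assumptions about the deployment); the other endpoints below follow the same multipart pattern:

import requests

with open("samples/0.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/process",
        files={"file": ("0.jpg", f, "image/jpeg")},
    )

result = response.json()
if result["success"]:
    print("Plate text:", result["text"])
    print("Overall confidence:", result["confidence"]["overall"])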

3. Detect License Plate

POST /detect-plate

Detect and localize a license plate in an image.

Response:

{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.95,
  "class_id": 0
}

4. Detect Word

POST /detect-word

Detect "ΨͺΩˆΩ†Ψ³" word in a license plate image.

Response:

{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.88,
  "class_id": 0
}

5. Extract Text

POST /extract-text

Extract text from a license plate image using OCR.

Response:

{
  "success": true,
  "text": "12345TU6789",
  "confidence": 0.92
}

6. Health Check

GET /health

Check API health status.

🎨 Gradio Interface

The Gradio interface provides two viewing modes:

Simple Mode (Default)

  • Upload an image
  • View the extracted license plate text
  • See overall confidence scores

Detailed Mode

  • View all intermediate processing steps:
    1. Original image with detected car bounding box
    2. Cropped car region
    3. Car crop with detected license plate
    4. Cropped license plate
    5. Plate with detected word highlighted
    6. Final masked plate used for OCR
  • See confidence scores for each step
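
A stripped-down sketch of how such an interface can be wired with Gradio, far simpler than the actual app/gradio_app.py (the keys read from the result dictionary are assumptions):

import gradio as gr
from app.services.pipeline import get_pipeline

pipeline = get_pipeline()

def recognize(image):
    # Gradio hands the upload over as a NumPy array (RGB); the pipeline may expect
    # BGR as produced by cv2.imread, in which case the channels need swapping first
    result = pipeline.process_full_pipeline(image)
    return result.get("text", ""), result.get("confidence", {})

demo = gr.Interface(
    fn=recognize,
    inputs=gr.Image(type="numpy"),
    outputs=[gr.Textbox(label="License plate text"), gr.JSON(label="Confidence")],
    title="Tunisian License Plate OCR",
)

if __name__ == "__main__":
    demo.launch(server_name="0.0.0.0", server_port=7860)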

πŸ“Š Dataset

The project uses three datasets:

  • datasets/text/: License plate images with ground truth labels

    • train/: 566 training images
    • val/: 141 validation images
    • CSV files with image paths and labels
  • datasets/word/: YOLO format dataset for word detection

    • Training, validation, and test sets
    • Annotations in YOLO format
  • datasets/tunisian-license-plate/: Combined dataset of 706 images

Sample images are included in the samples/ directory for testing.

πŸ”§ Configuration

Configuration is managed in app/utils/config.py:

# Model IDs
CAR_DETECTION_MODEL = "Safe-Drive-TN/Car-detection-from-scratch"
PLATE_DETECTION_MODEL = "Safe-Drive-TN/Tunisian-Licence-plate-Detection"
WORD_DETECTION_MODEL = "Safe-Drive-TN/tunis-word-detection-yolov8s"
OCR_MODEL = "microsoft/trocr-base-printed"

# Confidence Thresholds
CAR_DETECTION_CONFIDENCE = 0.6
PLATE_DETECTION_CONFIDENCE = 0.25
WORD_DETECTION_CONFIDENCE = 0.25
OCR_CONFIDENCE_THRESHOLD = 0.5
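
These thresholds are intended to be passed straight to the detectors; for the YOLO stages they map onto Ultralytics' conf argument, as in this sketch (the weights path is a placeholder):

import cv2
from ultralytics import YOLO

from app.utils.config import PLATE_DETECTION_CONFIDENCE

plate_model = YOLO("plate_weights.pt")   # placeholder path for the downloaded weights
car_crop = cv2.imread("samples/0.jpg")   # sample image shipped with the repository

# keep only plate detections at or above the configured threshold
results = plate_model.predict(car_crop, conf=PLATE_DETECTION_CONFIDENCE)
for box in results[0].boxes:
    print(box.xyxy[0].tolist(), float(box.conf[0]))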

πŸ“ Project Structure

Tunisian-License-Plate-Detection-OCR/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ plate_detector.py    # YOLOv8n plate detection
β”‚   β”‚   β”œβ”€β”€ word_detector.py     # YOLOv8s word detection
β”‚   β”‚   β”œβ”€β”€ ocr_model.py         # TrOCR text extraction
β”‚   β”‚   └── car_detector.py      # Custom CNN car detection
β”‚   β”œβ”€β”€ services/
β”‚   β”‚   └── pipeline.py          # Main pipeline orchestration
β”‚   β”œβ”€β”€ utils/
β”‚   β”‚   β”œβ”€β”€ config.py            # Configuration
β”‚   β”‚   └── image_processing.py # Image utilities
β”‚   β”œβ”€β”€ main.py                  # FastAPI application
β”‚   └── gradio_app.py           # Gradio interface
β”œβ”€β”€ datasets/                    # Training/validation datasets
β”œβ”€β”€ samples/                     # Sample images for testing
β”œβ”€β”€ requirements.txt             # Python dependencies
β”œβ”€β”€ Dockerfile                   # Docker configuration
β”œβ”€β”€ .env                        # Environment variables
└── README.md                   # This file

πŸ› οΈ Development

Adding New Features

  1. New Model: Add to app/models/ and update config.py
  2. New Endpoint: Add to app/main.py (see the sketch after this list)
  3. Pipeline Modification: Update app/services/pipeline.py
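
As a sketch for step 2, a new FastAPI endpoint that accepts an uploaded image might look like the following (the route name and handler are hypothetical):

from fastapi import FastAPI, File, UploadFile
import numpy as np
import cv2

app = FastAPI()  # in the real project this instance lives in app/main.py

@app.post("/my-new-endpoint")
async def my_new_endpoint(file: UploadFile = File(...)):
    # decode the uploaded bytes into an OpenCV image
    data = np.frombuffer(await file.read(), dtype=np.uint8)
    image = cv2.imdecode(data, cv2.IMREAD_COLOR)
    # ... call into app/services/pipeline.py or a model from app/models/ here ...
    return {"success": True}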

Testing

# Test the complete pipeline
python -c "
from app.services.pipeline import get_pipeline
import cv2

pipeline = get_pipeline()
image = cv2.imread('samples/0.jpg')
result = pipeline.process_full_pipeline(image)
print(result)
"

🚒 Deployment

HuggingFace Spaces

This repository is configured for deployment on HuggingFace Spaces:

  1. Push to HuggingFace Space repository
  2. Spaces will automatically build and deploy using the Dockerfile
  3. Add your HUGGINGFACE_TOKEN as a Space secret

Other Platforms

The Docker image can be deployed on any platform supporting Docker:

  • AWS ECS/Fargate
  • Google Cloud Run
  • Azure Container Instances
  • Kubernetes

πŸ“ Requirements

  • Python 3.10+
  • CUDA (optional, for GPU acceleration)
  • 4GB+ RAM
  • HuggingFace account and token

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Safe-Drive-TN for the car, plate, and word detection models
  • Microsoft for TrOCR
  • HuggingFace for model hosting and transformers library
  • Ultralytics for YOLOv8 implementation

πŸ“§ Contact

For questions or issues, please open an issue on GitHub.


Made with ❀️ for Tunisian License Plate Recognition