---
title: Tunisian License Plate Detection OCR
emoji: 🚗
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
license: mit
---
# Tunisian License Plate Detection & OCR
A complete pipeline for detecting and extracting text from Tunisian vehicle license plates using state-of-the-art deep learning models.
## Overview
This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles to extract license plate numbers. The pipeline consists of four main stages:
- Car Detection: Uses a custom CNN trained from scratch to detect the vehicle region
- License Plate Detection: Uses YOLOv8n to detect and localize license plates within the car region
- Word Detection: Uses YOLOv8s to detect the Arabic word "تونس" (Tunis) on the plate
- Text Extraction: Uses TrOCR (Microsoft's Transformer-based OCR) to extract the alphanumeric license plate text
## Architecture

```
Input Image → Car Detection (Custom CNN) → Crop Car →
    Plate Detection (YOLOv8n) → Crop Plate →
    Word Detection (YOLOv8s) → Mask Word → OCR (TrOCR) → Output Text
```
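Between the models, the pipeline is mostly bounding-box bookkeeping: each stage's detection is used to crop (or mask) the image before handing it to the next model. The sketch below uses hypothetical helper names to illustrate the crop and word-masking steps implied by the diagram; the repository's actual implementation lives in `app/services/pipeline.py`.

```python
import numpy as np

def crop(image: np.ndarray, bbox: tuple) -> np.ndarray:
    """Crop a region given an (x1, y1, x2, y2) bounding box in pixel coordinates."""
    x1, y1, x2, y2 = bbox
    return image[y1:y2, x1:x2]

def mask_word(plate: np.ndarray, bbox: tuple) -> np.ndarray:
    """Blank out the detected Arabic word so it does not interfere with OCR.
    Filling with white is one plausible strategy; the real pipeline may differ."""
    x1, y1, x2, y2 = bbox
    masked = plate.copy()
    masked[y1:y2, x1:x2] = 255
    return masked
```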
### Models Used

- Car Detection: `Safe-Drive-TN/Car-detection-from-scratch` (custom CNN)
- Plate Detection: `Safe-Drive-TN/Tunisian-Licence-plate-Detection` (YOLOv8n)
- Word Detection: `Safe-Drive-TN/tunis-word-detection-yolov8s` (YOLOv8s)
- OCR: `microsoft/trocr-base-printed` (TrOCR)
All models are hosted on HuggingFace Hub and loaded automatically at runtime.
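For illustration, the snippet below sketches how such weights can be pulled from the Hub at startup. The checkpoint filename (`best.pt`) is an assumption; the authoritative loading code is in `app/models/`.

```python
import os

from huggingface_hub import hf_hub_download
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from ultralytics import YOLO

token = os.getenv("HUGGINGFACE_TOKEN")  # needed for private or gated repos

# YOLOv8 plate detector: download the checkpoint, then load it with Ultralytics.
plate_weights = hf_hub_download(
    repo_id="Safe-Drive-TN/Tunisian-Licence-plate-Detection",
    filename="best.pt",  # assumed checkpoint name
    token=token,
)
plate_model = YOLO(plate_weights)

# TrOCR processor + model straight from the Hub.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
ocr_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")
```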
## Quick Start

### Using Docker (Recommended)

```bash
# Build the Docker image
docker build -t tunisian-license-plate-ocr .

# Run the container
docker run -p 7860:7860 -p 8000:8000 tunisian-license-plate-ocr
```
Then access:
- Gradio Interface: http://localhost:7860
- API Documentation: http://localhost:8000/docs
### Local Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/Tunisian-License-Plate-Detection-OCR.git
cd Tunisian-License-Plate-Detection-OCR

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo "HUGGINGFACE_TOKEN=your_token_here" > .env

# Run the Gradio interface
python -m app.gradio_app

# Or run the FastAPI server
python -m app.main
```
## API Endpoints

### 1. Detect Car

`POST /detect-car`

Detect and localize the vehicle region in an image.

Response:

```json
{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.87
}
```
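A minimal client call with Python's `requests` library might look like the following; the multipart field name (`file`) is an assumption, so check the interactive docs at `/docs` for the exact schema. The same pattern applies to `/detect-plate`, `/detect-word`, and `/extract-text`.

```python
import requests

with open("samples/0.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/detect-car",
        files={"file": ("0.jpg", f, "image/jpeg")},  # field name assumed
    )

result = response.json()
if result["success"]:
    print("Car bbox:", result["bbox"], "confidence:", result["confidence"])
```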
### 2. Complete Pipeline

`POST /process`

Process the full pipeline from image to extracted text.

Request:
- Content-Type: `multipart/form-data`
- Body: Image file

Response:

```json
{
  "success": true,
  "text": "12345TU6789",
  "confidence": {
    "plate_detection": 0.95,
    "word_detection": 0.88,
    "ocr": 0.92,
    "overall": 0.92
  }
}
```
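For example, an end-to-end call against `/process` (same caveat about the assumed multipart field name) could be:

```python
import requests

with open("samples/0.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/process",
        files={"file": ("0.jpg", f, "image/jpeg")},  # field name assumed
    )
response.raise_for_status()

result = response.json()
print("Plate text:", result["text"])
for stage, score in result["confidence"].items():
    print(f"  {stage}: {score:.2f}")
```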
### 3. Detect License Plate

`POST /detect-plate`

Detect and localize the license plate in an image.

Response:

```json
{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.95,
  "class_id": 0
}
```
### 4. Detect Word

`POST /detect-word`

Detect the word "تونس" in a license plate image.

Response:

```json
{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.88,
  "class_id": 0
}
```
### 5. Extract Text

`POST /extract-text`

Extract text from a license plate image using OCR.

Response:

```json
{
  "success": true,
  "text": "12345TU6789",
  "confidence": 0.92
}
```
### 6. Health Check

`GET /health`
Check API health status.
## Gradio Interface
The Gradio interface provides two viewing modes:
### Simple Mode (Default)
- Upload an image
- View the extracted license plate text
- See overall confidence scores
### Detailed Mode

- View all intermediate processing steps:
  - Original image with detected car bounding box
  - Cropped car region
  - Car crop with detected license plate
  - Cropped license plate
  - Plate with detected word highlighted
  - Final masked plate used for OCR
- See confidence scores for each step
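The interface can also be driven programmatically with `gradio_client`. The call below assumes a single image input on the default `/predict` endpoint; the actual signature (including the mode toggle) is defined in `app/gradio_app.py`, so inspect it or `client.view_api()` first.

```python
from gradio_client import Client, handle_file

client = Client("http://localhost:7860")
# api_name and the argument list are assumptions; run client.view_api() to confirm.
result = client.predict(handle_file("samples/0.jpg"), api_name="/predict")
print(result)
```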
## Dataset
The project uses three datasets:
- `datasets/text/`: License plate images with ground truth labels
  - `train/`: 566 training images
  - `val/`: 141 validation images
  - CSV files with image paths and labels
- `datasets/word/`: YOLO-format dataset for word detection
  - Training, validation, and test sets
  - Annotations in YOLO format
- `datasets/tunisian-license-plate/`: Combined dataset of 706 images
Sample images are included in the samples/ directory for testing.
## Configuration
Configuration is managed in app/utils/config.py:
```python
# Model IDs
CAR_DETECTION_MODEL = "Safe-Drive-TN/Car-detection-from-scratch"
PLATE_DETECTION_MODEL = "Safe-Drive-TN/Tunisian-Licence-plate-Detection"
WORD_DETECTION_MODEL = "Safe-Drive-TN/tunis-word-detection-yolov8s"
OCR_MODEL = "microsoft/trocr-base-printed"

# Confidence Thresholds
CAR_DETECTION_CONFIDENCE = 0.6
PLATE_DETECTION_CONFIDENCE = 0.25
WORD_DETECTION_CONFIDENCE = 0.25
OCR_CONFIDENCE_THRESHOLD = 0.5
```
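As a rough sketch of how the detection thresholds come into play, assuming the detectors wrap Ultralytics' `predict` call (the real code in `app/models/` may differ), and with an assumed local checkpoint path:

```python
from ultralytics import YOLO

from app.utils.config import PLATE_DETECTION_CONFIDENCE

model = YOLO("best.pt")  # assumed local checkpoint path
results = model.predict("samples/0.jpg", conf=PLATE_DETECTION_CONFIDENCE)

# Boxes below the configured confidence are already filtered out by `conf=`.
boxes = results[0].boxes
print(boxes.xyxy, boxes.conf)
```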
## Project Structure

```
Tunisian-License-Plate-Detection-OCR/
├── app/
│   ├── models/
│   │   ├── plate_detector.py     # YOLOv8n plate detection
│   │   ├── word_detector.py      # YOLOv8s word detection
│   │   ├── ocr_model.py          # TrOCR text extraction
│   │   └── car_detector.py       # Custom CNN car detection
│   ├── services/
│   │   └── pipeline.py           # Main pipeline orchestration
│   ├── utils/
│   │   ├── config.py             # Configuration
│   │   └── image_processing.py   # Image utilities
│   ├── main.py                   # FastAPI application
│   └── gradio_app.py             # Gradio interface
├── datasets/                     # Training/validation datasets
├── samples/                      # Sample images for testing
├── requirements.txt              # Python dependencies
├── Dockerfile                    # Docker configuration
├── .env                          # Environment variables
└── README.md                     # This file
```
## Development

### Adding New Features

- New Model: Add to `app/models/` and update `config.py`
- New Endpoint: Add to `app/main.py`
- Pipeline Modification: Update `app/services/pipeline.py`
### Testing

```bash
# Test the complete pipeline
python -c "
from app.services.pipeline import get_pipeline
import cv2
pipeline = get_pipeline()
image = cv2.imread('samples/0.jpg')
result = pipeline.process_full_pipeline(image)
print(result)
"
```
## Deployment

### HuggingFace Spaces

This repository is configured for deployment on HuggingFace Spaces:

- Push to the HuggingFace Space repository
- Spaces will automatically build and deploy using the Dockerfile
- Add your `HUGGINGFACE_TOKEN` as a Space secret
### Other Platforms
The Docker image can be deployed on any platform supporting Docker:
- AWS ECS/Fargate
- Google Cloud Run
- Azure Container Instances
- Kubernetes
## Requirements
- Python 3.10+
- CUDA (optional, for GPU acceleration)
- 4GB+ RAM
- HuggingFace account and token
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- Safe-Drive-TN for the YOLOv8 models
- Microsoft for TrOCR
- HuggingFace for model hosting and the Transformers library
- Ultralytics for the YOLOv8 implementation
## Contact
For questions or issues, please open an issue on GitHub.
Made with ❤️ for Tunisian License Plate Recognition