---
title: Tunisian License Plate Detection OCR
emoji: 😻
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
license: mit
---

# 🚗 Tunisian License Plate Detection & OCR

A complete pipeline for detecting and reading Tunisian vehicle license plates, built on a custom CNN, YOLOv8, and TrOCR.

## 🎯 Overview

This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles and extracting license plate numbers. The pipeline consists of four main stages:

1. **Car Detection**: A custom CNN trained from scratch detects the vehicle region
2. **License Plate Detection**: YOLOv8n detects and localizes the license plate within the car region
3. **Word Detection**: YOLOv8s detects the Arabic word "تونس" (Tunis) on the plate
4. **Text Extraction**: TrOCR (Microsoft's Transformer-based OCR) extracts the alphanumeric license plate text

## 🏗️ Architecture

```
Input Image
  → Car Detection (Custom CNN)  → Crop Car
  → Plate Detection (YOLOv8n)   → Crop Plate
  → Word Detection (YOLOv8s)    → Mask Word
  → OCR (TrOCR)                 → Output Text
```

### Models Used

- **Car Detection**: `Safe-Drive-TN/Car-detection-from-scratch` (custom CNN)
- **Plate Detection**: `Safe-Drive-TN/Tunisian-Licence-plate-Detection` (YOLOv8n)
- **Word Detection**: `Safe-Drive-TN/tunis-word-detection-yolov8s` (YOLOv8s)
- **OCR**: `microsoft/trocr-base-printed` (TrOCR)

All models are hosted on the HuggingFace Hub and loaded automatically at runtime.

## 🚀 Quick Start

### Using Docker (Recommended)

```bash
# Build the Docker image
docker build -t tunisian-license-plate-ocr .

# Run the container
docker run -p 7860:7860 -p 8000:8000 tunisian-license-plate-ocr
```

Then access:

- **Gradio Interface**: http://localhost:7860
- **API Documentation**: http://localhost:8000/docs

### Local Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/Tunisian-License-Plate-Detection-OCR.git
cd Tunisian-License-Plate-Detection-OCR

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
echo "HUGGINGFACE_TOKEN=your_token_here" > .env

# Run the Gradio interface
python -m app.gradio_app

# Or run the FastAPI server
python -m app.main
```

## 📡 API Endpoints

### 1. Detect Car

**POST** `/detect-car`

Detects and localizes the vehicle region in an image.

**Response:**
```json
{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.87
}
```

### 2. Complete Pipeline

**POST** `/process`

Runs the full pipeline from the input image to the extracted text.

**Request:**
- Content-Type: `multipart/form-data`
- Body: image file

**Response:**
```json
{
  "success": true,
  "text": "12345TU6789",
  "confidence": {
    "plate_detection": 0.95,
    "word_detection": 0.88,
    "ocr": 0.92,
    "overall": 0.92
  }
}
```

### 3. Detect License Plate

**POST** `/detect-plate`

Detects and localizes the license plate in an image.

**Response:**
```json
{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.95,
  "class_id": 0
}
```

### 4. Detect Word

**POST** `/detect-word`

Detects the word "تونس" in a license plate image.

**Response:**
```json
{
  "success": true,
  "bbox": [x1, y1, x2, y2],
  "confidence": 0.88,
  "class_id": 0
}
```

### 5. Extract Text

**POST** `/extract-text`

Extracts text from a license plate image using OCR.

**Response:**
```json
{
  "success": true,
  "text": "12345TU6789",
  "confidence": 0.92
}
```

### 6. Health Check

**GET** `/health`

Checks API health status.
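### Example Client

A minimal Python client sketch for the endpoints above, assuming the FastAPI server is running locally on port 8000 and that the upload form field is named `file` (the actual field name is defined in `app/main.py` and may differ):

```python
# Hedged example client for the /health and /process endpoints.
# Assumes the server started with `python -m app.main` is listening on
# http://localhost:8000 and that the multipart field is named "file".
import requests

API_URL = "http://localhost:8000"

# Optional: confirm the service is up before sending an image.
print(requests.get(f"{API_URL}/health").json())

# Send a sample image through the full detection + OCR pipeline.
with open("samples/0.jpg", "rb") as f:
    response = requests.post(f"{API_URL}/process", files={"file": f})

result = response.json()
if result.get("success"):
    print("Plate text:", result["text"])
    print("Overall confidence:", result["confidence"]["overall"])
else:
    print("Pipeline failed:", result)
```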
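If you want to experiment with the models listed under *Models Used* outside the bundled pipeline, they can also be pulled directly from the HuggingFace Hub. The sketch below is illustrative only: the weight filename `best.pt` is an assumption, so check each model repository for the actual file name.

```python
# Illustrative sketch — the YOLO weight filename "best.pt" is an assumption
# and may differ per repository; the repo IDs come from the Models Used list.
import os
from huggingface_hub import hf_hub_download
from ultralytics import YOLO
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

token = os.getenv("HUGGINGFACE_TOKEN")  # from .env or a Space secret

# YOLOv8n plate detector
plate_weights = hf_hub_download(
    repo_id="Safe-Drive-TN/Tunisian-Licence-plate-Detection",
    filename="best.pt",  # assumption — verify in the repo's file list
    token=token,
)
plate_detector = YOLO(plate_weights)

# TrOCR for text extraction (public model)
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
ocr_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")
```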
## 🎨 Gradio Interface

The Gradio interface provides two viewing modes:

### Simple Mode (Default)

- Upload an image
- View the extracted license plate text
- See overall confidence scores

### Detailed Mode

- View all intermediate processing steps:
  1. Original image with detected car bounding box
  2. Cropped car region
  3. Car crop with detected license plate
  4. Cropped license plate
  5. Plate with detected word highlighted
  6. Final masked plate used for OCR
- See confidence scores for each step

## 📊 Dataset

The project uses three datasets:

- **`datasets/text/`**: License plate images with ground truth labels
  - `train/`: 566 training images
  - `val/`: 141 validation images
  - CSV files with image paths and labels
- **`datasets/word/`**: YOLO-format dataset for word detection
  - Training, validation, and test sets
  - Annotations in YOLO format
- **`datasets/tunisian-license-plate/`**: Combined dataset of 706 images

Sample images are included in the `samples/` directory for testing.

## 🔧 Configuration

Configuration is managed in `app/utils/config.py`:

```python
# Model IDs
CAR_DETECTION_MODEL = "Safe-Drive-TN/Car-detection-from-scratch"
PLATE_DETECTION_MODEL = "Safe-Drive-TN/Tunisian-Licence-plate-Detection"
WORD_DETECTION_MODEL = "Safe-Drive-TN/tunis-word-detection-yolov8s"
OCR_MODEL = "microsoft/trocr-base-printed"

# Confidence Thresholds
CAR_DETECTION_CONFIDENCE = 0.6
PLATE_DETECTION_CONFIDENCE = 0.25
WORD_DETECTION_CONFIDENCE = 0.25
OCR_CONFIDENCE_THRESHOLD = 0.5
```

## 📁 Project Structure

```
Tunisian-License-Plate-Detection-OCR/
├── app/
│   ├── models/
│   │   ├── plate_detector.py     # YOLOv8n plate detection
│   │   ├── word_detector.py      # YOLOv8s word detection
│   │   ├── ocr_model.py          # TrOCR text extraction
│   │   └── car_detector.py       # Custom CNN car detection
│   ├── services/
│   │   └── pipeline.py           # Main pipeline orchestration
│   ├── utils/
│   │   ├── config.py             # Configuration
│   │   └── image_processing.py   # Image utilities
│   ├── main.py                   # FastAPI application
│   └── gradio_app.py             # Gradio interface
├── datasets/                     # Training/validation datasets
├── samples/                      # Sample images for testing
├── requirements.txt              # Python dependencies
├── Dockerfile                    # Docker configuration
├── .env                          # Environment variables
└── README.md                     # This file
```

## 🛠️ Development

### Adding New Features

1. **New Model**: Add to `app/models/` and update `config.py`
2. **New Endpoint**: Add to `app/main.py`
3. **Pipeline Modification**: Update `app/services/pipeline.py`

### Testing

```bash
# Test the complete pipeline
python -c "
from app.services.pipeline import get_pipeline
import cv2

pipeline = get_pipeline()
image = cv2.imread('samples/0.jpg')
result = pipeline.process_full_pipeline(image)
print(result)
"
```

## 🚢 Deployment

### HuggingFace Spaces

This repository is configured for deployment on HuggingFace Spaces:

1. Push to your HuggingFace Space repository
2. Spaces will automatically build and deploy using the Dockerfile
3. Add your `HUGGINGFACE_TOKEN` as a Space secret

### Other Platforms

The Docker image can be deployed on any platform that supports Docker:

- AWS ECS/Fargate
- Google Cloud Run
- Azure Container Instances
- Kubernetes

## 📝 Requirements

- Python 3.10+
- CUDA (optional, for GPU acceleration)
- 4GB+ RAM
- HuggingFace account and token

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
## 🙏 Acknowledgments

- **Safe-Drive-TN** for the YOLOv8 models
- **Microsoft** for TrOCR
- **HuggingFace** for model hosting and the `transformers` library
- **Ultralytics** for the YOLOv8 implementation

## 📧 Contact

For questions or issues, please open an issue on GitHub.

---

Made with ❤️ for Tunisian License Plate Recognition