
yassine-mhirsi committed 27 days ago

Commit

ff7b80d

1 Parent(s): 3ecb44d

Remove example_usage.py file as it is no longer needed following the restructuring of the pipeline to include car detection. Update documentation to reflect the new four-stage process: car detection, plate detection, word detection, and OCR.


Files changed (4)

IMPLEMENTATION_SUMMARY.md +26 -12
QUICKSTART.md +11 -2
README.md +37 -15
example_usage.py +0 -195

IMPLEMENTATION_SUMMARY.md CHANGED

@@ -16,7 +16,8 @@ Tunisian-License-Plate-Detection-OCR/
 │   │   ├── __init__.py
 │   │   ├── plate_detector.py       # YOLOv8n plate detection
 │   │   ├── word_detector.py        # YOLOv8s word detection
-│   │   └── ocr_model.py            # TrOCR text extraction
 │   ├── services/
 │   │   ├── __init__.py
 │   │   └── pipeline.py             # Pipeline orchestration
@@ -46,6 +47,13 @@ Total Files Created: 20+ files
 ### 1. Core Pipeline Components
 #### ✅ Plate Detector (`app/models/plate_detector.py`)
 - Uses YOLOv8n from HuggingFace (`Safe-Drive-TN/Tunisian-Licence-plate-Detection`)
 - Detects and localizes license plates in vehicle images
@@ -66,12 +74,14 @@ Total Files Created: 20+ files
 ### 2. Pipeline Service (`app/services/pipeline.py`)
 #### ✅ Complete Processing Pipeline
-1. Detect license plate in image
-2. Crop plate region
-3. Detect "تونس" word in plate
-4. Mask word with black box
-5. Extract text using OCR
-6. Return results with confidence scores
 #### ✅ Individual Step Methods
 - `detect_plate_only()` - Plate detection only
@@ -86,6 +96,7 @@ Total Files Created: 20+ files
 | Endpoint | Method | Description |
 |----------|--------|-------------|
 | `/` | GET | API information |
 | `/health` | GET | Health check |
 | `/detect-plate` | POST | Detect license plate |
@@ -112,11 +123,13 @@ Total Files Created: 20+ files
 **Detailed View:**
 - Upload image
-- Display 4 processing steps:
-  1. Original with plate detection
-  2. Cropped plate
-  3. Word detection highlighted
-  4. Masked plate for OCR
 - Show detailed confidence scores
 - Visual pipeline representation
@@ -200,6 +213,7 @@ Total Files Created: 20+ files
 - FastAPI & Uvicorn (API framework)
 - Gradio (UI framework)
 - PyTorch (Deep learning)
 - Transformers (TrOCR)
 - Ultralytics (YOLOv8)
 - OpenCV (Image processing)

 │   │   ├── __init__.py
 │   │   ├── plate_detector.py       # YOLOv8n plate detection
 │   │   ├── word_detector.py        # YOLOv8s word detection
+│   │   ├── ocr_model.py            # TrOCR text extraction
+│   │   └── car_detector.py         # Custom CNN car detection
 │   ├── services/
 │   │   ├── __init__.py
 │   │   └── pipeline.py             # Pipeline orchestration
 ### 1. Core Pipeline Components
+#### ✅ Car Detector (`app/models/car_detector.py`)
+- Custom CNN trained from scratch on Stanford Cars
+- Loaded from HuggingFace repo `Safe-Drive-TN/Car-detection-from-scratch`
+- Performs vehicle localization before plate detection
+- Confidence scoring based on bounding-box size and location
+- Provides reusable `detect_car` helper
 #### ✅ Plate Detector (`app/models/plate_detector.py`)
 - Uses YOLOv8n from HuggingFace (`Safe-Drive-TN/Tunisian-Licence-plate-Detection`)
 - Detects and localizes license plates in vehicle images
 ### 2. Pipeline Service (`app/services/pipeline.py`)
 #### ✅ Complete Processing Pipeline
+1. Detect vehicle using custom CNN
+2. Crop car region
+3. Detect license plate within car
+4. Crop plate region
+5. Detect "تونس" word in plate
+6. Mask word with black box
+7. Extract text using OCR
+8. Return results with confidence scores
 #### ✅ Individual Step Methods
 - `detect_plate_only()` - Plate detection only
 | Endpoint | Method | Description |
 |----------|--------|-------------|
+| `/detect-car` | POST | Detect vehicle bounding box |
 | `/` | GET | API information |
 | `/health` | GET | Health check |
 | `/detect-plate` | POST | Detect license plate |
 **Detailed View:**
 - Upload image
+- Display 6 processing steps:
+  1. Original with car detection
+  2. Cropped car region
+  3. Car crop with plate detection
+  4. Cropped plate
+  5. Word detection highlighted
+  6. Masked plate for OCR
 - Show detailed confidence scores
 - Visual pipeline representation
 - FastAPI & Uvicorn (API framework)
 - Gradio (UI framework)
 - PyTorch (Deep learning)
+- torchvision (image transforms for car detector)
 - Transformers (TrOCR)
 - Ultralytics (YOLOv8)
 - OpenCV (Image processing)
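
The eight-step flow listed in the updated IMPLEMENTATION_SUMMARY.md can be sketched roughly as follows. This is a minimal illustration, not the code in `app/services/pipeline.py`: the four detector and OCR callables, the helper names (`crop`, `mask_region`, `offset_bbox`, `run_pipeline`), the `car_detection` key, and the behaviour when a stage finds nothing are all assumptions made for the example.

```python
from typing import Callable, Optional, Tuple

import numpy as np

BBox = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
Detector = Callable[[np.ndarray], Optional[Tuple[BBox, float]]]
Reader = Callable[[np.ndarray], Tuple[str, float]]


def crop(image: np.ndarray, bbox: BBox) -> np.ndarray:
    """Return the region of `image` covered by `bbox`."""
    x1, y1, x2, y2 = bbox
    return image[y1:y2, x1:x2]


def mask_region(image: np.ndarray, bbox: BBox) -> np.ndarray:
    """Return a copy of `image` with `bbox` painted black."""
    masked = image.copy()
    x1, y1, x2, y2 = bbox
    masked[y1:y2, x1:x2] = 0
    return masked


def offset_bbox(inner: BBox, outer: BBox) -> BBox:
    """Translate a box found inside a crop back to the parent image."""
    ox, oy = outer[0], outer[1]
    return (inner[0] + ox, inner[1] + oy, inner[2] + ox, inner[3] + oy)


def run_pipeline(image: np.ndarray, detect_car: Detector, detect_plate: Detector,
                 detect_word: Detector, read_text: Reader) -> dict:
    """Hypothetical orchestration of the eight documented steps."""
    car = detect_car(image)                                  # 1. detect vehicle
    if car is None:                                          # assumption: stop here
        return {"success": False, "error": "no car detected"}
    car_bbox, car_conf = car
    car_image = crop(image, car_bbox)                        # 2. crop car region

    plate = detect_plate(car_image)                          # 3. detect plate in car
    if plate is None:
        return {"success": False, "error": "no plate detected"}
    plate_bbox, plate_conf = plate
    plate_image = crop(car_image, plate_bbox)                # 4. crop plate region

    confidence = {"car_detection": car_conf, "plate_detection": plate_conf}
    word = detect_word(plate_image)                          # 5. detect the word
    ocr_input = plate_image
    if word is not None:                                     # assumption: word is optional
        ocr_input = mask_region(plate_image, word[0])        # 6. mask with black box
        confidence["word_detection"] = word[1]

    text, ocr_conf = read_text(ocr_input)                    # 7. OCR
    confidence["ocr"] = ocr_conf
    return {                                                 # 8. results with scores
        "success": True,
        "text": text,
        "plate_bbox": offset_bbox(plate_bbox, car_bbox),
        "confidence": confidence,
    }
```

Because the plate is detected inside the car crop, its box is relative to that crop; `offset_bbox` shifts it back into the coordinates of the original image so it can be drawn on the input.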

QUICKSTART.md CHANGED

@@ -57,13 +57,22 @@ python -m app.main
 2. Upload an image
 3. Click "🚀 Process Image"
 4. See all intermediate processing steps:
-   - Original image with detected plate
-   - Cropped license plate
    - Word detection highlighted
    - Masked plate ready for OCR
 ## Using the API
 ### Example: Complete Pipeline
 ```bash

 2. Upload an image
 3. Click "🚀 Process Image"
 4. See all intermediate processing steps:
+   - Original image with detected car
+   - Plate detection overlay
+   - Car crop highlighting the plate
    - Word detection highlighted
    - Masked plate ready for OCR
 ## Using the API
+### Example: Detect Car
+```bash
+curl -X POST "http://localhost:8000/detect-car" \
+  -H "Content-Type: multipart/form-data" \
+  -F "file=@path/to/your/image.jpg"
+```
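
For callers who prefer Python over `curl`, the same request can be sketched with the standard library alone. The `file` field name and the `localhost:8000` address come from the `curl` example above; the helper names `build_multipart` and `detect_car`, the default MIME type, and the timeout are hypothetical choices for this sketch.

```python
import json
import os
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes,
                    mime: str = "image/jpeg") -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns the body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def detect_car(image_path: str, base_url: str = "http://localhost:8000") -> dict:
    """POST an image to /detect-car and return the decoded JSON response."""
    with open(image_path, "rb") as handle:
        body, content_type = build_multipart(
            "file", os.path.basename(image_path), handle.read()
        )
    request = urllib.request.Request(
        f"{base_url}/detect-car",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `detect_car("path/to/your/image.jpg")` should return a dictionary; the README changes in this commit document the keys `success`, `bbox`, and `confidence` for this endpoint.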
 ### Example: Complete Pipeline
 ```bash

README.md CHANGED

@@ -14,21 +14,24 @@ A complete pipeline for detecting and extracting text from Tunisian vehicle lice
 ## 🎯 Overview
-This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles to extract license plate numbers. The pipeline consists of three main stages:
-1. **License Plate Detection**: Uses YOLOv8n to detect and localize license plates in vehicle images
-2. **Word Detection**: Uses YOLOv8s to detect the Arabic word "تونس" (Tunis) on the plate
-3. **Text Extraction**: Uses TrOCR (Microsoft's Transformer-based OCR) to extract the alphanumeric license plate text
 ## 🏗️ Architecture
 ```
-Input Image → Plate Detection (YOLOv8n) → Crop Plate →
 Word Detection (YOLOv8s) → Mask Word → OCR (TrOCR) → Output Text
 ```
 ### Models Used
 - **Plate Detection**: `Safe-Drive-TN/Tunisian-Licence-plate-Detection` (YOLOv8n)
 - **Word Detection**: `Safe-Drive-TN/tunis-word-detection-yolov8s` (YOLOv8s)
 - **OCR**: `microsoft/trocr-base-printed` (TrOCR)
@@ -73,7 +76,21 @@ python -m app.main
 ## 📡 API Endpoints
-### 1. Complete Pipeline
 **POST** `/process`
 Process the full pipeline from image to extracted text.
@@ -96,7 +113,7 @@ Process the full pipeline from image to extracted text.
 }
 ```
-### 2. Detect License Plate
 **POST** `/detect-plate`
 Detect and localize license plate in an image.
@@ -111,7 +128,7 @@ Detect and localize license plate in an image.
 }
 ```
-### 3. Detect Word
 **POST** `/detect-word`
 Detect "تونس" word in a license plate image.
@@ -126,7 +143,7 @@ Detect "تونس" word in a license plate image.
 }
 ```
-### 4. Extract Text
 **POST** `/extract-text`
 Extract text from a license plate image using OCR.
@@ -140,7 +157,7 @@ Extract text from a license plate image using OCR.
 }
 ```
-### 5. Health Check
 **GET** `/health`
 Check API health status.
@@ -156,10 +173,12 @@ The Gradio interface provides two viewing modes:
 ### Detailed Mode
 - View all intermediate processing steps:
-  1. Original image with detected plate bounding box
-  2. Cropped license plate region
-  3. License plate with detected word highlighted
-  4. Final masked plate used for OCR
 - See confidence scores for each step
 ## 📊 Dataset
@@ -185,11 +204,13 @@ Configuration is managed in `app/utils/config.py`:
 ```python
 # Model IDs
 PLATE_DETECTION_MODEL = "Safe-Drive-TN/Tunisian-Licence-plate-Detection"
 WORD_DETECTION_MODEL = "Safe-Drive-TN/tunis-word-detection-yolov8s"
 OCR_MODEL = "microsoft/trocr-base-printed"
 # Confidence Thresholds
 PLATE_DETECTION_CONFIDENCE = 0.25
 WORD_DETECTION_CONFIDENCE = 0.25
 OCR_CONFIDENCE_THRESHOLD = 0.5
@@ -203,7 +224,8 @@ Tunisian-License-Plate-Detection-OCR/
 │   ├── models/
 │   │   ├── plate_detector.py    # YOLOv8n plate detection
 │   │   ├── word_detector.py     # YOLOv8s word detection
-│   │   └── ocr_model.py         # TrOCR text extraction
 │   ├── services/
 │   │   └── pipeline.py          # Main pipeline orchestration
 │   ├── utils/

 ## 🎯 Overview
+This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles to extract license plate numbers. The pipeline consists of four main stages:
+1. **Car Detection**: Uses a custom CNN trained from scratch to detect the vehicle region
+2. **License Plate Detection**: Uses YOLOv8n to detect and localize license plates within the car region
+3. **Word Detection**: Uses YOLOv8s to detect the Arabic word "تونس" (Tunis) on the plate
+4. **Text Extraction**: Uses TrOCR (Microsoft's Transformer-based OCR) to extract the alphanumeric license plate text
 ## 🏗️ Architecture
 ```
+Input Image → Car Detection (Custom CNN) → Crop Car →
+Plate Detection (YOLOv8n) → Crop Plate →
 Word Detection (YOLOv8s) → Mask Word → OCR (TrOCR) → Output Text
 ```
 ### Models Used
+- **Car Detection**: `Safe-Drive-TN/Car-detection-from-scratch` (custom CNN)
 - **Plate Detection**: `Safe-Drive-TN/Tunisian-Licence-plate-Detection` (YOLOv8n)
 - **Word Detection**: `Safe-Drive-TN/tunis-word-detection-yolov8s` (YOLOv8s)
 - **OCR**: `microsoft/trocr-base-printed` (TrOCR)
 ## 📡 API Endpoints
+### 1. Detect Car
+**POST** `/detect-car`
+Detect and localize the vehicle region in an image.
+**Response:**
+```json
+{
+  "success": true,
+  "bbox": [x1, y1, x2, y2],
+  "confidence": 0.87
+}
+```
+### 2. Complete Pipeline
 **POST** `/process`
 Process the full pipeline from image to extracted text.
 }
 ```
+### 3. Detect License Plate
 **POST** `/detect-plate`
 Detect and localize license plate in an image.
 }
 ```
+### 4. Detect Word
 **POST** `/detect-word`
 Detect "تونس" word in a license plate image.
 }
 ```
+### 5. Extract Text
 **POST** `/extract-text`
 Extract text from a license plate image using OCR.
 }
 ```
+### 6. Health Check
 **GET** `/health`
 Check API health status.
 ### Detailed Mode
 - View all intermediate processing steps:
+  1. Original image with detected car bounding box
+  2. Cropped car region
+  3. Car crop with detected license plate
+  4. Cropped license plate
+  5. Plate with detected word highlighted
+  6. Final masked plate used for OCR
 - See confidence scores for each step
 ## 📊 Dataset
 ```python
 # Model IDs
+CAR_DETECTION_MODEL = "Safe-Drive-TN/Car-detection-from-scratch"
 PLATE_DETECTION_MODEL = "Safe-Drive-TN/Tunisian-Licence-plate-Detection"
 WORD_DETECTION_MODEL = "Safe-Drive-TN/tunis-word-detection-yolov8s"
 OCR_MODEL = "microsoft/trocr-base-printed"
 # Confidence Thresholds
+CAR_DETECTION_CONFIDENCE = 0.6
 PLATE_DETECTION_CONFIDENCE = 0.25
 WORD_DETECTION_CONFIDENCE = 0.25
 OCR_CONFIDENCE_THRESHOLD = 0.5
 │   ├── models/
 │   │   ├── plate_detector.py    # YOLOv8n plate detection
 │   │   ├── word_detector.py     # YOLOv8s word detection
+│   │   ├── ocr_model.py         # TrOCR text extraction
+│   │   └── car_detector.py      # Custom CNN car detection
 │   ├── services/
 │   │   └── pipeline.py          # Main pipeline orchestration
 │   ├── utils/
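
The confidence thresholds added to the README configuration could be applied per stage along these lines. The constant names and values are copied from the diff above; the `THRESHOLDS` mapping, the stage keys, the `>=` comparison, and both helper functions are assumptions, since the diff does not show how `app/utils/config.py` values are consumed.

```python
from typing import Dict, Optional

# Values as listed in the updated README configuration block.
CAR_DETECTION_CONFIDENCE = 0.6
PLATE_DETECTION_CONFIDENCE = 0.25
WORD_DETECTION_CONFIDENCE = 0.25
OCR_CONFIDENCE_THRESHOLD = 0.5

# Hypothetical lookup table keyed by stage name, in pipeline order.
THRESHOLDS: Dict[str, float] = {
    "car": CAR_DETECTION_CONFIDENCE,
    "plate": PLATE_DETECTION_CONFIDENCE,
    "word": WORD_DETECTION_CONFIDENCE,
    "ocr": OCR_CONFIDENCE_THRESHOLD,
}


def accepted(stage: str, confidence: float) -> bool:
    """Return True when a score meets the threshold for its stage."""
    return confidence >= THRESHOLDS[stage]


def first_rejected_stage(scores: Dict[str, float]) -> Optional[str]:
    """Return the earliest stage whose score is below threshold, if any."""
    for stage in THRESHOLDS:  # dicts preserve insertion order
        if stage in scores and not accepted(stage, scores[stage]):
            return stage
    return None
```

Note that the car detector's threshold (0.6) is noticeably stricter than the 0.25 used for the two YOLOv8 stages, so a weak car detection would be the first thing to fail under this scheme.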

example_usage.py DELETED

@@ -1,195 +0,0 @@
-"""
-Example usage of the Tunisian License Plate Detection & OCR pipeline.
-This script demonstrates how to use the pipeline programmatically.
-"""
-import cv2
-import sys
-from pathlib import Path
-from app.services.pipeline import get_pipeline
-from app.utils.image_processing import draw_bbox
-def process_single_image(image_path: str, show_visualization: bool = True):
-    """
-    Process a single image and display results.
-    Args:
-        image_path: Path to the image file
-        show_visualization: Whether to show visualization
-    """
-    # Load image
-    image = cv2.imread(image_path)
-    if image is None:
-        print(f"Error: Could not load image from {image_path}")
-        return
-    print(f"\n{'='*60}")
-    print(f"Processing: {image_path}")
-    print(f"{'='*60}\n")
-    # Get pipeline
-    print("Loading models...")
-    pipeline = get_pipeline()
-    # Process image
-    print("Processing image...")
-    result = pipeline.process_full_pipeline(image)
-    # Display results
-    if result['success']:
-        print("✅ SUCCESS!")
-        print(f"\n📝 Extracted Text: {result['text']}")
-        print(f"\n📊 Confidence Scores:")
-        print(f"   - Plate Detection: {result['confidence']['plate_detection']:.2%}")
-        print(f"   - Word Detection: {result['confidence'].get('word_detection', 0):.2%}")
-        print(f"   - OCR: {result['confidence']['ocr']:.2%}")
-        print(f"   - Overall: {result['confidence']['overall']:.2%}")
-        # Show visualization if requested
-        if show_visualization:
-            show_results(image, result)
-    else:
-        print("❌ FAILED!")
-        print(f"Error: {result.get('error', 'Unknown error')}")
-    print(f"\n{'='*60}\n")
-def show_results(original_image, result):
-    """
-    Display visualization of results.
-    Args:
-        original_image: Original input image
-        result: Processing result dictionary
-    """
-    try:
-        import matplotlib.pyplot as plt
-        # Get intermediate results
-        intermediate = result.get('intermediate_results', {})
-        # Create figure with subplots
-        fig, axes = plt.subplots(2, 2, figsize=(12, 10))
-        fig.suptitle(f"License Plate: {result['text']}", fontsize=16, fontweight='bold')
-        # Original image with plate bbox
-        if 'plate_bbox' in intermediate:
-            img_with_bbox = draw_bbox(
-                original_image.copy(),
-                intermediate['plate_bbox'],
-                label=f"Conf: {result['confidence']['plate_detection']:.2f}",
-                color=(0, 255, 0)
-            )
-            axes[0, 0].imshow(cv2.cvtColor(img_with_bbox, cv2.COLOR_BGR2RGB))
-            axes[0, 0].set_title("1. Plate Detection")
-            axes[0, 0].axis('off')
-        # Cropped plate
-        if 'plate_image' in intermediate:
-            axes[0, 1].imshow(cv2.cvtColor(intermediate['plate_image'], cv2.COLOR_BGR2RGB))
-            axes[0, 1].set_title("2. Cropped Plate")
-            axes[0, 1].axis('off')
-        # Plate with word detection
-        if 'word_bbox' in intermediate and 'plate_image' in intermediate:
-            plate_with_word = draw_bbox(
-                intermediate['plate_image'].copy(),
-                intermediate['word_bbox'],
-                label=f"Conf: {result['confidence'].get('word_detection', 0):.2f}",
-                color=(255, 0, 0)
-            )
-            axes[1, 0].imshow(cv2.cvtColor(plate_with_word, cv2.COLOR_BGR2RGB))
-            axes[1, 0].set_title("3. Word Detection")
-            axes[1, 0].axis('off')
-        # Masked plate
-        if 'masked_plate' in intermediate:
-            axes[1, 1].imshow(cv2.cvtColor(intermediate['masked_plate'], cv2.COLOR_BGR2RGB))
-            axes[1, 1].set_title("4. Masked for OCR")
-            axes[1, 1].axis('off')
-        plt.tight_layout()
-        plt.show()
-    except ImportError:
-        print("\nNote: Install matplotlib to see visualizations")
-        print("pip install matplotlib")
-def process_directory(directory_path: str):
-    """
-    Process all images in a directory.
-    Args:
-        directory_path: Path to directory containing images
-    """
-    directory = Path(directory_path)
-    # Find all image files
-    image_extensions = ['.jpg', '.jpeg', '.png', '.bmp']
-    image_files = []
-    for ext in image_extensions:
-        image_files.extend(directory.glob(f'*{ext}'))
-        image_files.extend(directory.glob(f'*{ext.upper()}'))
-    if not image_files:
-        print(f"No images found in {directory_path}")
-        return
-    print(f"\nFound {len(image_files)} images")
-    # Process each image
-    results = []
-    for image_path in image_files:
-        image = cv2.imread(str(image_path))
-        if image is None:
-            continue
-        pipeline = get_pipeline()
-        result = pipeline.process_full_pipeline(image)
-        results.append({
-            'filename': image_path.name,
-            'success': result['success'],
-            'text': result.get('text', ''),
-            'confidence': result.get('confidence', {}).get('overall', 0)
-        })
-        status = "✅" if result['success'] else "❌"
-        text = result.get('text', 'N/A')
-        print(f"{status} {image_path.name}: {text}")
-    # Summary
-    successful = sum(1 for r in results if r['success'])
-    print(f"\n{'='*60}")
-    print(f"Summary: {successful}/{len(results)} images processed successfully")
-    print(f"{'='*60}")
-def main():
-    """Main function."""
-    if len(sys.argv) < 2:
-        print("Usage:")
-        print("  python example_usage.py <image_path>")
-        print("  python example_usage.py <directory_path> --batch")
-        print("\nExamples:")
-        print("  python example_usage.py samples/0.jpg")
-        print("  python example_usage.py samples/ --batch")
-        return
-    path = sys.argv[1]
-    if len(sys.argv) > 2 and sys.argv[2] == '--batch':
-        # Process directory
-        process_directory(path)
-    else:
-        # Process single image
-        process_single_image(path, show_visualization=True)
-if __name__ == "__main__":
-    main()