yassine-mhirsi committed on
Commit ff7b80d · 1 Parent(s): 3ecb44d

Remove example_usage.py file as it is no longer needed following the restructuring of the pipeline to include car detection. Update documentation to reflect the new four-stage process: car detection, plate detection, word detection, and OCR.

Files changed (4)
  1. IMPLEMENTATION_SUMMARY.md +26 -12
  2. QUICKSTART.md +11 -2
  3. README.md +37 -15
  4. example_usage.py +0 -195
IMPLEMENTATION_SUMMARY.md CHANGED
@@ -16,7 +16,8 @@ Tunisian-License-Plate-Detection-OCR/
  │   │   ├── __init__.py
  │   │   ├── plate_detector.py  # YOLOv8n plate detection
  │   │   ├── word_detector.py  # YOLOv8s word detection
- │   │   └── ocr_model.py  # TrOCR text extraction
+ │   │   ├── ocr_model.py  # TrOCR text extraction
+ │   │   └── car_detector.py  # Custom CNN car detection
  │   ├── services/
  │   │   ├── __init__.py
  │   │   └── pipeline.py  # Pipeline orchestration
@@ -46,6 +47,13 @@ Total Files Created: 20+ files

  ### 1. Core Pipeline Components

+ #### ✅ Car Detector (`app/models/car_detector.py`)
+ - Custom CNN trained from scratch on Stanford Cars
+ - Loaded from HuggingFace repo `Safe-Drive-TN/Car-detection-from-scratch`
+ - Performs vehicle localization before plate detection
+ - Confidence scoring based on bounding-box size and location
+ - Provides reusable `detect_car` helper
+
  #### ✅ Plate Detector (`app/models/plate_detector.py`)
  - Uses YOLOv8n from HuggingFace (`Safe-Drive-TN/Tunisian-Licence-plate-Detection`)
  - Detects and localizes license plates in vehicle images
@@ -66,12 +74,14 @@ Total Files Created: 20+ files
  ### 2. Pipeline Service (`app/services/pipeline.py`)

  #### ✅ Complete Processing Pipeline
- 1. Detect license plate in image
- 2. Crop plate region
- 3. Detect "تونس" word in plate
- 4. Mask word with black box
- 5. Extract text using OCR
- 6. Return results with confidence scores
+ 1. Detect vehicle using custom CNN
+ 2. Crop car region
+ 3. Detect license plate within car
+ 4. Crop plate region
+ 5. Detect "تونس" word in plate
+ 6. Mask word with black box
+ 7. Extract text using OCR
+ 8. Return results with confidence scores

  #### ✅ Individual Step Methods
  - `detect_plate_only()` - Plate detection only
@@ -86,6 +96,7 @@ Total Files Created: 20+ files

  | Endpoint | Method | Description |
  |----------|--------|-------------|
+ | `/detect-car` | POST | Detect vehicle bounding box |
  | `/` | GET | API information |
  | `/health` | GET | Health check |
  | `/detect-plate` | POST | Detect license plate |
@@ -112,11 +123,13 @@ Total Files Created: 20+ files

  **Detailed View:**
  - Upload image
- - Display 4 processing steps:
-   1. Original with plate detection
-   2. Cropped plate
-   3. Word detection highlighted
-   4. Masked plate for OCR
+ - Display 6 processing steps:
+   1. Original with car detection
+   2. Cropped car region
+   3. Car crop with plate detection
+   4. Cropped plate
+   5. Word detection highlighted
+   6. Masked plate for OCR
  - Show detailed confidence scores
  - Visual pipeline representation

@@ -200,6 +213,7 @@ Total Files Created: 20+ files
  - FastAPI & Uvicorn (API framework)
  - Gradio (UI framework)
  - PyTorch (Deep learning)
+ - torchvision (image transforms for car detector)
  - Transformers (TrOCR)
  - Ultralytics (YOLOv8)
  - OpenCV (Image processing)
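
Because the updated pipeline runs plate detection on the car crop rather than on the full photo, any plate box it returns is relative to that crop. The sketch below shows one way to translate such a box back into full-image coordinates. It assumes pixel boxes in the `[x1, y1, x2, y2]` layout that the README response example uses; the helper name `crop_to_full` and the sample values are hypothetical and do not come from the repository.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def crop_to_full(inner: Box, crop: Box) -> Box:
    """Translate a box found inside a crop back into full-image coordinates."""
    # The crop's top-left corner is the offset of every point inside it.
    offset_x, offset_y = crop[0], crop[1]
    x1, y1, x2, y2 = inner
    return (x1 + offset_x, y1 + offset_y, x2 + offset_x, y2 + offset_y)


# Hypothetical values: the car spans (100, 200)-(500, 400) in the photo and
# the plate spans (10, 20)-(110, 60) inside the car crop.
car_box = (100, 200, 500, 400)
plate_in_crop = (10, 20, 110, 60)
print(crop_to_full(plate_in_crop, car_box))  # (110, 220, 210, 260)
```

The same translation applies one level further down if a word box found inside the plate crop needs to be drawn on the car crop or on the original image.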
QUICKSTART.md CHANGED
@@ -57,13 +57,22 @@ python -m app.main
  2. Upload an image
  3. Click "🚀 Process Image"
  4. See all intermediate processing steps:
-    - Original image with detected plate
-    - Cropped license plate
+    - Original image with detected car
+    - Plate detection overlay
+    - Car crop highlighting the plate
     - Word detection highlighted
     - Masked plate ready for OCR

  ## Using the API

+ ### Example: Detect Car
+
+ ```bash
+ curl -X POST "http://localhost:8000/detect-car" \
+   -H "Content-Type: multipart/form-data" \
+   -F "file=@path/to/your/image.jpg"
+ ```
+
  ### Example: Complete Pipeline

  ```bash
README.md CHANGED
@@ -14,21 +14,24 @@ A complete pipeline for detecting and extracting text from Tunisian vehicle lice

  ## 🎯 Overview

- This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles to extract license plate numbers. The pipeline consists of three main stages:
+ This application provides both a REST API and an interactive Gradio interface for processing images of Tunisian vehicles to extract license plate numbers. The pipeline consists of four main stages:

- 1. **License Plate Detection**: Uses YOLOv8n to detect and localize license plates in vehicle images
- 2. **Word Detection**: Uses YOLOv8s to detect the Arabic word "تونس" (Tunis) on the plate
- 3. **Text Extraction**: Uses TrOCR (Microsoft's Transformer-based OCR) to extract the alphanumeric license plate text
+ 1. **Car Detection**: Uses a custom CNN trained from scratch to detect the vehicle region
+ 2. **License Plate Detection**: Uses YOLOv8n to detect and localize license plates within the car region
+ 3. **Word Detection**: Uses YOLOv8s to detect the Arabic word "تونس" (Tunis) on the plate
+ 4. **Text Extraction**: Uses TrOCR (Microsoft's Transformer-based OCR) to extract the alphanumeric license plate text

  ## 🏗️ Architecture

  ```
- Input Image → Plate Detection (YOLOv8n) → Crop Plate →
+ Input Image → Car Detection (Custom CNN) → Crop Car →
+ Plate Detection (YOLOv8n) → Crop Plate →
  Word Detection (YOLOv8s) → Mask Word → OCR (TrOCR) → Output Text
  ```

  ### Models Used

+ - **Car Detection**: `Safe-Drive-TN/Car-detection-from-scratch` (custom CNN)
  - **Plate Detection**: `Safe-Drive-TN/Tunisian-Licence-plate-Detection` (YOLOv8n)
  - **Word Detection**: `Safe-Drive-TN/tunis-word-detection-yolov8s` (YOLOv8s)
  - **OCR**: `microsoft/trocr-base-printed` (TrOCR)
@@ -73,7 +76,21 @@ python -m app.main

  ## 📡 API Endpoints

- ### 1. Complete Pipeline
+ ### 1. Detect Car
+ **POST** `/detect-car`
+
+ Detect and localize the vehicle region in an image.
+
+ **Response:**
+ ```json
+ {
+   "success": true,
+   "bbox": [x1, y1, x2, y2],
+   "confidence": 0.87
+ }
+ ```
+
+ ### 2. Complete Pipeline
  **POST** `/process`

  Process the full pipeline from image to extracted text.
@@ -96,7 +113,7 @@ Process the full pipeline from image to extracted text.
  }
  ```

- ### 2. Detect License Plate
+ ### 3. Detect License Plate
  **POST** `/detect-plate`

  Detect and localize license plate in an image.
@@ -111,7 +128,7 @@ Detect and localize license plate in an image.
  }
  ```

- ### 3. Detect Word
+ ### 4. Detect Word
  **POST** `/detect-word`

  Detect "تونس" word in a license plate image.
@@ -126,7 +143,7 @@ Detect "تونس" word in a license plate image.
  }
  ```

- ### 4. Extract Text
+ ### 5. Extract Text
  **POST** `/extract-text`

  Extract text from a license plate image using OCR.
@@ -140,7 +157,7 @@ Extract text from a license plate image using OCR.
  }
  ```

- ### 5. Health Check
+ ### 6. Health Check
  **GET** `/health`

  Check API health status.
@@ -156,10 +173,12 @@ The Gradio interface provides two viewing modes:

  ### Detailed Mode
  - View all intermediate processing steps:
-   1. Original image with detected plate bounding box
-   2. Cropped license plate region
-   3. License plate with detected word highlighted
-   4. Final masked plate used for OCR
+   1. Original image with detected car bounding box
+   2. Cropped car region
+   3. Car crop with detected license plate
+   4. Cropped license plate
+   5. Plate with detected word highlighted
+   6. Final masked plate used for OCR
  - See confidence scores for each step

  ## 📊 Dataset
@@ -185,11 +204,13 @@ Configuration is managed in `app/utils/config.py`:

  ```python
  # Model IDs
+ CAR_DETECTION_MODEL = "Safe-Drive-TN/Car-detection-from-scratch"
  PLATE_DETECTION_MODEL = "Safe-Drive-TN/Tunisian-Licence-plate-Detection"
  WORD_DETECTION_MODEL = "Safe-Drive-TN/tunis-word-detection-yolov8s"
  OCR_MODEL = "microsoft/trocr-base-printed"

  # Confidence Thresholds
+ CAR_DETECTION_CONFIDENCE = 0.6
  PLATE_DETECTION_CONFIDENCE = 0.25
  WORD_DETECTION_CONFIDENCE = 0.25
  OCR_CONFIDENCE_THRESHOLD = 0.5
@@ -203,7 +224,8 @@ Tunisian-License-Plate-Detection-OCR/
  │   ├── models/
  │   │   ├── plate_detector.py  # YOLOv8n plate detection
  │   │   ├── word_detector.py  # YOLOv8s word detection
- │   │   └── ocr_model.py  # TrOCR text extraction
+ │   │   ├── ocr_model.py  # TrOCR text extraction
+ │   │   └── car_detector.py  # Custom CNN car detection
  │   ├── services/
  │   │   └── pipeline.py  # Main pipeline orchestration
  │   ├── utils/
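
The README diff adds `CAR_DETECTION_CONFIDENCE = 0.6` alongside the three existing thresholds. As a rough illustration of how per-stage scores could be checked against those values in pipeline order, here is a minimal sketch. The threshold numbers are copied from the config block above. The keys `plate_detection`, `word_detection` and `ocr` match the confidence keys used by the removed `example_usage.py`, while `car_detection`, the function name, and the rule that a score equal to its threshold passes are assumptions. The diff does not say whether the real pipeline stops at the first failing stage.

```python
from typing import Dict, Optional

# Threshold values copied from the config block in the README diff.
THRESHOLDS: Dict[str, float] = {
    "car_detection": 0.6,     # key name is an assumption
    "plate_detection": 0.25,
    "word_detection": 0.25,
    "ocr": 0.5,
}

# Stage order follows the four-stage pipeline described in the README.
STAGE_ORDER = ("car_detection", "plate_detection", "word_detection", "ocr")


def first_stage_below_threshold(scores: Dict[str, float]) -> Optional[str]:
    """Return the first stage, in pipeline order, whose score is below its threshold.

    Stages missing from `scores` are skipped. Returns None if every stage
    that is present meets its threshold.
    """
    for stage in STAGE_ORDER:
        if stage in scores and scores[stage] < THRESHOLDS[stage]:
            return stage
    return None


print(first_stage_below_threshold({"car_detection": 0.87, "plate_detection": 0.2}))
# plate_detection
```

Skipping missing stages mirrors the removed script, which read the word-detection score with `.get('word_detection', 0)` and so tolerated its absence.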
example_usage.py DELETED
@@ -1,195 +0,0 @@
- """
- Example usage of the Tunisian License Plate Detection & OCR pipeline.
-
- This script demonstrates how to use the pipeline programmatically.
- """
- import cv2
- import sys
- from pathlib import Path
-
- from app.services.pipeline import get_pipeline
- from app.utils.image_processing import draw_bbox
-
-
- def process_single_image(image_path: str, show_visualization: bool = True):
-     """
-     Process a single image and display results.
-
-     Args:
-         image_path: Path to the image file
-         show_visualization: Whether to show visualization
-     """
-     # Load image
-     image = cv2.imread(image_path)
-     if image is None:
-         print(f"Error: Could not load image from {image_path}")
-         return
-
-     print(f"\n{'='*60}")
-     print(f"Processing: {image_path}")
-     print(f"{'='*60}\n")
-
-     # Get pipeline
-     print("Loading models...")
-     pipeline = get_pipeline()
-
-     # Process image
-     print("Processing image...")
-     result = pipeline.process_full_pipeline(image)
-
-     # Display results
-     if result['success']:
-         print("✅ SUCCESS!")
-         print(f"\n📝 Extracted Text: {result['text']}")
-         print(f"\n📊 Confidence Scores:")
-         print(f" - Plate Detection: {result['confidence']['plate_detection']:.2%}")
-         print(f" - Word Detection: {result['confidence'].get('word_detection', 0):.2%}")
-         print(f" - OCR: {result['confidence']['ocr']:.2%}")
-         print(f" - Overall: {result['confidence']['overall']:.2%}")
-
-         # Show visualization if requested
-         if show_visualization:
-             show_results(image, result)
-     else:
-         print("❌ FAILED!")
-         print(f"Error: {result.get('error', 'Unknown error')}")
-
-     print(f"\n{'='*60}\n")
-
-
- def show_results(original_image, result):
-     """
-     Display visualization of results.
-
-     Args:
-         original_image: Original input image
-         result: Processing result dictionary
-     """
-     try:
-         import matplotlib.pyplot as plt
-
-         # Get intermediate results
-         intermediate = result.get('intermediate_results', {})
-
-         # Create figure with subplots
-         fig, axes = plt.subplots(2, 2, figsize=(12, 10))
-         fig.suptitle(f"License Plate: {result['text']}", fontsize=16, fontweight='bold')
-
-         # Original image with plate bbox
-         if 'plate_bbox' in intermediate:
-             img_with_bbox = draw_bbox(
-                 original_image.copy(),
-                 intermediate['plate_bbox'],
-                 label=f"Conf: {result['confidence']['plate_detection']:.2f}",
-                 color=(0, 255, 0)
-             )
-             axes[0, 0].imshow(cv2.cvtColor(img_with_bbox, cv2.COLOR_BGR2RGB))
-             axes[0, 0].set_title("1. Plate Detection")
-             axes[0, 0].axis('off')
-
-         # Cropped plate
-         if 'plate_image' in intermediate:
-             axes[0, 1].imshow(cv2.cvtColor(intermediate['plate_image'], cv2.COLOR_BGR2RGB))
-             axes[0, 1].set_title("2. Cropped Plate")
-             axes[0, 1].axis('off')
-
-         # Plate with word detection
-         if 'word_bbox' in intermediate and 'plate_image' in intermediate:
-             plate_with_word = draw_bbox(
-                 intermediate['plate_image'].copy(),
-                 intermediate['word_bbox'],
-                 label=f"Conf: {result['confidence'].get('word_detection', 0):.2f}",
-                 color=(255, 0, 0)
-             )
-             axes[1, 0].imshow(cv2.cvtColor(plate_with_word, cv2.COLOR_BGR2RGB))
-             axes[1, 0].set_title("3. Word Detection")
-             axes[1, 0].axis('off')
-
-         # Masked plate
-         if 'masked_plate' in intermediate:
-             axes[1, 1].imshow(cv2.cvtColor(intermediate['masked_plate'], cv2.COLOR_BGR2RGB))
-             axes[1, 1].set_title("4. Masked for OCR")
-             axes[1, 1].axis('off')
-
-         plt.tight_layout()
-         plt.show()
-
-     except ImportError:
-         print("\nNote: Install matplotlib to see visualizations")
-         print("pip install matplotlib")
-
-
- def process_directory(directory_path: str):
-     """
-     Process all images in a directory.
-
-     Args:
-         directory_path: Path to directory containing images
-     """
-     directory = Path(directory_path)
-
-     # Find all image files
-     image_extensions = ['.jpg', '.jpeg', '.png', '.bmp']
-     image_files = []
-     for ext in image_extensions:
-         image_files.extend(directory.glob(f'*{ext}'))
-         image_files.extend(directory.glob(f'*{ext.upper()}'))
-
-     if not image_files:
-         print(f"No images found in {directory_path}")
-         return
-
-     print(f"\nFound {len(image_files)} images")
-
-     # Process each image
-     results = []
-     for image_path in image_files:
-         image = cv2.imread(str(image_path))
-         if image is None:
-             continue
-
-         pipeline = get_pipeline()
-         result = pipeline.process_full_pipeline(image)
-
-         results.append({
-             'filename': image_path.name,
-             'success': result['success'],
-             'text': result.get('text', ''),
-             'confidence': result.get('confidence', {}).get('overall', 0)
-         })
-
-         status = "✅" if result['success'] else "❌"
-         text = result.get('text', 'N/A')
-         print(f"{status} {image_path.name}: {text}")
-
-     # Summary
-     successful = sum(1 for r in results if r['success'])
-     print(f"\n{'='*60}")
-     print(f"Summary: {successful}/{len(results)} images processed successfully")
-     print(f"{'='*60}")
-
-
- def main():
-     """Main function."""
-     if len(sys.argv) < 2:
-         print("Usage:")
-         print(" python example_usage.py <image_path>")
-         print(" python example_usage.py <directory_path> --batch")
-         print("\nExamples:")
-         print(" python example_usage.py samples/0.jpg")
-         print(" python example_usage.py samples/ --batch")
-         return
-
-     path = sys.argv[1]
-
-     if len(sys.argv) > 2 and sys.argv[2] == '--batch':
-         # Process directory
-         process_directory(path)
-     else:
-         # Process single image
-         process_single_image(path, show_visualization=True)
-
-
- if __name__ == "__main__":
-     main()
-
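
With `example_usage.py` gone, the REST API documented in README.md is one way to drive the pipeline from code. The sketch below handles only the response side: it parses a `/detect-car` JSON body of the shape shown in the README (`success`, `bbox`, `confidence`) and rejects malformed or degenerate boxes. The function name `parse_detect_car` and the sample payload are hypothetical, and the HTTP request itself is left out because the client library is a matter of choice.

```python
import json
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def parse_detect_car(body: str) -> Optional[Tuple[Box, float]]:
    """Return (bbox, confidence) from a /detect-car JSON body, or None on failure."""
    data = json.loads(body)
    if not data.get("success"):
        return None
    bbox = data.get("bbox")
    if not isinstance(bbox, list) or len(bbox) != 4:
        return None
    x1, y1, x2, y2 = (float(v) for v in bbox)
    if x2 <= x1 or y2 <= y1:
        return None  # degenerate box: zero or negative width/height
    return (x1, y1, x2, y2), float(data.get("confidence", 0.0))


# Hypothetical payload in the documented response shape.
sample = '{"success": true, "bbox": [40, 60, 600, 420], "confidence": 0.87}'
print(parse_detect_car(sample))  # ((40.0, 60.0, 600.0, 420.0), 0.87)
```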