Object Detection

SSDLite MobileNetV3Large

Use case: Object detection

Model description

SSDLite MobileNetV3Large is a lightweight single-shot object detection model with a comparatively high-capacity backbone. It is optimized for real-time inference on mobile and edge devices and provides improved accuracy compared to the standard SSDLite MobileNetV2.

It combines the SSDLite framework with a MobileNetV3Large backbone. MobileNetV3Large is the wider MobileNetV3 variant: its inverted residual blocks use expanded widths, which gives stronger representational power while keeping the efficiency benefits of the MobileNetV3 design.
The SSDLite detection head predicts object locations and class probabilities in a single forward pass, making the model suitable for real-time detection on resource-constrained platforms, especially when higher accuracy is needed.

The ssdlite_mobilenetv3large_pt variant is implemented in PyTorch and targets edge and mobile deployments that need low latency, a reasonable memory footprint, and better accuracy.

Network information

| Network information | Value |
|---|---|
| Framework | Torch |
| Quantization | Int8 |
| Provenance | torchvision GitHub |
| Paper | SSDLite, MobileNetV3 |

The model is quantized to int8 using ONNX Runtime and exported for efficient deployment.
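To make the int8 step concrete, here is a minimal NumPy sketch of affine (scale and zero-point) int8 quantization of a tensor. It only illustrates the general idea; it is not the ONNX Runtime implementation and not the exact calibration procedure used for this model. The function names `affine_params`, `quantize_int8` and `dequantize_int8` are hypothetical helpers introduced for this example.

```python
import numpy as np

def affine_params(x_min, x_max):
    """Derive an int8 scale and zero-point from an observed value range.

    Assumption: the representable range is widened to include 0.0 so that
    zero maps exactly onto an integer.
    """
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / 255.0          # 256 int8 levels -> 255 steps
    if scale == 0.0:                          # degenerate all-zero range
        scale = 1.0
    zero_point = int(round(-128 - x_min / scale))
    return scale, zero_point

def quantize_int8(x, scale, zero_point):
    """Map float values to int8 with rounding and saturation."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, scale, zero_point):
    """Map int8 values back to approximate float values."""
    return (q.astype(np.float32) - zero_point) * scale

# Example: quantize a small tensor and inspect the round-trip error.
x = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)
scale, zero_point = affine_params(float(x.min()), float(x.max()))
q = quantize_int8(x, scale, zero_point)
x_hat = dequantize_int8(q, scale, zero_point)
print(q, x_hat)
```

The round-trip error of each in-range value is bounded by about half a quantization step (`scale / 2`), which is one reason a well-calibrated int8 model can stay close to its float accuracy.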

Network inputs / outputs

For an image resolution of NxM and NC classes:

| Input Shape | Description |
|---|---|
| (1, N, M, 3) | Single NxM RGB image with UINT8 values between 0 and 255 |

| Output Shape | Description |
|---|---|
| (1, 3000, 1+NC) and (1, 3000, 4) | The model returns two output tensors covering 3000 candidate boxes: the first holds the confidence for each class (plus the background class), and the second holds the bounding box coordinates (x1, y1, x2, y2) |
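The two outputs still need post-processing before they can be used as detections. The sketch below shows one plausible way to do it with NumPy: drop the background column, threshold the scores, and apply per-class non-maximum suppression. It rests on assumptions not stated above: the background class is taken to be at index 0, the scores are taken to be probabilities already, and the threshold defaults are illustrative. `postprocess`, `nms` and `iou_one_to_many` are hypothetical helper names.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one (x1, y1, x2, y2) box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def nms(boxes, scores, iou_thresh):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        ious = iou_one_to_many(boxes[i], boxes[rest])
        order = rest[ious <= iou_thresh]
    return keep

def postprocess(scores, boxes, score_thresh=0.5, iou_thresh=0.5,
                max_detections=100):
    """Turn (1, 3000, 1+NC) scores and (1, 3000, 4) boxes into detections.

    Returns a list of (class_index, score, [x1, y1, x2, y2]) tuples,
    sorted by decreasing score. class_index counts foreground classes only.
    """
    scores, boxes = scores[0], boxes[0]
    foreground = scores[:, 1:]               # assumption: background at index 0
    detections = []
    for c in range(foreground.shape[1]):
        mask = foreground[:, c] > score_thresh
        if not mask.any():
            continue
        class_boxes, class_scores = boxes[mask], foreground[mask, c]
        for k in nms(class_boxes, class_scores, iou_thresh):
            detections.append((c, float(class_scores[k]),
                               class_boxes[k].tolist()))
    detections.sort(key=lambda d: -d[1])
    return detections[:max_detections]
```

Running NMS per class keeps overlapping boxes of different classes, which is a common choice for SSD-style detectors; a class-agnostic variant would call `nms` once on the best score of each box instead.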

Recommended Platforms

| Platform | Supported | Recommended |
|---|---|---|
| STM32L0 | [] | [] |
| STM32L4 | [] | [] |
| STM32U5 | [] | [] |
| STM32H7 | [] | [] |
| STM32MP1 | [] | [] |
| STM32MP2 | [] | [] |
| STM32N6 | [x] | [x] |

Performances

Metrics

Measurements use the default STEdgeAI Core configuration with the input / output allocated option enabled.

Reference NPU memory footprint based on COCO dataset (see Accuracy for details on dataset)

| Model | Dataset | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STEdgeAI Core version |
|---|---|---|---|---|---|---|---|---|
| ssdlite_mobilenetv3large_pt | COCO | Int8 | 300x300x3 | STM32N6 | 2484.27 | 0 | 3592.83 | 3.0.0 |

Reference NPU inference time based on COCO dataset (see Accuracy for details on dataset)

| Model | Dataset | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STEdgeAI Core version |
|---|---|---|---|---|---|---|---|---|
| ssdlite_mobilenetv3large_pt | COCO | Int8 | 300x300x3 | STM32N6570-DK | NPU/MCU | 34.62 | 28.89 | 3.0.0 |

Reference NPU memory footprint based on COCO Person dataset (see Accuracy for details on dataset)

| Model | Dataset | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STEdgeAI Core version |
|---|---|---|---|---|---|---|---|---|
| ssdlite_mobilenetv3large_pt | COCO-Person | Int8 | 300x300x3 | STM32N6 | 2247.37 | 0 | 2592.98 | 3.0.0 |

Reference NPU inference time based on COCO Person dataset (see Accuracy for details on dataset)

| Model | Dataset | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STEdgeAI Core version |
|---|---|---|---|---|---|---|---|---|
| ssdlite_mobilenetv3large_pt | COCO-Person | Int8 | 300x300x3 | STM32N6570-DK | NPU/MCU | 31.45 | 31.80 | 3.0.0 |

Reference NPU memory footprint based on VOC dataset (see Accuracy for details on dataset)

| Model | Dataset | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STEdgeAI Core version |
|---|---|---|---|---|---|---|---|---|
| ssdlite_mobilenetv3large_pt | VOC | Int8 | 300x300x3 | STM32N6 | 2242.98 | 0 | 2833.48 | 3.0.0 |

Reference NPU inference time based on VOC dataset (see Accuracy for details on dataset)

| Model | Dataset | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STEdgeAI Core version |
|---|---|---|---|---|---|---|---|---|
| ssdlite_mobilenetv3large_pt | VOC | Int8 | 300x300x3 | STM32N6570-DK | NPU/MCU | 32.16 | 31.09 | 3.0.0 |
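The "Inf / sec" column in the inference-time tables above is simply the reciprocal of the inference time. The short sketch below reproduces that arithmetic from the reported values; `inferences_per_second` is a hypothetical helper name, and the numbers are copied from the tables.

```python
def inferences_per_second(latency_ms):
    """Convert a per-inference latency in milliseconds to a throughput."""
    return 1000.0 / latency_ms

# (inference time in ms, reported Inf / sec) from the tables above
reported = {
    "COCO": (34.62, 28.89),
    "COCO-Person": (31.45, 31.80),
    "VOC": (32.16, 31.09),
}

for dataset, (latency_ms, inf_per_sec) in reported.items():
    computed = inferences_per_second(latency_ms)
    print(f"{dataset}: {computed:.2f} inf/s (reported {inf_per_sec:.2f})")
```

This assumes back-to-back inferences with no other work between them, so a full application pipeline (capture, pre-processing, post-processing) will typically run at a lower frame rate.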

AP on COCO dataset

Dataset details: link, License CC BY 4.0, Number of classes: 80

| Model | Format | Resolution | AP50 |
|---|---|---|---|
| ssdlite_mobilenetv3large_pt | Float | 3x300x300 | 27.72 |
| ssdlite_mobilenetv3large_pt | Int8 | 3x300x300 | 26.78 |

* EVAL_IOU = 0.5, NMS_THRESH = 0.5, SCORE_THRESH = 0.001, MAX_DETECTIONS = 100

AP on COCO-Person dataset

Dataset details: link, License CC BY 4.0, Number of classes: 1

| Model | Format | Resolution | AP50 |
|---|---|---|---|
| ssdlite_mobilenetv3large_pt | Float | 3x300x300 | 39.24 |
| ssdlite_mobilenetv3large_pt | Int8 | 3x300x300 | 38.20 |

* EVAL_IOU = 0.5, NMS_THRESH = 0.5, SCORE_THRESH = 0.001, MAX_DETECTIONS = 100

AP on VOC dataset

Dataset details: link , License , Number of classes: 20

| Model | Format | Resolution | AP50 |
|---|---|---|---|
| ssdlite_mobilenetv3large_pt | Float | 3x300x300 | 65.84 |
| ssdlite_mobilenetv3large_pt | Int8 | 3x300x300 | 65.49 |

* EVAL_IOU = 0.5, NMS_THRESH = 0.5, SCORE_THRESH = 0.001, MAX_DETECTIONS = 100
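To put the Float and Int8 rows side by side, the sketch below computes the AP50 lost to quantization on each dataset, both in absolute points and relative to the float score. The values are copied from the accuracy tables above; `quantization_drop` is a hypothetical helper name.

```python
def quantization_drop(float_ap, int8_ap):
    """Return (absolute drop in AP50 points, relative drop in percent)."""
    absolute = float_ap - int8_ap
    relative = 100.0 * absolute / float_ap
    return absolute, relative

# (Float AP50, Int8 AP50) from the accuracy tables above
ap50 = {
    "COCO": (27.72, 26.78),
    "COCO-Person": (39.24, 38.20),
    "VOC": (65.84, 65.49),
}

for dataset, (float_ap, int8_ap) in ap50.items():
    absolute, relative = quantization_drop(float_ap, int8_ap)
    print(f"{dataset}: -{absolute:.2f} AP50 points ({relative:.2f}% relative)")
```

On these reported numbers the int8 model stays within roughly one AP50 point of the float model on all three datasets.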

Retraining and integration in a simple example

Please refer to the stm32ai-modelzoo-services GitHub repository.

