Image Classification

MobileNet v1

Use case: Image classification

Model description

MobileNet is a well-known architecture that can be used in multiple use cases. Two parameters adapt it to the complexity of a given use case: the input size and the width multiplier, called alpha. The alpha parameter increases or decreases the number of filters in each layer, which also reduces the number of multiply-adds and therefore the inference time.

The original paper demonstrates the performance of MobileNet models using alpha values of 1.0, 0.75, 0.5 and 0.25.

(source: https://keras.io/api/applications/mobilenet/)
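To make the effect of alpha concrete, here is a minimal pure-Python sketch (not the Keras implementation) of how a width multiplier scales the per-layer filter counts from the MobileNet v1 paper. The `BASE_FILTERS` list and the truncating rounding are simplifying assumptions:

```python
# Sketch: the width multiplier "alpha" scales every layer's filter count.
# Channel counts scale linearly with alpha, so the parameters and
# multiply-adds of the pointwise (1x1) convolutions scale roughly
# with alpha**2 -- which is why smaller alphas run so much faster.
BASE_FILTERS = [32, 64, 128, 128, 256, 256,
                512, 512, 512, 512, 512, 512, 1024, 1024]

def scaled_filters(alpha):
    # Assumed behavior: each base filter count is multiplied by alpha.
    return [int(f * alpha) for f in BASE_FILTERS]

for alpha in (1.0, 0.75, 0.5, 0.25):
    widths = scaled_filters(alpha)
    print(alpha, widths[0], widths[-1])  # first conv and last layer widths
```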

The model is quantized to int8 using the TensorFlow Lite converter.
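The int8 scheme used by the TensorFlow Lite converter is an affine mapping, real ≈ scale × (q − zero_point), with q clamped to [−128, 127]. A minimal sketch of that arithmetic, with hypothetical calibration values:

```python
# Sketch of int8 affine quantization: real ~= scale * (q - zero_point).
def quantize(x, scale, zero_point):
    # Round to the nearest integer grid point, then clamp to int8 range.
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

scale, zp = 0.02, -5            # hypothetical values from calibration
q = quantize(1.0, scale, zp)
print(q, dequantize(q, scale, zp))   # -> 45 1.0
print(quantize(100.0, scale, zp))    # saturates at 127
```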

Network information

| Network Information | Value |
|---------------------|-------|
| Framework           | TensorFlow Lite |
| MParams (alpha=1.0) | 1.3 M |
| Quantization        | int8 |
| Provenance          | https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet |
| Paper               | https://arxiv.org/abs/1704.04861 |

Network inputs / outputs

For an image resolution of NxM and P classes:

| Input Shape  | Description |
|--------------|-------------|
| (1, N, M, 3) | Single NxM RGB image with UINT8 values between 0 and 255 |

| Output Shape | Description |
|--------------|-------------|
| (1, P)       | Per-class confidence for P classes in FLOAT32 |
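Post-processing the (1, P) FLOAT32 output amounts to taking the arg-max over the class scores. A minimal sketch, using the five Flowers class names as an assumed label list and a hypothetical model output:

```python
# Sketch: map the (1, P) FLOAT32 output to a class label via arg-max.
CLASS_NAMES = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]  # assumed P = 5

def top1(scores):
    # Index of the highest per-class confidence.
    idx = max(range(len(scores)), key=lambda i: scores[i])
    return CLASS_NAMES[idx], scores[idx]

output = [[0.02, 0.05, 0.80, 0.03, 0.10]]  # hypothetical output, shape (1, 5)
label, conf = top1(output[0])
print(label, conf)  # -> roses 0.8
```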

Recommended platforms

| Platform | Supported | Recommended |
|----------|-----------|-------------|
| STM32L0  | []        | []          |
| STM32L4  | [x]       | []          |
| STM32U5  | [x]       | []          |
| STM32H7  | [x]       | [x]         |
| STM32MP1 | [x]       | [x]         |
| STM32MP2 | [x]       | [x]         |
| STM32N6  | [x]       | [x]         |

Performances

Metrics

  • Measurements are done with the default STM32Cube.AI configuration, with the input / output allocated option enabled.
  • tfs stands for "training from scratch", meaning that the model weights were randomly initialized before training.
  • tl stands for "transfer learning", meaning that the model backbone weights were initialized from a pre-trained model, then only the last layer was unfrozen during the training.
  • fft stands for "full fine-tuning", meaning that the full model weights were initialized from a transfer learning pre-trained model, and all the layers were unfrozen during the training.

Reference NPU memory footprint on the food-101 and ImageNet datasets (see Accuracy for details on the datasets)

| Model | Dataset | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STM32Cube.AI version | STEdgeAI Core version |
|-------|---------|--------|------------|--------|--------------------|--------------------|---------------------|----------------------|-----------------------|
| MobileNet v1 0.25 fft | food-101 | Int8 | 224x224x3 | STM32N6 | 588  | 0.0 | 304.72  | 10.2.0 | 2.2.0 |
| MobileNet v1 0.5 fft  | food-101 | Int8 | 224x224x3 | STM32N6 | 588  | 0.0 | 992.67  | 10.2.0 | 2.2.0 |
| MobileNet v1 1.0 fft  | food-101 | Int8 | 224x224x3 | STM32N6 | 1568 | 0.0 | 3602.97 | 10.2.0 | 2.2.0 |
| MobileNet v1 0.25     | ImageNet | Int8 | 224x224x3 | STM32N6 | 588  | 0.0 | 533.38  | 10.2.0 | 2.2.0 |
| MobileNet v1 0.5      | ImageNet | Int8 | 224x224x3 | STM32N6 | 588  | 0.0 | 1446.06 | 10.2.0 | 2.2.0 |
| MobileNet v1 1.0      | ImageNet | Int8 | 224x224x3 | STM32N6 | 1568 | 0.0 | 4505.86 | 10.2.0 | 2.2.0 |

Reference NPU inference time on the food-101 and ImageNet datasets (see Accuracy for details on the datasets)

| Model | Dataset | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STM32Cube.AI version | STEdgeAI Core version |
|-------|---------|--------|------------|-------|------------------|---------------------|-----------|----------------------|-----------------------|
| MobileNet v1 0.25 fft | food-101 | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 2.81  | 355.87 | 10.2.0 | 2.2.0 |
| MobileNet v1 0.5 fft  | food-101 | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 6.03  | 165.83 | 10.2.0 | 2.2.0 |
| MobileNet v1 1.0 fft  | food-101 | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 16.79 | 59.55  | 10.2.0 | 2.2.0 |
| MobileNet v1 0.25     | ImageNet | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 3.56  | 280.89 | 10.2.0 | 2.2.0 |
| MobileNet v1 0.5      | ImageNet | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 7.35  | 136.05 | 10.2.0 | 2.2.0 |
| MobileNet v1 1.0      | ImageNet | Int8 | 224x224x3 | STM32N6570-DK | NPU/MCU | 19.26 | 51.93  | 10.2.0 | 2.2.0 |

Reference MCU memory footprint based on the Flowers and ImageNet datasets (see Accuracy for details on the datasets)

| Model | Format | Resolution | Series | Activation RAM (KiB) | Runtime RAM (KiB) | Weights Flash (KiB) | Code Flash (KiB) | Total RAM (KiB) | Total Flash (KiB) | STM32Cube.AI version |
|-------|--------|------------|--------|----------------------|-------------------|---------------------|------------------|-----------------|-------------------|----------------------|
| MobileNet v1 0.25 fft | Int8 | 224x224x3 | STM32H7 | 272.96  | 16.38 | 214.69  | 67.24  | 289.34  | 281.93  | 10.2.0 |
| MobileNet v1 0.5 fft  | Int8 | 224x224x3 | STM32H7 | 449.58  | 16.38 | 812.61  | 80.61  | 465.96  | 893.22  | 10.2.0 |
| MobileNet v1 0.25 fft | Int8 | 96x96x3   | STM32H7 | 66.96   | 16.33 | 214.69  | 67.19  | 83.29   | 281.88  | 10.2.0 |
| MobileNet v1 0.25 tfs | Int8 | 96x96x1   | STM32H7 | 52.8    | 16.33 | 214.55  | 69.28  | 69.13   | 283.83  | 10.2.0 |
| MobileNet v1 0.25     | Int8 | 224x224x3 | STM32H7 | 267.2   | 16.44 | 467.33  | 68.37  | 283.64  | 535.7   | 10.2.0 |
| MobileNet v1 0.5      | Int8 | 224x224x3 | STM32H7 | 404.28  | 16.44 | 1314    | 81.72  | 447.51  | 1395.72 | 10.2.0 |
| MobileNet v1 1.0      | Int8 | 224x224x3 | STM32H7 | 1331.13 | 16.48 | 4157.09 | 108.46 | 1347.61 | 4265.55 | 10.2.0 |

Reference MCU inference time based on the Flowers and ImageNet datasets (see Accuracy for details on the datasets)

| Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time (ms) | STM32Cube.AI version |
|-------|--------|------------|-------|------------------|-----------|---------------------|----------------------|
| MobileNet v1 0.25 fft | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 166.9   | 10.2.0 |
| MobileNet v1 0.5 fft  | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 471.68  | 10.2.0 |
| MobileNet v1 0.25 fft | Int8 | 96x96x3   | STM32H747I-DISCO | 1 CPU | 400 MHz | 30.63   | 10.2.0 |
| MobileNet v1 0.25 tfs | Int8 | 96x96x1   | STM32H747I-DISCO | 1 CPU | 400 MHz | 29.04   | 10.2.0 |
| MobileNet v1 0.25     | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 170.37  | 10.2.0 |
| MobileNet v1 0.5      | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 477.79  | 10.2.0 |
| MobileNet v1 1.0      | Int8 | 224x224x3 | STM32H747I-DISCO | 1 CPU | 400 MHz | 1656.41 | 10.2.0 |

Reference MPU inference time based on the Flowers dataset (see Accuracy for details on the dataset)

| Model | Format | Resolution | Quantization | Board | Execution Engine | Frequency | Inference time (ms) | %NPU | %GPU | %CPU | X-LINUX-AI version | Framework |
|-------|--------|------------|--------------|-------|------------------|-----------|---------------------|------|------|------|--------------------|-----------|
| MobileNet v1 0.25 fft | Int8 | 224x224x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz  | 14.27  | 7.54  | 92.46 | 0   | v6.1.0 | OpenVX |
| MobileNet v1 0.5 fft  | Int8 | 224x224x3 | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz  | 32.79  | 3.83  | 96.17 | 0   | v6.1.0 | OpenVX |
| MobileNet v1 0.25 fft | Int8 | 96x96x3   | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz  | 3.81   | 15.36 | 84.64 | 0   | v6.1.0 | OpenVX |
| MobileNet v1 0.25 tfs | Int8 | 96x96x1   | per-channel** | STM32MP257F-DK2 | NPU/GPU | 800 MHz  | 3.66   | 13.91 | 86.09 | 0   | v6.1.0 | OpenVX |
| MobileNet v1 0.25 fft | Int8 | 224x224x3 | per-channel   | STM32MP157F-DK2 | 2 CPU   | 800 MHz  | 33.91  | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |
| MobileNet v1 0.5 fft  | Int8 | 224x224x3 | per-channel   | STM32MP157F-DK2 | 2 CPU   | 800 MHz  | 90.6   | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |
| MobileNet v1 0.25 fft | Int8 | 96x96x3   | per-channel   | STM32MP157F-DK2 | 2 CPU   | 800 MHz  | 6.32   | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |
| MobileNet v1 0.25 tfs | Int8 | 96x96x1   | per-channel   | STM32MP157F-DK2 | 2 CPU   | 800 MHz  | 5.83   | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |
| MobileNet v1 0.25 fft | Int8 | 224x224x3 | per-channel   | STM32MP135F-DK2 | 1 CPU   | 1000 MHz | 52.39  | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |
| MobileNet v1 0.5 fft  | Int8 | 224x224x3 | per-channel   | STM32MP135F-DK2 | 1 CPU   | 1000 MHz | 144.47 | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |
| MobileNet v1 0.25 fft | Int8 | 96x96x3   | per-channel   | STM32MP135F-DK2 | 1 CPU   | 1000 MHz | 9.31   | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |
| MobileNet v1 0.25 tfs | Int8 | 96x96x1   | per-channel   | STM32MP135F-DK2 | 1 CPU   | 1000 MHz | 9.37   | NA    | NA    | 100 | v6.1.0 | TensorFlowLite 2.18.0 |

** To get the most out of the STM32MP25 NPU hardware acceleration, use per-tensor quantization.
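The difference between the two quantization granularities mentioned above can be sketched in a few lines: per-tensor quantization uses a single scale for the whole weight tensor, while per-channel quantization computes one scale per output channel. The weights and the symmetric-scale formula below are illustrative assumptions:

```python
# Sketch: per-tensor vs. per-channel symmetric int8 scales.
weights = [[0.5, -0.25, 0.1],      # output channel 0: large values
           [0.01, 0.02, -0.015]]   # output channel 1: small values

def scale_for(values):
    # Symmetric int8 scale: map the maximum magnitude onto 127.
    return max(abs(v) for v in values) / 127.0

# One scale shared across the whole tensor (per-tensor) ...
per_tensor_scale = scale_for([v for row in weights for v in row])
# ... versus one scale per output channel (per-channel).
per_channel_scales = [scale_for(row) for row in weights]

print(per_tensor_scale)    # dominated by the largest channel
print(per_channel_scales)  # the small-valued channel keeps a finer scale
```

Per-channel scales preserve accuracy for channels with small weights, but some NPUs (such as the STM32MP25's) run fastest with a single per-tensor scale, which is the trade-off the footnote points to.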

Accuracy with Flowers dataset

Dataset details: link, License: CC BY 2.0, Quotation: [1], Number of classes: 5, Number of images: 3,670

| Model | Format | Resolution | Top 1 Accuracy |
|-------|--------|------------|----------------|
| MobileNet v1 0.25 tfs | Float | 224x224x3 | 88.83 % |
| MobileNet v1 0.25 tfs | Int8  | 224x224x3 | 89.37 % |
| MobileNet v1 0.25 tl  | Float | 224x224x3 | 85.83 % |
| MobileNet v1 0.25 tl  | Int8  | 224x224x3 | 83.24 % |
| MobileNet v1 0.25 fft | Float | 224x224x3 | 93.05 % |
| MobileNet v1 0.25 fft | Int8  | 224x224x3 | 92.1 %  |
| MobileNet v1 0.5 tfs  | Float | 224x224x3 | 92.1 %  |
| MobileNet v1 0.5 tfs  | Int8  | 224x224x3 | 91.55 % |
| MobileNet v1 0.5 tl   | Float | 224x224x3 | 88.56 % |
| MobileNet v1 0.5 tl   | Int8  | 224x224x3 | 87.74 % |
| MobileNet v1 0.5 fft  | Float | 224x224x3 | 95.1 %  |
| MobileNet v1 0.5 fft  | Int8  | 224x224x3 | 94.41 % |
| MobileNet v1 0.25 fft | Float | 96x96x3   | 87.47 % |
| MobileNet v1 0.25 fft | Int8  | 96x96x3   | 87.06 % |
| MobileNet v1 0.25 tfs | Float | 96x96x1   | 74.93 % |
| MobileNet v1 0.25 tfs | Int8  | 96x96x1   | 74.93 % |

Accuracy with Plant-village dataset

Dataset details: link, License: CC0 1.0, Quotation: [2], Number of classes: 39, Number of images: 61,486

| Model | Format | Resolution | Top 1 Accuracy |
|-------|--------|------------|----------------|
| MobileNet v1 0.25 tfs | Float | 224x224x3 | 99.92 % |
| MobileNet v1 0.25 tfs | Int8  | 224x224x3 | 99.92 % |
| MobileNet v1 0.25 tl  | Float | 224x224x3 | 85.38 % |
| MobileNet v1 0.25 tl  | Int8  | 224x224x3 | 83.7 %  |
| MobileNet v1 0.25 fft | Float | 224x224x3 | 99.95 % |
| MobileNet v1 0.25 fft | Int8  | 224x224x3 | 99.82 % |
| MobileNet v1 0.5 tfs  | Float | 224x224x3 | 99.9 %  |
| MobileNet v1 0.5 tfs  | Int8  | 224x224x3 | 99.83 % |
| MobileNet v1 0.5 tl   | Float | 224x224x3 | 93.05 % |
| MobileNet v1 0.5 tl   | Int8  | 224x224x3 | 92.7 %  |
| MobileNet v1 0.5 fft  | Float | 224x224x3 | 99.94 % |
| MobileNet v1 0.5 fft  | Int8  | 224x224x3 | 99.85 % |

Accuracy with Food-101 dataset

Dataset details: link, Quotation: [3], Number of classes: 101, Number of images: 101,000

| Model | Format | Resolution | Top 1 Accuracy |
|-------|--------|------------|----------------|
| MobileNet v1 0.25 tfs | Float | 224x224x3 | 72.16 % |
| MobileNet v1 0.25 tfs | Int8  | 224x224x3 | 71.13 % |
| MobileNet v1 0.25 tl  | Float | 224x224x3 | 43.21 % |
| MobileNet v1 0.25 tl  | Int8  | 224x224x3 | 39.89 % |
| MobileNet v1 0.25 fft | Float | 224x224x3 | 72.36 % |
| MobileNet v1 0.25 fft | Int8  | 224x224x3 | 69.52 % |
| MobileNet v1 0.5 tfs  | Float | 224x224x3 | 76.97 % |
| MobileNet v1 0.5 tfs  | Int8  | 224x224x3 | 76.37 % |
| MobileNet v1 0.5 tl   | Float | 224x224x3 | 48.78 % |
| MobileNet v1 0.5 tl   | Int8  | 224x224x3 | 45.89 % |
| MobileNet v1 0.5 fft  | Float | 224x224x3 | 76.72 % |
| MobileNet v1 0.5 fft  | Int8  | 224x224x3 | 74.82 % |
| MobileNet v1 1.0 fft  | Float | 224x224x3 | 80.38 % |
| MobileNet v1 1.0 fft  | Int8  | 224x224x3 | 79.43 % |

Accuracy with ImageNet dataset

Dataset details: link, Quotation: [4], Number of classes: 1000. To perform the quantization, we calibrated the activations with a random subset of the training set. For the sake of simplicity, the accuracy reported here was estimated on the 50,000 labelled images of the validation set.

| Model | Format | Resolution | Top 1 Accuracy |
|-------|--------|------------|----------------|
| MobileNet v1 0.25 | Float | 224x224x3 | 48.96 % |
| MobileNet v1 0.25 | Int8  | 224x224x3 | 46.34 % |
| MobileNet v1 0.5  | Float | 224x224x3 | 62.11 % |
| MobileNet v1 0.5  | Int8  | 224x224x3 | 59.92 % |
| MobileNet v1 1.0  | Float | 224x224x3 | 69.52 % |
| MobileNet v1 1.0  | Int8  | 224x224x3 | 68.64 % |

Retraining and integration in a simple example

Please refer to the stm32ai-modelzoo-services GitHub repository here.

References

[1] "Tf_flowers : tensorflow datasets," TensorFlow. [Online]. Available: https://www.tensorflow.org/datasets/catalog/tf_flowers.

[2] J. Arun Pandian and G. Geetharamani, "Data for: Identification of Plant Leaf Diseases Using a 9-layer Deep Convolutional Neural Network," Mendeley Data, V1, 2019. doi: 10.17632/tywbtsjrjv.1

[3] L. Bossard, M. Guillaumin, and L. Van Gool, "Food-101 -- Mining Discriminative Components with Random Forests," European Conference on Computer Vision, 2014.

[4] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
