YOLO26X - ExecuTorch with XNNPACK (Dynamic Shapes)
YOLO26X exported to the ExecuTorch .pte format with the XNNPACK backend for accelerated CPU inference.
Model Details
- Base Model: Ultralytics YOLO26X-Object Detection
- Format: ExecuTorch (.pte)
- Backend: XNNPACK (CPU-optimized)
- Quantization: FP32
- File Size: 225.1 MB
Dynamic Shape Support
This model supports dynamic input shapes within the following constraints:
| Dimension | Min | Max | Constraint |
|---|---|---|---|
| Height | 320 | 8192 | Multiple of 32 |
| Width | 320 | 8192 | Multiple of 32 |
| Batch | 1 | 1 | Static |
Supported resolutions include 320×320, 640×640, 1280×1280, 2560×1440, and 7680×4320 (8K), as well as any other size whose height and width are both multiples of 32 between 320 and 8192.
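Arbitrary image sizes rarely satisfy these constraints directly. The following is a minimal sketch of how a target size could be derived from the table above; the helper names `snap_side` and `snap_size` are illustrative and not part of ExecuTorch or Ultralytics. Rounding up (rather than down) is an assumption that suits padding-based preprocessing.

```python
STRIDE = 32      # height and width must be multiples of this
MIN_SIDE = 320   # smallest supported side length
MAX_SIDE = 8192  # largest supported side length


def snap_side(side: int) -> int:
    """Round one side up to a multiple of STRIDE, clamped to the supported range."""
    snapped = -(-side // STRIDE) * STRIDE  # ceiling division, then scale back up
    return max(MIN_SIDE, min(MAX_SIDE, snapped))


def snap_size(height: int, width: int) -> tuple[int, int]:
    """Return a (height, width) pair that satisfies the dynamic-shape constraints."""
    return snap_side(height), snap_side(width)


print(snap_size(1080, 1920))  # (1088, 1920)
```

The image still has to be resized or padded to the returned size before it is passed to the model; which of the two to use is left to the caller.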
Usage
import torch
from executorch.runtime import Runtime

# Load the model
with open("yolo26x_dynamic_xnnpack.pte", "rb") as f:
    pte_buffer = f.read()

runtime = Runtime.get()
program = runtime.load_program(pte_buffer)
method = program.load_method("forward")

# Run inference with different input sizes
for h, w in [(640, 640), (1280, 1280), (2560, 1440)]:
    input_tensor = torch.randn(1, 3, h, w)
    output = method.execute([input_tensor])
    print(f"Input shape: {(h, w)}, Output shape: {output[0].shape}")
Model Architecture
YOLO26 is an end-to-end NMS-free object detector optimized for edge devices:
- End-to-end design (no NMS post-processing required)
- Up to 43% faster CPU inference than previous YOLO versions
- Optimized for mobile and edge deployment
Performance
Based on Ultralytics YOLO26 benchmarks:
| Metric | Value |
|---|---|
| Parameters | 98.9M |
| Input Size | 640×640 (training) |
| Inference | Supports 320-8192 px (multiples of 32) |
Tasks
This model performs object detection, outputting bounding boxes and class probabilities for 80 COCO classes.
Output shape: (1, 300, 6) where 300 is max detections and 6 is [x, y, w, h, confidence, class_id].
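Because the model is NMS-free, post-processing can be as small as a confidence filter. The sketch below assumes the row layout described above; the `Detection` type, the `decode_detections` helper, and the 0.25 default threshold are illustrative choices, not values taken from the model. It works on plain Python rows, for example `output[0][0].tolist()`.

```python
from typing import NamedTuple, Sequence


class Detection(NamedTuple):
    x: float
    y: float
    w: float
    h: float
    confidence: float
    class_id: int


def decode_detections(
    rows: Sequence[Sequence[float]], conf_threshold: float = 0.25
) -> list[Detection]:
    """Keep the rows of one image's (300, 6) output that meet the threshold."""
    detections = []
    for x, y, w, h, conf, cls in rows:
        if conf >= conf_threshold:
            detections.append(Detection(x, y, w, h, conf, int(cls)))
    return detections


# Two hypothetical rows standing in for real model output.
sample = [
    [320.0, 240.0, 100.0, 80.0, 0.91, 0.0],
    [50.0, 60.0, 20.0, 20.0, 0.03, 17.0],
]
print(decode_detections(sample))
```

With the default threshold, only the first sample row is kept.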
Troubleshooting
Low confidence / incorrect outputs with non-contiguous inputs
If your outputs look wrong (for object-detection models this can show up as all confidences capped at roughly 0.20 and no detections), make sure the input tensor passed to ExecuTorch is contiguous.
Example:
import torch
# img_hwc: float32 HWC image (e.g. RGB) in [0, 1]
x = torch.from_numpy(img_hwc).permute(2, 0, 1).unsqueeze(0) # NCHW (often non-contiguous)
x = x.contiguous() # IMPORTANT
outputs = method.execute([x])
Detection symptom example (before fix):
Confidence range: [0.0004, 0.2012]
Detections: 0
After fix (.contiguous()):
Confidence range: [0.0001, 0.9589]
Detections: 12
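To see why `permute` yields a non-contiguous tensor, it helps to track strides by hand. The sketch below is a simplified pure-Python model of row-major strides, not PyTorch's implementation, and all function names are illustrative. If the runtime reads the buffer as though it were densely packed NCHW (an assumption about the cause, not something confirmed here), the pixel values would be interpreted in the wrong order.

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, in elements, for a densely packed tensor."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def permute(shape, strides, order):
    """Reorder dimensions without moving data, as a view-style permute does."""
    return tuple(shape[i] for i in order), tuple(strides[i] for i in order)


def is_contiguous(shape, strides):
    """True if strides match a dense row-major layout (size-1 dims are ignored)."""
    expected = contiguous_strides(shape)
    return all(size == 1 or s == e for size, s, e in zip(shape, strides, expected))


hwc_shape = (640, 640, 3)
hwc_strides = contiguous_strides(hwc_shape)  # (1920, 3, 1)

# HWC -> CHW changes only the stride order; the data stays in HWC order.
chw_shape, chw_strides = permute(hwc_shape, hwc_strides, (2, 0, 1))
print(chw_shape, chw_strides)                 # (3, 640, 640) (1, 1920, 3)
print(is_contiguous(chw_shape, chw_strides))  # False

# A contiguous copy would have the dense strides for the CHW shape instead.
print(is_contiguous(chw_shape, contiguous_strides(chw_shape)))  # True
```

In PyTorch, `x.is_contiguous()` reports the same property and `x.contiguous()` makes the dense copy.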
License
This model is released under the AGPL-3.0 license. See Ultralytics YOLO26 for more details.
Credits
- Base Model: Ultralytics YOLO26
- Export Framework: ExecuTorch
- Backend: XNNPACK
Export Details
- ExecuTorch Version: Latest main branch
- Export Date: 2025-02-06
- Dynamic Shapes: Enabled (height/width: 320-8192, multiples of 32)