YOLO26X - ExecuTorch with XNNPACK (Dynamic Shapes)

YOLO26X exported to the ExecuTorch .pte format with the XNNPACK backend for accelerated CPU inference.

Model Details

  • Base Model: Ultralytics YOLO26X-Object Detection
  • Format: ExecuTorch (.pte)
  • Backend: XNNPACK (CPU-optimized)
  • Quantization: FP32
  • File Size: 225.1 MB

Dynamic Shape Support

This model supports dynamic input shapes within the following constraints:

| Dimension | Min | Max  | Constraint     |
|-----------|-----|------|----------------|
| Height    | 320 | 8192 | Multiple of 32 |
| Width     | 320 | 8192 | Multiple of 32 |
| Batch     | 1   | 1    | Static         |

Example supported resolutions: 320×320, 640×640, 1280×1280, 2560×1440, and 7680×4320 (8K). Any other size is accepted as long as both height and width are multiples of 32 between 320 and 8192.
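The constraints above can be checked before calling the model. The following is a minimal sketch, not part of the exported model or of ExecuTorch: the helper names `is_valid_side` and `snap_to_valid_side` are hypothetical, and rounding up (rather than to the nearest multiple) is an assumption chosen so that an image can be padded instead of cropped.

```python
# Limits taken from the dynamic shape table in this model card.
MIN_SIDE = 320
MAX_SIDE = 8192
STRIDE = 32


def is_valid_side(size: int) -> bool:
    """Return True if `size` is a legal height or width for this export."""
    return MIN_SIDE <= size <= MAX_SIDE and size % STRIDE == 0


def snap_to_valid_side(size: int) -> int:
    """Round `size` up to the next multiple of 32, clamped to [320, 8192].

    Hypothetical helper: rounding up is an assumption so that the original
    image fits inside the padded input.
    """
    snapped = -(-size // STRIDE) * STRIDE  # ceiling division, then scale back
    return max(MIN_SIDE, min(MAX_SIDE, snapped))


def model_input_shape(height: int, width: int) -> tuple:
    """Return the (batch, channels, height, width) shape to feed the model."""
    return (1, 3, snap_to_valid_side(height), snap_to_valid_side(width))
```

For example, a 1080×1920 (height×width) frame would need to be padded or resized to the shape returned by `model_input_shape(1080, 1920)` before inference.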

Usage

import torch
from executorch.runtime import Runtime

# Load the model
with open("yolo26x_dynamic_xnnpack.pte", "rb") as f:
    pte_buffer = f.read()

runtime = Runtime.get()
program = runtime.load_program(pte_buffer)
method = program.load_method("forward")

# Run inference with different input sizes, given as (height, width)
for h, w in [(640, 640), (1280, 1280), (1440, 2560)]:
    input_tensor = torch.randn(1, 3, h, w)
    output = method.execute([input_tensor])
    print(f"Input shape: {(h, w)}, Output shape: {output[0].shape}")

Model Architecture

YOLO26 is an end-to-end NMS-free object detector optimized for edge devices:

  • End-to-end design (no NMS post-processing required)
  • Up to 43% faster CPU inference than previous YOLO versions
  • Optimized for mobile and edge deployment

Performance

Based on Ultralytics YOLO26 benchmarks:

| Metric     | Value                                  |
|------------|----------------------------------------|
| Parameters | 98.9M                                  |
| Input Size | 640×640 (training)                     |
| Inference  | Supports 320-8192 px (multiples of 32) |

Tasks

This model performs object detection, outputting bounding boxes and class probabilities for 80 COCO classes.

Output shape: (1, 300, 6), where 300 is the maximum number of detections and each row of 6 values is [x, y, w, h, confidence, class_id].
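Because the model is NMS-free, post-processing can be as small as a confidence filter over the 300 rows. The sketch below assumes the row layout stated above and works on plain Python lists. The `Detection` type, the `filter_detections` helper, and the 0.25 default threshold are illustrative assumptions, not part of the model or of ExecuTorch.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    """One output row, using the layout described in this model card."""
    x: float
    y: float
    w: float
    h: float
    confidence: float
    class_id: int


def filter_detections(
    rows: Sequence[Sequence[float]],
    conf_threshold: float = 0.25,  # assumed default, tune for your data
) -> List[Detection]:
    """Keep rows whose confidence is at least `conf_threshold`."""
    detections = []
    for x, y, w, h, confidence, class_id in rows:
        if confidence >= conf_threshold:
            # class_id arrives as a float in the output tensor
            detections.append(Detection(x, y, w, h, confidence, int(class_id)))
    return detections
```

With the tensor returned by `method.execute`, the rows could be obtained with something like `rows = output[0][0].tolist()`, which drops the batch dimension and converts the (300, 6) tensor to nested lists.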

Troubleshooting

Low confidence / incorrect outputs with non-contiguous inputs

If your outputs look wrong (for object-detection models this can show up as all confidences capped around ~0.20 / 20% and no detections), ensure the input tensor passed to ExecuTorch is contiguous.

Example:

import torch

# img_hwc: float32 HWC image (e.g. RGB) in [0, 1]
x = torch.from_numpy(img_hwc).permute(2, 0, 1).unsqueeze(0)  # NCHW (often non-contiguous)
x = x.contiguous()  # IMPORTANT

outputs = method.execute([x])

Detection symptom example (before fix):

Confidence range: [0.0004, 0.2012]
Detections: 0

After fix (.contiguous()):

Confidence range: [0.0001, 0.9589]
Detections: 12
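To see why the permuted tensor in the example above is non-contiguous, it helps to look at strides. The following is a simplified pure-Python sketch of the stride arithmetic. The helper names are hypothetical, and the contiguity check is stricter than PyTorch's own `is_contiguous()`, which ignores the strides of size-1 dimensions.

```python
def contiguous_strides(shape):
    """Strides (in elements) of a densely packed row-major tensor."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def permute(shape, strides, order):
    """Reorder dimensions without moving data, as a tensor permute does."""
    return tuple(shape[i] for i in order), tuple(strides[i] for i in order)


def is_contiguous(shape, strides):
    """Simplified check: strides must match the row-major layout exactly."""
    return tuple(strides) == contiguous_strides(shape)


# A 480x640 RGB image stored as HWC, then viewed as CHW via permute(2, 0, 1).
hwc_shape = (480, 640, 3)
hwc_strides = contiguous_strides(hwc_shape)
chw_shape, chw_strides = permute(hwc_shape, hwc_strides, (2, 0, 1))

print(chw_shape, chw_strides)                 # strides still describe HWC memory
print(is_contiguous(chw_shape, chw_strides))  # False until the data is copied
```

The permuted view keeps the HWC memory layout, so a consumer that assumes row-major NCHW data reads the pixels in the wrong order. Calling `.contiguous()` copies the data into the layout that matches the new shape.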

License

This model is released under the AGPL-3.0 license. See Ultralytics YOLO26 for more details.

Credits

Export Details

  • ExecuTorch Version: Latest main branch
  • Export Date: 2025-02-06
  • Dynamic Shapes: Enabled (height/width: 320-8192, multiples of 32)