NEMESIS
Superpatch-based 3D Medical Image Self-Supervised Pretraining via Noise-Enhanced Dual-Masking
IEEE AICAS 2026
Overview
NEMESIS is a self-supervised pretraining framework for 3D CT volumes using:
- Superpatch processing (128³ sub-volumes): memory-efficient ViT pretraining
- Dual-masking (MATB): plane-wise (xy) plus axis-wise (z) masking, exploiting CT anisotropy
- NEMESIS Tokens (NTs): learnable tokens summarising visible patches via cross-attention
- Noise-enhanced reconstruction: Gaussian noise injection for regularisation
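The dual-masking idea can be illustrated with a minimal, standard-library-only sketch. It is not the NEMESIS implementation. The 8×8×8 patch grid (a 128³ superpatch cut into 16³ patches), the function names, the two separate ratios, and the rule that a patch is hidden when either its (y, x) position or its z index is selected are all assumptions made for illustration.

```python
import random


def dual_mask(grid=(8, 8, 8), plane_ratio=0.5, axis_ratio=0.5, seed=None):
    """Return mask[z][y][x]; True means the patch is hidden from the encoder.

    Assumed interpretation: plane-wise masking hides the same (y, x)
    positions in every slice, axis-wise masking hides whole z indices.
    """
    gz, gy, gx = grid
    rng = random.Random(seed)

    # Plane-wise (xy): choose in-plane positions to hide.
    positions = [(y, x) for y in range(gy) for x in range(gx)]
    hidden_xy = set(rng.sample(positions, int(round(plane_ratio * len(positions)))))

    # Axis-wise (z): choose whole slices to hide.
    hidden_z = set(rng.sample(range(gz), int(round(axis_ratio * gz))))

    # A patch is masked if either rule selects it (union of the two masks).
    return [
        [[(z in hidden_z) or ((y, x) in hidden_xy) for x in range(gx)]
         for y in range(gy)]
        for z in range(gz)
    ]


def masked_fraction(mask):
    """Fraction of patches marked True in a nested mask[z][y][x]."""
    cells = [cell for plane in mask for row in plane for cell in row]
    return sum(cells) / len(cells)
```

Under this union rule the effective mask ratio is 1 - (1 - plane_ratio)(1 - axis_ratio). For example, with both ratios at 0.25 on an 8×8×8 grid, 224 of 512 patches (0.4375) are hidden. How the released `mask_ratio` values map onto the two masks is not stated here.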
Key result (BTCV organ classification, frozen linear probe)
| Method | AUROC |
|---|---|
| NEMESIS (frozen) | 0.9633 |
| SuPreM (fine-tuned) | 0.9493 |
| VoCo (fine-tuned) | 0.9387 |
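For readers unfamiliar with the metric, the sketch below computes a binary AUROC as the probability that a positive sample is scored above a negative one, with ties counting as one half. It is only an illustration of the metric. How the reported numbers are aggregated across BTCV organ classes is not stated here, so any one-vs-rest averaging on top of this function would be an assumption.

```python
def binary_auroc(labels, scores):
    """AUROC for binary labels (1 = positive, 0 = negative).

    Uses the pairwise (Mann-Whitney) formulation: the fraction of
    positive/negative pairs in which the positive has the higher score.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))
```

For example, `binary_auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` returns 0.75, since three of the four positive/negative pairs are ranked correctly.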
Checkpoints
| File | embed_dim | depth | mask_ratio |
|---|---|---|---|
| MAE_768_0.5.pt | 768 | 6 | 0.5 |
| MAE_768_0.25.pt | 768 | 6 | 0.25 |
| MAE_768_0.75.pt | 768 | 6 | 0.75 |
| MAE_576_0.5.pt | 576 | 6 | 0.5 |
| MAE_384_0.5.pt | 384 | 6 | 0.5 |
| (others) | | | |
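The listed file names appear to follow the pattern `MAE_<embed_dim>_<mask_ratio>.pt`. Assuming that convention holds, a small helper can recover the settings from a file name. The helper and its names are hypothetical and not part of the NEMESIS code; checkpoints not listed above may use a different naming scheme.

```python
import re
from typing import NamedTuple


class CheckpointSpec(NamedTuple):
    embed_dim: int
    mask_ratio: float


# Assumed naming convention, inferred from the file names listed above.
_NAME_PATTERN = re.compile(r"^MAE_(\d+)_(\d+(?:\.\d+)?)\.pt$")


def parse_checkpoint_name(filename: str) -> CheckpointSpec:
    """Extract embed_dim and mask_ratio from a name like 'MAE_768_0.5.pt'."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {filename!r}")
    return CheckpointSpec(int(match.group(1)), float(match.group(2)))
```

The parsed `embed_dim` can then be passed to the model constructor so that the architecture matches the downloaded weights.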
Usage
pip install huggingface_hub
huggingface-cli download whilethis/NEMESIS MAE_768_0.5.pt --local-dir pretrained/
import torch
from nemesis.models.mae import MAEgic3DMAE
ckpt = torch.load("pretrained/MAE_768_0.5.pt", map_location="cpu")
model = MAEgic3DMAE(
    embed_dim=768, depth=6, num_heads=8,
    decoder_embed_dim=128, decoder_depth=3,
    num_maegic_tokens=8,
)
model.load_state_dict(ckpt["model_state_dict"])
encoder = model.encoder
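If only the encoder is needed downstream, its weights can be separated from the full checkpoint. Because the encoder is exposed as the `encoder` attribute of the model, PyTorch prefixes its parameters with `encoder.` in the state dict. The helper below is a hypothetical sketch that strips this prefix; it works on any mapping and does not depend on the NEMESIS package.

```python
from typing import Any, Dict


def encoder_state_dict(full_state: Dict[str, Any],
                       prefix: str = "encoder.") -> Dict[str, Any]:
    """Keep only entries under `prefix` and strip the prefix from their keys."""
    sub = {
        key[len(prefix):]: value
        for key, value in full_state.items()
        if key.startswith(prefix)
    }
    if not sub:
        raise KeyError(f"no keys start with {prefix!r}")
    return sub
```

For instance, `encoder_state_dict(ckpt["model_state_dict"])` should yield a mapping that can be passed to `load_state_dict` on a standalone encoder instance, provided that instance was built with the same hyperparameters as the checkpoint.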
Code
https://github.com/whilethis00/NEMESIS-public
Citation
@inproceedings{jung2026nemesis,
title = {{NEMESIS}: Superpatch-based 3{D} Medical Image Self-Supervised Pretraining via Noise-Enhanced Dual-Masking},
author = {Jung, Hyeonseok and others},
booktitle = {IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)},
year = {2026},
}