SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations
Paper • 2512.14080
Large-scale distributed AI model training, model parallelisation, low-level GPU acceleration, make GPUs go brrrrr