# BitMamba-2-255M
BitMamba-2-255M is the ultra-efficient baseline model of the BitMamba-2 family. It integrates 1.58-bit ternary quantization (BitNet) into the Mamba-2 architecture. Despite its small size, it demonstrates stable convergence and surprising reasoning capabilities, serving as the proof-of-concept for scaling ternary State Space Models.
## Key Features
- Architecture: Mamba-2 SSM + BitNet b1.58 (Ternary Weights).
- Parameters: 255M.
- Precision: 1.58-bit (weights {-1, 0, 1}).
- Training Data: high-quality open corpora (FineWeb-Edu, Cosmopedia, Stack-Dedup).
- Hardware: Trained on Google Cloud TPU v6e.
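The model card does not ship quantization code, but the BitNet b1.58 scheme named above is simple enough to sketch: scale each weight matrix by its mean absolute value, round, and clip to {-1, 0, +1}. A minimal NumPy sketch (function name and epsilon are ours, not from the BitMamba-2 source):

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-6):
    """Quantize a weight matrix to {-1, 0, +1} via absmean scaling
    (the BitNet b1.58 scheme). Returns the ternary weights and the
    scale needed to dequantize (w ~ w_q * gamma)."""
    gamma = np.abs(w).mean()                       # absmean scale
    w_q = np.clip(np.round(w / (gamma + eps)), -1, 1)
    return w_q.astype(np.int8), float(gamma)

w = np.random.randn(4, 4).astype(np.float32)
w_q, gamma = ternary_quantize(w)
assert set(np.unique(w_q)).issubset({-1, 0, 1})
```

Storing only a sign/zero pattern plus one scale per matrix is what makes the 1.58-bit footprint possible.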
## Benchmark Results
This model serves as the baseline for our scaling laws analysis.
| Benchmark | Metric | BitMamba-2-255M |
|---|---|---|
| ARC-Easy | Accuracy | 55.51% |
| PIQA | Accuracy | 64.42% |
| BoolQ | Accuracy | 59.30% |
| HellaSwag | Acc Norm | 35.22% |
| WikiText-2 | Perplexity | 51.69 |
In our scaling analysis, the 255M model establishes a stable learning trajectory, which the 1B model improves upon significantly.
## Usage (Inference)
This model is optimized for extreme edge deployment (IoT, Mobile, Legacy Hardware) using our custom C++ inference engine.
### 1. Download the Quantized Model
Download the `bitmamba_255m.bin` file from the Files tab of this repository.
### 2. Run with C++
Get the inference code from our GitHub repository, then run the compiled binary:

```shell
# Example usage after compiling bitmamba.cpp
# Note: a smaller context size is used for this speed demonstration
./bitmamba bitmamba_255m.bin "15496 11 314 716" 0.7 1.1 0.05 0.9 40 200
```
### 3. JAX/Flax Usage
The `bitmamba_255m.msgpack` file contains the raw JAX weights for research purposes. Load them with the source code provided in `src/` on GitHub.
## Efficient Deployment
Running on a consumer Intel Core i3-12100F CPU:
| Model | RAM Usage | Speed |
|---|---|---|
| BitMamba-2-255M | 252 MB | ~146 tok/s |
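As a sanity check on the RAM figure (our own back-of-the-envelope arithmetic, not from the model card): 255M ternary weights packed at 2 bits each would occupy roughly 64 MB, so the reported 252 MB also covers unpacked tensors, activations, and runtime state.

```python
# Back-of-the-envelope weight memory for 255M ternary parameters.
n_params = 255_000_000
bits_per_weight = 2            # {-1, 0, +1} fits in 2 bits when packed
packed_mb = n_params * bits_per_weight / 8 / 1e6
print(f"{packed_mb:.2f} MB")   # ~63.75 MB theoretical packed footprint
```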
## Citation
If you use this model or our architecture, please cite our paper:
```bibtex
@misc{salazar2026bitmamba2,
  author    = {Salazar, Jesus},
  title     = {BitMamba-2: Efficient Scaling of 1.58-bit State Space Models},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18394665},
  url       = {https://doi.org/10.5281/zenodo.18394665}
}
```
