# Fish Audio S2 Pro — FP8 (AEmotionStudio Mirror)
FP8 weight-only quantization of Fish Audio S2 Pro.
## Details
| Property | Value |
|---|---|
| Source model | fishaudio/s2-pro |
| Quantization | Per-row symmetric FP8 (float8_e4m3fn) |
| Linear layers quantized | 201 |
| FP8 params | 4.048B |
| BF16 params | 0.514B |
| Model size | 4.73 GB |
| VRAM requirement | ~12 GB |
## How it works
All `nn.Linear` weight matrices are quantized to `float8_e4m3fn` with per-row
float32 scale factors. Non-linear weights (embeddings, layer norms, codec) remain
in bfloat16. No external quantization library is needed — dequantization is pure PyTorch:

```python
W_bf16 = W_fp8.to(torch.bfloat16) * scale
```
## Usage with ComfyUI-FFMPEGA

This model is downloaded automatically by the ComfyUI-FFMPEGA extension and used for its TTS and voice-cloning features when FP8 precision is selected.
## License
Fish Audio Research License — see LICENSE file.
- ✅ Free for research and non-commercial use
- ❌ Commercial use requires a separate license from Fish Audio (contact: business@fish.audio)
Built with Fish Audio.