DeepSeek-MoE
Adopting the BF16 conversion and imatrix from unsloth/DeepSeek-V3.1-Terminus-GGUF. (Huge fan of unsloth.)
A personalized replication of low-bit mixed-precision quantization using the --tensor-type option in llama.cpp.
IQ1_S uses a more dynamic tensor-type mix for extreme compression.
- IQ1_S : 137.66 GiB (1.76 BPW)
- IQ1_M : 151.25 GiB (1.94 BPW)
- Q2_K_L : 231.55 GiB (2.96 BPW)
- Q4_K_M : 376.89 GiB (4.82 BPW)
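A mix like the ones above can be produced with llama-quantize by overriding the type of individual tensor groups. This is a sketch assuming a recent llama.cpp build whose llama-quantize supports --tensor-type; the file names, tensor patterns, and target types are illustrative, not the exact recipe behind these files.

```shell
# Illustrative mixed-precision pass (hypothetical file names and overrides):
# keep attention K/V tensors at higher precision while the base type is IQ1_S.
./llama-quantize \
    --imatrix imatrix.dat \
    --tensor-type attn_k=q4_k \
    --tensor-type attn_v=q6_k \
    DeepSeek-V3.1-Terminus-BF16.gguf \
    DeepSeek-V3.1-Terminus-IQ1_S.gguf \
    IQ1_S
```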
# !pip install huggingface_hub hf_transfer
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bobchenyx/DeepSeek-V3.1-Terminus-GGUF",
    local_dir="bobchenyx/DeepSeek-V3.1-Terminus-GGUF",
    allow_patterns=["*IQ1_M*"],  # or "*IQ1_S*", "*Q2_K_L*", "*Q4_K_M*"
)
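As a sanity check on the sizes listed above, bits per weight (BPW) can be recomputed from file size and parameter count. The ~671B total-parameter figure for DeepSeek-V3.1 is an assumption here, not something stated in this card.

```python
# Recompute bits per weight (BPW) from an on-disk size in GiB.
# Assumes ~671e9 total parameters for DeepSeek-V3.1 (not stated in this card).
def bpw(size_gib: float, n_params: float = 671e9) -> float:
    return size_gib * 1024**3 * 8 / n_params

for name, gib in [("IQ1_S", 137.66), ("IQ1_M", 151.25),
                  ("Q2_K_L", 231.55), ("Q4_K_M", 376.89)]:
    print(f"{name}: {bpw(gib):.2f} BPW")  # matches the listed BPW values
```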
Base model: deepseek-ai/DeepSeek-V3.1-Base