# Bamboo-gemma3-12b-gguF
This model is fine-tuned on the Reverend Insanity universe. In my own testing, Q8_0 gives the best results; the other quants are noticeably weaker. This repository contains GGUF-quantized versions of a fine-tuned Gemma 3 12B model.
## Model Details
- Base Model: google/gemma-3-12b-it
- Fine-tuning: Custom instruction tuning
- Quantization: Multiple GGUF formats for different use cases
## Available Quantizations
| Quantization | File Size | Use Case |
|---|---|---|
| Q8_0 | ~13 GB | Highest quality, minimal loss |
| Q5_K_M | ~8.5 GB | Excellent quality/size balance (RECOMMENDED) |
| Q4_K_M | ~7 GB | Good quality, smaller size, most popular |
## Usage

### With llama.cpp
```bash
# Download model
huggingface-cli download rihuwa/Bamboo-gemma3-12b-gguF gemma-3-12b-merged-Q5_K_M.gguf --local-dir ./models

# Run inference
./llama-cli -m ./models/gemma-3-12b-merged-Q5_K_M.gguf -p "Your prompt here" -n 256
```
### With Ollama
```bash
# Create a Modelfile
echo 'FROM ./gemma-3-12b-merged-Q5_K_M.gguf' > Modelfile

# Create the model
ollama create my-gemma3-model -f Modelfile

# Run it
ollama run my-gemma3-model
```
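The one-line Modelfile above is enough to get started, but Ollama Modelfiles also accept sampling parameters and a system prompt. A sketch with illustrative (not tuned) values and a hypothetical system prompt:

```
FROM ./gemma-3-12b-merged-Q5_K_M.gguf

# Illustrative sampling settings -- tune for your use case
PARAMETER temperature 0.8
PARAMETER num_ctx 8192

# Hypothetical system prompt matching the fine-tune's domain
SYSTEM "You are a storyteller versed in the Reverend Insanity universe."
```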
### With LM Studio, GPT4All, etc.
Simply download the GGUF file and load it in your preferred application.
## Quantization Details
- Q8_0: 8-bit quantization, highest quality
- Q5_K_M: 5-bit with K-quant method, medium size
- Q4_K_M: 4-bit with K-quant method, smaller size
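The file sizes in the table above follow from simple bits-per-weight arithmetic: parameters × effective bits per weight ÷ 8. A rough sketch, where the effective bits-per-weight figures are approximations (K-quants mix bit widths across tensors, so the exact value varies by model):

```python
# Rough GGUF file-size estimate from bits-per-weight arithmetic.
PARAMS = 12e9  # Gemma 3 12B

# Effective bits per weight -- approximate values, assumed for illustration
EFFECTIVE_BPW = {
    "Q8_0": 8.5,     # 8-bit weights plus per-block scales
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.85,
}

def estimated_size_gb(params: float, bpw: float) -> float:
    """Estimated file size in GB: params * bits-per-weight / 8 bits-per-byte."""
    return params * bpw / 8 / 1e9

for quant, bpw in EFFECTIVE_BPW.items():
    print(f"{quant}: ~{estimated_size_gb(PARAMS, bpw):.1f} GB")
```

This reproduces the ballpark figures in the table (~13 GB, ~8.5 GB, ~7 GB); actual file sizes also include embeddings, output layers, and metadata, so expect some deviation.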
## License
This model inherits the Gemma license from the base model.
## Citation

```bibtex
@misc{Bamboo_gemma3_12b_gguF,
  author    = {rihuwa},
  title     = {Bamboo-gemma3-12b-gguF},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/rihuwa/Bamboo-gemma3-12b-gguF}
}
```