Bamboo-gemma3-12b-gguF

This model is fine-tuned on the Reverend Insanity universe. In my own testing, Q8_0 gives the best results; the other quants are noticeably weaker. This repository contains GGUF-quantized versions of a fine-tuned Gemma 3 12B model.

Model Details

  • Base Model: google/gemma-3-12b-it
  • Fine-tuning: Custom instruction tuning
  • Quantization: Multiple GGUF formats for different use cases

Available Quantizations

| Quantization | File Size | Use Case |
|--------------|-----------|----------|
| Q8_0 | ~13 GB | Highest quality, minimal loss |
| Q5_K_M | ~8.5 GB | Excellent quality/size balance (recommended) |
| Q4_K_M | ~7 GB | Good quality, smaller size, most popular |
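The file sizes above follow directly from the parameter count and each format's approximate bits per weight. A rough sketch of that arithmetic (the bits-per-weight figures are approximate llama.cpp averages, not exact values for this file):

```python
# Estimate GGUF file size from parameter count and bits per weight (bpw).
# bpw values are approximate averages for these llama.cpp quant formats.
PARAMS = 12e9
BPW = {"Q8_0": 8.5, "Q5_K_M": 5.69, "Q4_K_M": 4.85}

for quant, bpw in BPW.items():
    size_gb = PARAMS * bpw / 8 / 1e9  # bits -> bytes -> GB
    print(f"{quant}: ~{size_gb:.1f} GB")
```

This reproduces roughly the sizes listed in the table; real files are slightly larger because of metadata and the embedding/output layers, which may use a different precision.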

Usage

With llama.cpp

```shell
# Download model
huggingface-cli download rihuwa/Bamboo-gemma3-12b-gguF gemma-3-12b-merged-Q5_K_M.gguf --local-dir ./models

# Run inference
./llama-cli -m ./models/gemma-3-12b-merged-Q5_K_M.gguf -p "Your prompt here" -n 256
```
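Gemma-family models expect a specific chat-turn format; recent llama.cpp builds apply it automatically in conversation mode, but if you pass a raw prompt with `-p` you may need to wrap it yourself. A minimal sketch, assuming the standard Gemma 3 template (taken from the base model, not verified against this fine-tune):

```python
# Build a single-turn prompt in the Gemma chat format.
# Template assumed from the google/gemma-3-12b-it base model.
def gemma_prompt(user_msg: str) -> str:
    return (
        f"<start_of_turn>user\n{user_msg}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(gemma_prompt("Your prompt here"))
```

The resulting string can be passed to `-p` in place of the bare prompt.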

With Ollama

```shell
# Create Modelfile
echo 'FROM ./gemma-3-12b-merged-Q5_K_M.gguf' > Modelfile

# Create model
ollama create my-gemma3-model -f Modelfile

# Run
ollama run my-gemma3-model
```
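For more control, the Modelfile can also set a system prompt and sampling parameters. The values below are illustrative defaults, not settings tuned for this fine-tune:

```
FROM ./gemma-3-12b-merged-Q5_K_M.gguf

# Illustrative values, not tuned for this model
PARAMETER temperature 0.8
PARAMETER num_ctx 4096
SYSTEM "You are a storyteller versed in the Reverend Insanity universe."
```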

With LM Studio, GPT4All, etc.

Simply download the GGUF file and load it in your preferred application.

Quantization Details

  • Q8_0: 8-bit quantization, highest quality
  • Q5_K_M: 5-bit with K-quant method, medium size
  • Q4_K_M: 4-bit with K-quant method, smaller size

License

This model inherits the Gemma license from the base model.

Citation

@misc{Bamboo_gemma3_12b_gguF,
  author = {rihuwa},
  title = {Bamboo-gemma3-12b-gguF},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/rihuwa/Bamboo-gemma3-12b-gguF}
}