# Bamboo-gemma3-12b-gguF
This model is fine-tuned on the Reverend Insanity universe. In my own testing, Q8_0 gives the best results; the other quants are noticeably weaker. This repository contains GGUF-quantized versions of a fine-tuned Gemma 3 12B model.
## Model Details
- Base Model: google/gemma-3-12b-it
- Fine-tuning: Custom instruction tuning
- Quantization: Multiple GGUF formats for different use cases
## Available Quantizations
| Quantization | File Size | Use Case |
|---|---|---|
| Q8_0 | ~13 GB | Highest quality, minimal loss |
| Q5_K_M | ~8.5 GB | Excellent quality/size balance (RECOMMENDED) |
| Q4_K_M | ~7 GB | Good quality, smaller size, most popular |
## Usage

### With llama.cpp
```bash
# Download model
huggingface-cli download rihuwa/Bamboo-gemma3-12b-gguF gemma-3-12b-merged-Q5_K_M.gguf --local-dir ./models

# Run inference
./llama-cli -m ./models/gemma-3-12b-merged-Q5_K_M.gguf -p "Your prompt here" -n 256
```
### With Ollama
```bash
# Create a Modelfile
echo 'FROM ./gemma-3-12b-merged-Q5_K_M.gguf' > Modelfile

# Create the model
ollama create my-gemma3-model -f Modelfile

# Run it
ollama run my-gemma3-model
```
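The one-line Modelfile above is enough to get started, but Ollama Modelfiles also accept sampling parameters and a system prompt. A sketch with illustrative (not tuned) values and a hypothetical system prompt:

```
FROM ./gemma-3-12b-merged-Q5_K_M.gguf

# Illustrative sampling settings -- tune for your use case
PARAMETER temperature 0.8
PARAMETER num_ctx 8192

# Hypothetical system prompt matching the fine-tune's domain
SYSTEM "You are a storyteller versed in the Reverend Insanity universe."
```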
### With LM Studio, GPT4All, etc.
Simply download the GGUF file and load it in your preferred application.
## Quantization Details
- Q8_0: 8-bit quantization, highest quality
- Q5_K_M: 5-bit with K-quant method, medium size
- Q4_K_M: 4-bit with K-quant method, smaller size
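The file sizes in the table above follow from simple bits-per-weight arithmetic: parameters × effective bits per weight ÷ 8. A rough sketch, where the effective bits-per-weight figures are approximations (K-quants mix bit widths across tensors, so the exact value varies by model):

```python
# Rough GGUF file-size estimate from bits-per-weight arithmetic.
PARAMS = 12e9  # Gemma 3 12B

# Effective bits per weight -- approximate values, assumed for illustration
EFFECTIVE_BPW = {
    "Q8_0": 8.5,     # 8-bit weights plus per-block scales
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.85,
}

def estimated_size_gb(params: float, bpw: float) -> float:
    """Estimated file size in GB: params * bits-per-weight / 8 bits-per-byte."""
    return params * bpw / 8 / 1e9

for quant, bpw in EFFECTIVE_BPW.items():
    print(f"{quant}: ~{estimated_size_gb(PARAMS, bpw):.1f} GB")
```

This reproduces the ballpark figures in the table (~13 GB, ~8.5 GB, ~7 GB); actual file sizes also include embeddings, output layers, and metadata, so expect some deviation.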
## License
This model inherits the Gemma license from the base model.
## Citation

```bibtex
@misc{Bamboo_gemma3_12b_gguF,
  author    = {rihuwa},
  title     = {Bamboo-gemma3-12b-gguF},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/rihuwa/Bamboo-gemma3-12b-gguF}
}
```