How to use synquid/gemma-3-4b-dolci-sft with libraries and local apps.

Libraries

How to use synquid/gemma-3-4b-dolci-sft with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="synquid/gemma-3-4b-dolci-sft")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("synquid/gemma-3-4b-dolci-sft")
model = AutoModelForImageTextToText.from_pretrained("synquid/gemma-3-4b-dolci-sft")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use synquid/gemma-3-4b-dolci-sft with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "synquid/gemma-3-4b-dolci-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "synquid/gemma-3-4b-dolci-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
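
The same request can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are illustrative, and the default URL assumes the vLLM server started above is listening on port 8000.

```python
import json
import urllib.request

MODEL_ID = "synquid/gemma-3-4b-dolci-sft"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build the same POST request as the curl call above."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Once the server is running, `print(ask("What is the capital of France?"))` should print the model's reply.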

Use Docker

docker model run hf.co/synquid/gemma-3-4b-dolci-sft

SGLang

How to use synquid/gemma-3-4b-dolci-sft with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "synquid/gemma-3-4b-dolci-sft" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "synquid/gemma-3-4b-dolci-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "synquid/gemma-3-4b-dolci-sft" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "synquid/gemma-3-4b-dolci-sft",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use synquid/gemma-3-4b-dolci-sft with Docker Model Runner:
```
docker model run hf.co/synquid/gemma-3-4b-dolci-sft
```

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Internal Model Release: Gemma 3 4B Dolci SFT Instruct Alignment-Free (LoRA Merged)

Summary

This model is Gemma 3 4B (google/gemma-3-4b-pt) fine-tuned on Dolci SFT Instruct data, which is derived from the OLMo 3 SFT data. Training used the alignment-filtered ("alignment-free") variant of the corpus, which is intended to remove low-quality alignment data. The release is a full-weight model produced by merging a LoRA adapter into the base weights.

Tool calling is an important part of the training mix, but the primary objective is broad instruction tuning on the Dolci SFT Instruct alignment-free corpus.

Data and Curation

Data lineage: Dolci SFT Instruct derived from OLMo 3 SFT data.
Curation goal: alignment-filtered (alignment-free) training subset.
Effective training samples: 51,476.
Preprocessing format: chat_tool_calls_v5_hermes_with_im_end.
Chat style: ChatML with Hermes-style tool-call representation.
Overlength policy: trim oldest turns.
Samples with unparsed tool calls were dropped.
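
The overlength policy can be illustrated with a short sketch. This is not the preprocessing code used for this release: the function name, the choice to always keep a leading system message, and the whitespace-based token counter are assumptions made for illustration. A real pipeline would count tokens with the model tokenizer.

```python
def trim_oldest_turns(messages, max_tokens, count_tokens):
    """Drop the oldest non-system turns until the conversation fits."""
    # Assumption: a leading system message is always kept.
    head = messages[:1] if messages and messages[0]["role"] == "system" else []
    tail = list(messages[len(head):])

    def total(msgs):
        return sum(count_tokens(m["content"]) for m in msgs)

    # Remove turns from the front (oldest first), keeping at least one turn.
    while len(tail) > 1 and total(head + tail) > max_tokens:
        tail.pop(0)
    return head + tail


def whitespace_count(text):
    """Stand-in token counter: number of whitespace-separated words."""
    return len(text.split())


conversation = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "first question with several extra words"},
    {"role": "assistant", "content": "first answer"},
    {"role": "user", "content": "second question"},
]
trimmed = trim_oldest_turns(conversation, max_tokens=8, count_tokens=whitespace_count)
print([m["role"] for m in trimmed])
```

In this toy conversation the first user turn is the only one dropped, since removing it brings the total under the budget.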

Tool-Calling Formatting

Tool interactions are represented with XML markers in assistant/tool turns:

<tool_call> ... </tool_call>
<tool_response> ... </tool_response>
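
As an illustration, tool calls can be extracted from generated text with a regular expression. The sketch below assumes the payload between the tags is a JSON object with `name` and `arguments` keys, as in the usual Hermes convention; the payload schema is not spelled out here, and the `get_weather` tool is hypothetical.

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the parsed JSON payload of every <tool_call> block in text."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(raw))
        except json.JSONDecodeError:
            # Skip payloads that are not valid JSON.
            continue
    return calls


sample = (
    "Let me check.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n</tool_call>'
)
print(extract_tool_calls(sample))
```

Skipping payloads that fail to parse mirrors the curation rule of dropping samples with unparsed tool calls, although the actual parser used for curation is not documented here.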

Special-token IDs validated after the merge:

<|im_start|> -> 105
<|im_end|> -> 106
<tool_call> -> 8
</tool_call> -> 9

EOS/PAD behavior used in training:

eos_token = <|im_end|>
pad_token = <|im_end|>
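
For orientation, the sketch below renders a conversation in the generic ChatML layout, with each turn wrapped in `<|im_start|>` and `<|im_end|>`. The exact whitespace and role names are assumptions based on the common ChatML convention; the chat template shipped with the tokenizer is authoritative.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages, add_generation_prompt=True):
    """Render messages in the generic ChatML layout (illustrative only)."""
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn; generation ends when the model emits IM_END,
        # which is also the EOS token.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = render_chatml([{"role": "user", "content": "What is the capital of France?"}])
print(prompt)
```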

Training Setup

Base model: google/gemma-3-4b-pt.
Distributed setup: 4 nodes x 8 GPUs (32 GPUs total).
Precision: bf16.
Sequence length: 32,768.
Epochs: 1.0.
Per-device batch size: 2.
Gradient accumulation: 1.
Learning rate: 3e-4.
Checkpoint interval: 500 steps.
Gradient checkpointing enabled.
Liger kernel enabled.
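
These settings imply a global batch size and an approximate optimizer step count. The arithmetic below is a worked check, assuming one sequence per training sample (no packing) and plain data-parallel training across all 32 GPUs; neither assumption is stated explicitly here.

```python
import math

nodes, gpus_per_node = 4, 8
per_device_batch = 2
grad_accum = 1
train_samples = 51_476  # effective training samples reported above

world_size = nodes * gpus_per_node
global_batch = world_size * per_device_batch * grad_accum
steps_per_epoch = math.ceil(train_samples / global_batch)

print(world_size, global_batch, steps_per_epoch)
```

Under these assumptions the run has a global batch of 64 sequences and roughly 805 optimizer steps, so a 500-step checkpoint interval would yield one intermediate checkpoint before the final save.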

LoRA Configuration

Method: PEFT LoRA (peft 0.18.1).
Rank: r = 64.
Alpha: 32.
Dropout: 0.05.
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj.
Excluded modules: vision tower modules.
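
The effect of the target and exclusion lists can be sketched in plain Python. The module names below are illustrative examples of an assumed naming pattern, not an exhaustive list, and the filter is a guess at how the exclusion might be expressed rather than the training code itself. In PEFT, the hyperparameters listed above correspond to the `LoraConfig` fields `r`, `lora_alpha`, `lora_dropout`, and `target_modules`.

```python
TARGET_SUFFIXES = (
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
)
R, ALPHA = 64, 32


def is_lora_target(module_name):
    """True if the leaf name is targeted and the module is outside the vision tower."""
    if "vision_tower" in module_name:
        return False
    return module_name.split(".")[-1] in TARGET_SUFFIXES


# Illustrative module names (assumed naming pattern).
names = [
    "language_model.layers.0.self_attn.q_proj",
    "language_model.layers.0.mlp.down_proj",
    "vision_tower.encoder.layers.0.self_attn.q_proj",
    "language_model.layers.0.input_layernorm",
]
selected = [n for n in names if is_lora_target(n)]

# Standard LoRA scales the low-rank update by alpha / r.
scaling = ALPHA / R
print(selected, scaling)
```

Without the exclusion, a suffix match on `q_proj` would also select any identically named projections inside the vision tower. With r = 64 and alpha = 32, the standard LoRA scaling factor is 0.5.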

Training Outcomes

Train loss: 0.7651.
Train runtime: 65,354s.
Train samples/sec: 0.788.
Train steps/sec: 0.012.
Total FLOPs: 3.79e19.
Approximate parameter count observed in run telemetry: 4,419,128,176.
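
The reported throughput figures can be cross-checked against the sample count and runtime. The step count used below is an assumption (51,476 samples divided by a global batch of 64, rounded up), as is the reading of the runtime as wall-clock time on all 32 GPUs.

```python
train_samples = 51_476
runtime_s = 65_354
gpus = 32
steps = 805  # assumed: ceil(51,476 / 64), with a global batch of 64

samples_per_sec = train_samples / runtime_s
steps_per_sec = steps / runtime_s
runtime_hours = runtime_s / 3600
gpu_hours = runtime_hours * gpus

print(f"{samples_per_sec:.3f} samples/s, {steps_per_sec:.3f} steps/s")
print(f"{runtime_hours:.1f} h wall clock, {gpu_hours:.0f} GPU-hours")
```

Both derived rates round to the reported values (0.788 samples/sec and 0.012 steps/sec), and the runtime corresponds to roughly 18 hours of wall-clock time.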

Merge Details

Adapter was merged into base weights to produce this full model.
Output dtype: bf16.
Merge executed on CPU; weights saved with safe_serialization enabled.
Sharded save with 5GB shard target.
Merge-time stack:
- torch 2.9.1+rocm6.4
- transformers 4.57.3
- peft 0.18.1
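
A merge along these lines can be sketched with PEFT. This is an outline under stated assumptions rather than the script used for this release: the function name and the directory arguments are hypothetical, and the choice of `AutoModelForImageTextToText` as the loading class is inferred from the Transformers usage snippet, not from the merge notes.

```python
def merge_lora_adapter(base_id, adapter_dir, output_dir):
    """Merge a LoRA adapter into its base model and save sharded bf16 weights."""
    # Imported lazily so that defining the function does not require the
    # heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForImageTextToText

    # Loads on CPU by default when no device placement is requested.
    # Newer transformers releases also accept `dtype` in place of `torch_dtype`.
    base = AutoModelForImageTextToText.from_pretrained(
        base_id, torch_dtype=torch.bfloat16
    )
    model = PeftModel.from_pretrained(base, adapter_dir)

    # Fold the low-rank updates into the base weights and drop the adapter wrappers.
    merged = model.merge_and_unload()

    merged.save_pretrained(
        output_dir,
        safe_serialization=True,  # write .safetensors files
        max_shard_size="5GB",
    )
    return merged
```

A call such as `merge_lora_adapter("google/gemma-3-4b-pt", "path/to/adapter", "merged-model")` would then write the merged shards, where both directory names are placeholders.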

Compatibility adjustment applied during merge:

Tokenizer config was sanitized for current Gemma fast-tokenizer loading behavior (an extra_special_tokens list field was removed).
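
A cleanup of this kind can be sketched with the standard library. The helper name is hypothetical, and the sketch assumes the field lives in a `tokenizer_config.json` file; it removes `extra_special_tokens` only when the value is a list and leaves the file untouched otherwise.

```python
import json
import tempfile
from pathlib import Path


def drop_list_extra_special_tokens(config_path):
    """Remove `extra_special_tokens` from a tokenizer config if it is a list."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    removed = isinstance(config.get("extra_special_tokens"), list)
    if removed:
        del config["extra_special_tokens"]
        path.write_text(
            json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
        )
    return removed


# Demonstration on a throwaway file rather than a real checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "tokenizer_config.json"
    cfg.write_text(
        json.dumps({"eos_token": "<|im_end|>", "extra_special_tokens": []}),
        encoding="utf-8",
    )
    changed = drop_list_extra_special_tokens(cfg)
    cleaned = json.loads(cfg.read_text(encoding="utf-8"))

print(changed, cleaned)
```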

License and Terms

Use remains subject to the upstream google/gemma-3-4b-pt license and terms.
