# Qwen3-1.7B-SFT-UltraChat
This model is a fine-tuned version of Qwen/Qwen3-1.7B-Base trained on the HuggingFaceH4/ultrachat_200k dataset using Supervised Fine-Tuning (SFT) with LoRA adapters.
## Overview
Qwen3-1.7B-SFT-UltraChat is an instruction-following language model optimized for conversational tasks. It pairs the 1.7B base model with high-quality instruction-following data from UltraChat, improving response quality and helpfulness over the base model.
## Key Features
- High-Quality Fine-Tuning: Trained on 197,471 instruction-response pairs
- Efficient Training: Uses LoRA (Low-Rank Adaptation) for memory efficiency
- Strong Performance: Achieves 67.25% token accuracy on a held-out evaluation set
- Optimized for Inference: Available in multiple formats including GGUF quantizations
## Model Details
| Property | Value |
|---|---|
| Developed by | ermiaazarkhalili |
| License | CC-BY-NC-4.0 |
| Language | English |
| Base Model | Qwen/Qwen3-1.7B-Base |
| Model Size | 1.7B parameters |
| Tensor Type | BF16 |
| Context Length | 2,048 tokens |
| Training Method | SFT with LoRA |
## Training Information
### Training Configuration
| Parameter | Value |
|---|---|
| Learning Rate | 0.0002 |
| Batch Size | 8 per device |
| Effective Batch Size | 16 (with gradient accumulation) |
| Gradient Accumulation Steps | 2 |
| Number of Epochs | 1 |
| Max Sequence Length | 2,048 tokens |
| LR Scheduler | Linear warmup + Cosine annealing |
| Precision | BF16 mixed precision |
| Gradient Checkpointing | Enabled |
| Optimizer | AdamW |
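The effective batch size in the table follows directly from the per-device batch size and the gradient accumulation steps (a single-GPU run, per the hardware section below):

```python
per_device_batch_size = 8
gradient_accumulation_steps = 2
num_gpus = 1  # single H100, per the hardware section

# Samples contributing to each optimizer step
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch_size)  # 16, matching the table
```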
### LoRA Configuration
| Parameter | Value |
|---|---|
| LoRA Rank (r) | 32 |
| LoRA Alpha | 64 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
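With rank r = 32, LoRA attaches two small matrices to each targeted weight: A of shape (r, d_in) and B of shape (d_out, r), so the trainable parameters per module are r·(d_in + d_out) while the base weight stays frozen. A quick sanity check (the 2048 × 2048 shape here is purely illustrative, not Qwen3's actual projection size):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) linear layer."""
    # A: (r, d_in), B: (d_out, r); the frozen base weight is not counted
    return r * d_in + d_out * r

# Illustrative square projection at rank 32
print(lora_params(2048, 2048, 32))  # 131072 adapter params vs ~4.2M frozen
```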
### Training Metrics
| Metric | Value |
|---|---|
| Final Training Loss | 1.3051 |
| Final Eval Loss | 1.2908 |
| Token Accuracy | 67.25% |
| Training Time | 1d 3h 24m |
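Since the eval loss is mean per-token cross-entropy, it maps directly to perplexity via the exponential:

```python
import math

final_eval_loss = 1.2908  # mean per-token cross-entropy from the table
perplexity = math.exp(final_eval_loss)
print(round(perplexity, 2))  # 3.64
```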
### Training Hardware
- GPU: NVIDIA H100 80GB HBM3
- CPU: 8 vCPUs
- Memory: 64GB
- Platform: Compute Canada (Fir Cluster)
## Dataset
This model was trained on the HuggingFaceH4/ultrachat_200k dataset:
| Split | Samples |
|---|---|
| Training | 197,471 |
| Evaluation | 10,394 |
The UltraChat dataset contains high-quality multi-turn conversations designed to improve instruction-following capabilities.
## Usage
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ermiaazarkhalili/Qwen3-1.7B-SFT-UltraChat"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat format
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What are the key principles of effective communication?"},
]

# Apply chat template
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
### Using Pipeline
```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="ermiaazarkhalili/Qwen3-1.7B-SFT-UltraChat",
    device_map="auto",
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Explain quantum computing in simple terms."}]
output = generator(messages, max_new_tokens=256, return_full_text=False)
print(output[0]["generated_text"])
```
## GGUF Versions
For CPU or mixed CPU/GPU inference, GGUF quantized versions are available at: ermiaazarkhalili/Qwen3-1.7B-SFT-UltraChat-GGUF
Available quantizations:
- Q4_K_M: Best balance of quality and size
- Q5_K_M: Higher quality, larger size
- Q8_0: Highest quality quantization
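Approximate on-disk size can be estimated from average bits per weight. The figures below are rough llama.cpp averages (assumptions, not measured from the actual files), so expect the real downloads to differ slightly:

```python
params = 1.7e9  # 1.7B parameters

# Approximate average bits per weight for common llama.cpp quants (assumption)
bits_per_weight = {"Q4_K_M": 4.85, "Q5_K_M": 5.69, "Q8_0": 8.5}

for name, bpw in bits_per_weight.items():
    gib = params * bpw / 8 / 1024**3
    print(f"{name}: ~{gib:.2f} GiB")
```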
### Using with Ollama
```shell
ollama pull hf.co/ermiaazarkhalili/Qwen3-1.7B-SFT-UltraChat-GGUF:Q4_K_M
ollama run hf.co/ermiaazarkhalili/Qwen3-1.7B-SFT-UltraChat-GGUF:Q4_K_M "Hello, how are you?"
```
## Limitations
- Language: Primarily trained on English data; performance on other languages may vary
- Knowledge Cutoff: Base model knowledge is limited to its training data cutoff
- Hallucinations: Like all LLMs, may generate plausible-sounding but incorrect information
- Context Length: Limited to 2,048 tokens during fine-tuning
- Safety: Not extensively safety-tuned; use appropriate content filtering in production
## Intended Use
### Recommended Uses
- Conversational AI assistants
- Question answering systems
- Text generation and completion
- Educational applications
- Research and experimentation
### Out-of-Scope Uses
- Medical, legal, or financial advice without expert oversight
- Generation of harmful, deceptive, or illegal content
- High-stakes decision-making without human verification
## Citation
```bibtex
@misc{ermiaazarkhalili_qwen3_1.7b_sft_ultrachat,
  author       = {Ermia Azarkhalili},
  title        = {Qwen3-1.7B-SFT-UltraChat: Fine-tuned Qwen3-1.7B-Base on UltraChat},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/ermiaazarkhalili/Qwen3-1.7B-SFT-UltraChat}}
}
```
## Acknowledgments
- Qwen Team for the excellent base model
- Hugging Face TRL Team for the training framework
- UltraChat Dataset creators
- Compute Canada for providing HPC resources
## Contact
For questions, issues, or collaborations, please open an issue on the model repository or reach out via Hugging Face.