Model Card: Gemma3-1B Turkish CPT LoRA (1st Epoch – Stage 1, 0K–50K Subset)
Overview
This model is a LoRA-adapted Turkish Continued Pretraining (CPT) variant of Gemma-3-1B.
Unlike the full-parameter CPT models trained in other stages of this project, this model was adapted with parameter-efficient Low-Rank Adaptation (LoRA): the base model weights stay frozen and only the LoRA adapter parameters are trained.
The model was trained on the first shard of the Turkish web corpus (samples 0–50,000).
Base model: google/gemma-3-1b-pt
Training method: LoRA-based continued pretraining
Dataset shard: 0–50K samples
Objective: domain adaptation to Turkish web text
Training Setup
Base Model: google/gemma-3-1b-pt
Dataset: canbingol/vngrs-web-corpus-200k
Subset Used: Samples 0–50,000
Training Objective: Continued Pretraining
Data Regime: Plain text
Epochs: 1
Token Count: ~21.6M tokens
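The shard boundaries listed above can be written as a split slice. This is a minimal sketch, assuming the corpus exposes a single `train` split and that later stages continue in consecutive 50K blocks; neither detail is stated in this card, and `shard_slice` and `load_shard` are hypothetical helper names.

```python
SHARD_SIZE = 50_000  # samples per stage, taken from "Subset Used: Samples 0-50,000"


def shard_slice(stage: int, shard_size: int = SHARD_SIZE, split: str = "train") -> str:
    """Return a datasets-style split slice for a 1-indexed stage (hypothetical helper)."""
    if stage < 1:
        raise ValueError("stage is 1-indexed")
    start = (stage - 1) * shard_size
    return f"{split}[{start}:{start + shard_size}]"


def load_shard(stage: int):
    """Load one shard; requires the `datasets` package and network access."""
    from datasets import load_dataset  # lazy import so shard_slice works on its own

    # The split name "train" is an assumption, not confirmed by this card.
    return load_dataset("canbingol/vngrs-web-corpus-200k", split=shard_slice(stage))


print(shard_slice(1))  # train[0:50000]
```

At ~21.6M tokens over 50,000 samples, this shard averages roughly 432 tokens per sample.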
LoRA Configuration
This model was trained using Low-Rank Adaptation (LoRA) with the following configuration:
r = 16
lora_alpha = 32
lora_dropout = 0.05
LoRA adapters were applied to the following transformer modules:
q_proj
k_proj
v_proj
o_proj
gate_proj
up_proj
down_proj
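The hyperparameters and target modules above map onto PEFT's `LoraConfig` roughly as follows. This is a sketch rather than the exact training configuration: `task_type="CAUSAL_LM"` is an assumption not stated in this card, and `build_lora_config` is a hypothetical helper name.

```python
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}


def build_lora_config():
    """Build a peft.LoraConfig from the values above (requires `peft`)."""
    from peft import LoraConfig  # lazy import so the dict can be inspected without peft

    return LoraConfig(**LORA_KWARGS)


# With standard LoRA scaling, the low-rank update is multiplied by lora_alpha / r.
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
print(scaling)  # 2.0
```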
This configuration results in approximately 14.9M trainable parameters, which corresponds to roughly 0.48% of the full model parameters.
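For intuition about where the trainable-parameter count comes from: each adapted linear layer with input size d_in and output size d_out gains two small matrices, A (r x d_in) and B (d_out x r), for r * (d_in + d_out) extra parameters. The sketch below uses made-up toy dimensions, not Gemma-3-1B's real layer shapes, so its totals are illustrative only; `lora_params` and `trainable_fraction` are hypothetical helper names.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def trainable_fraction(trainable: int, total: int) -> float:
    """Trainable parameters as a percentage of the total."""
    return 100.0 * trainable / total


# Toy layer with made-up (d_in, d_out) pairs. NOT Gemma-3-1B's real shapes.
toy_shapes = {
    "q_proj": (512, 512),
    "k_proj": (512, 128),
    "v_proj": (512, 128),
    "o_proj": (512, 512),
    "gate_proj": (512, 2048),
    "up_proj": (512, 2048),
    "down_proj": (2048, 512),
}

per_layer = sum(lora_params(d_in, d_out, r=16) for d_in, d_out in toy_shapes.values())
print(per_layer)  # 176128 adapter parameters for this toy layer
```

Multiplying the per-layer figure by the number of transformer layers gives the adapter total. When setting up training with PEFT, calling `print_trainable_parameters()` on the model returned by `get_peft_model` reports the actual count and percentage.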
Training Notes
Only LoRA adapter weights were updated during training.
The base model parameters remain unchanged.
This model represents the first stage of LoRA-based Turkish CPT experiments for Gemma-3.
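The notes above can be illustrated with the standard LoRA forward pass, y = W x + (lora_alpha / r) * B (A x), where W is the frozen base weight and only A and B receive updates. This is a pure-Python toy sketch with illustrative sizes and values; it is not this project's training code.

```python
def matvec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, r, lora_alpha):
    """y = W x + (lora_alpha / r) * B (A x); W stays frozen, only A and B train."""
    scale = lora_alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Toy sizes: d_in = 3, d_out = 2, r = 2 (illustrative values only).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]  # frozen base weight, shape (d_out, d_in)
A = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]  # trainable, shape (r, d_in)
B_init = [[0.0, 0.0], [0.0, 0.0]]       # trainable, zero at initialization
B_trained = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical values after some updates
x = [1.0, 2.0, 3.0]

y_init = lora_forward(x, W, A, B_init, r=2, lora_alpha=4)
y_trained = lora_forward(x, W, A, B_trained, r=2, lora_alpha=4)
print(y_init)     # [7.0, 2.0], identical to the base model's output
print(y_trained)  # [8.0, 4.0], base output plus the scaled low-rank update
```

Because B starts at zero, the adapted model initially reproduces the base model exactly; training then moves only A and B while W is left untouched.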
Usage Example
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "google/gemma-3-1b-pt"
lora_model = "canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer and the frozen base model.
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, lora_model)
model = model.to(device)

# Sample a continuation of a short Turkish prompt.
prompt = "bundan böyle"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)