Model Card: Gemma3-1B Turkish CPT LoRA (1st Epoch – Stage 1, 0K–50K Subset)

Overview

This model is a LoRA-adapted Turkish Continued Pretraining (CPT) variant of Gemma-3-1B.

Unlike the full-parameter CPT models trained in other stages of this project, this model was adapted with Low-Rank Adaptation (LoRA), a parameter-efficient method: the base model weights remain frozen and only the LoRA adapter parameters are trained.

The model was trained on the first shard of the Turkish web corpus (samples 0–50,000).

Base model: google/gemma-3-1b-pt
Training method: LoRA-based continued pretraining
Dataset shard: 0–50K samples
Objective: domain adaptation to Turkish web text

Training Setup

Base Model: google/gemma-3-1b-pt
Dataset: canbingol/vngrs-web-corpus-200k
Subset Used: Samples 0–50,000
Training Objective: Continued Pretraining
Data Regime: Plain text
Epochs: 1
Token Count: ~21.6M tokens
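
As a rough sanity check on these figures, the sketch below derives the average sample length from the reported token count and shows one way the shard could be selected with the `datasets` library. The split name `train` and the helper `load_stage1_shard` are assumptions for illustration; the card does not state how the subset was actually selected.

```python
SHARD_START = 0
SHARD_END = 50_000            # samples 0-50,000, treated here as a half-open range
REPORTED_TOKENS = 21_600_000  # "~21.6M tokens" as reported in this card

num_samples = SHARD_END - SHARD_START
avg_tokens_per_sample = REPORTED_TOKENS / num_samples


def load_stage1_shard():
    """Hypothetical loader: the split name "train" is an assumption."""
    from datasets import load_dataset  # lazy import; requires `pip install datasets`

    split = f"train[{SHARD_START}:{SHARD_END}]"
    return load_dataset("canbingol/vngrs-web-corpus-200k", split=split)


print(f"{num_samples} samples, ~{avg_tokens_per_sample:.0f} tokens per sample on average")
```

Since the token count is approximate, the per-sample average is only an order-of-magnitude guide.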

LoRA Configuration

This model was trained using Low-Rank Adaptation (LoRA) with the following configuration.

r = 16
lora_alpha = 32
lora_dropout = 0.05

LoRA adapters were applied to the following transformer modules.

q_proj
k_proj
v_proj
o_proj
gate_proj
up_proj
down_proj
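
Expressed with the PEFT library, the settings above might look like the following sketch. `task_type` and `bias` are not stated in this card and are assumptions here, as is the helper name `build_lora_config`.

```python
# Hyperparameters and target modules as listed in this card.
LORA_HPARAMS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
}

# With standard LoRA scaling, the low-rank update is multiplied by lora_alpha / r.
scaling = LORA_HPARAMS["lora_alpha"] / LORA_HPARAMS["r"]


def build_lora_config():
    """Hypothetical helper; task_type and bias are assumptions, not from the card."""
    from peft import LoraConfig  # lazy import; requires `pip install peft`

    return LoraConfig(task_type="CAUSAL_LM", bias="none", **LORA_HPARAMS)


print(f"LoRA scaling factor: {scaling}")
```

With `lora_alpha = 32` and `r = 16`, the update is scaled by a factor of 2.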

This configuration results in approximately 14.9M trainable parameters, which corresponds to roughly 0.48% of the full model parameters.
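
The trainable-parameter count of a LoRA setup follows from the rank and the shapes of the adapted layers: a linear layer with `d_in` inputs and `d_out` outputs gains `r * (d_in + d_out)` adapter parameters. The sketch below applies that formula. The layer dimensions are small hypothetical placeholders, not Gemma-3-1B's actual sizes, which can be read from the base model's config.

```python
def lora_param_count(r, layer_shapes, num_layers):
    """Count LoRA parameters: each adapted (d_in, d_out) linear layer adds
    an A matrix of shape (r, d_in) and a B matrix of shape (d_out, r)."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in layer_shapes.values())
    return per_layer * num_layers


# Hypothetical toy dimensions (NOT Gemma-3-1B's real ones), for illustration only.
hidden, kv_dim, mlp = 64, 16, 256
toy_shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp),
    "down_proj": (mlp, hidden),
}

total = lora_param_count(r=16, layer_shapes=toy_shapes, num_layers=2)
print(f"toy adapter parameters: {total:,}")
```

For a loaded PEFT model, `model.print_trainable_parameters()` reports the actual trainable and total counts.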

Training Notes

Only LoRA adapter weights were updated during training.
The base model parameters remain unchanged.
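
To make the frozen-base idea concrete, the sketch below implements a single LoRA-adapted linear map in plain Python: the output is `W x + (alpha / r) * B (A x)`, where `W` is frozen and only `A` and `B` would receive gradient updates. The tiny matrices and the zero initialization of `B` (the common LoRA default) are illustrative assumptions, not values from this model.

```python
def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x). W is the frozen base weight;
    A (r x d_in) and B (d_out x r) are the trainable adapter matrices."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]


# Toy example with d_in = d_out = 2 and rank r = 1 (illustrative values only).
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight
A = [[0.5, 0.5]]               # trainable, shape (r, d_in)
B_init = [[0.0], [0.0]]        # zero init: the adapter starts as a no-op
B_trained = [[1.0], [-1.0]]    # pretend values after some training
x = [2.0, 4.0]

print(lora_forward(W, A, B_init, x, alpha=2, r=1))     # identical to W x
print(lora_forward(W, A, B_trained, x, alpha=2, r=1))  # base output plus scaled update
```

Because `W` is never modified, removing the adapter recovers the original base model exactly.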

This model represents the first stage of LoRA-based Turkish CPT experiments for Gemma-3.

Usage Example

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "google/gemma-3-1b-pt"
lora_model = "canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1"

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tokenizer from the base model.
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the frozen base model in bfloat16.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16
)

# Attach the LoRA adapter on top of the base weights and move to the device.
model = PeftModel.from_pretrained(model, lora_model)
model = model.to(device)

# Short Turkish prompt ("bundan böyle" means "from now on").
prompt = "bundan böyle"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Sample a continuation with nucleus sampling.
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.9
)

# Decode prompt plus continuation back to text.
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
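
If a standalone checkpoint is preferred over loading the adapter at runtime, PEFT can fold the LoRA weights into the base model. The sketch below assumes the same model IDs as the example above; the helper name `merge_adapter` and any output directory passed to it are hypothetical.

```python
def merge_adapter(base_id, adapter_id, output_dir):
    """Merge LoRA weights into the base model and save the result.
    Hypothetical helper; imports are lazy so defining it needs no heavy deps."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    model = PeftModel.from_pretrained(base, adapter_id)

    # merge_and_unload() adds the scaled low-rank update to each adapted weight
    # and returns a plain transformers model without the PEFT wrappers.
    merged = model.merge_and_unload()

    # Save the merged weights together with the base tokenizer.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)
    return merged
```

Calling `merge_adapter(base_model, lora_model, "merged-model")` with the variables from the usage example would write a checkpoint that loads with `AutoModelForCausalLM.from_pretrained` alone, at the cost of storing a full copy of the model weights.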
