Turkish News Summarizer (IT5-Large)

This model was trained on a corpus of more than 270,000 human-written (non-synthetic) Turkish news articles paired with their summaries. It is optimized to follow the morphological structure and formal register of Turkish news writing.

Dataset Specifications

  • Volume: 273,046 news articles and summaries.

  • Type: Authentic content derived from real-time news feeds, written by professional editors.

  • Domains: Politics, Economy, Sports, Technology, and General Agenda.

Training Parameters and Infrastructure

Training used the following hardware and optimization settings:

  • Hardware: NVIDIA B200 (Blackwell) - 180GB VRAM.

  • Base Model: gsarti/it5-large (740M Parameters).

  • Precision: BF16 Mixed Precision & TF32 Core Acceleration.

  • Effective Batch Size: 160 (Per device batch: 80, Gradient Accumulation: 2).

  • Optimizer: AdamW (Fused Kernel).

  • Learning Rate: 4e-5 (with Cosine Decay Scheduler).

  • Process Optimization: Gradient Checkpointing enabled.
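
The settings listed above map fairly directly onto keyword arguments of Hugging Face `Seq2SeqTrainingArguments`. The following is a minimal sketch under stated assumptions, not the author's actual training script: the names `training_kwargs` and `cosine_lr`, the `output_dir` value, and the absence of warmup are all illustrative. The dictionary is deliberately not instantiated here, since `bf16` and `tf32` require a GPU that supports them.

```python
import math

# Hyperparameters taken from the list above.
PER_DEVICE_BATCH = 80
GRAD_ACCUM_STEPS = 2
PEAK_LR = 4e-5

# Effective batch size on a single device: 80 * 2 = 160.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

# Keyword arguments one could pass as Seq2SeqTrainingArguments(**training_kwargs).
# "output_dir" is a hypothetical path; the other values mirror the list above.
training_kwargs = {
    "output_dir": "./it5-large-tr-news",  # assumption, not from the model card
    "per_device_train_batch_size": PER_DEVICE_BATCH,
    "gradient_accumulation_steps": GRAD_ACCUM_STEPS,
    "learning_rate": PEAK_LR,
    "lr_scheduler_type": "cosine",
    "optim": "adamw_torch_fused",
    "bf16": True,
    "tf32": True,
    "gradient_checkpointing": True,
}


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Half-cosine decay from peak_lr to 0, assuming no warmup phase."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, the learning rate starts at 4e-5, falls to half of that at the midpoint of training, and reaches zero at the final step. If the original run used a warmup phase, the early part of the curve would differ.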

Training Metrics and Loss Curve

The following graph illustrates the convergence of the cross-entropy loss during training:

[Figure: Training Loss Curve]
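
For readers unfamiliar with the metric, the cross-entropy loss of a sequence-to-sequence model is the mean negative log-probability assigned to each reference summary token. The sketch below illustrates this with made-up probabilities; the function name `token_cross_entropy` and all numbers are hypothetical and are not taken from the actual training run.

```python
import math


def token_cross_entropy(target_probs):
    """Mean negative log-likelihood over the reference summary tokens."""
    return -sum(math.log(p) for p in target_probs) / len(target_probs)


# Hypothetical probabilities the model assigns to each correct summary token.
early = [0.10, 0.05, 0.20]  # early in training: low confidence
late = [0.80, 0.90, 0.70]   # later in training: higher confidence

loss_early = token_cross_entropy(early)
loss_late = token_cross_entropy(late)
```

As the model becomes more confident in the correct tokens, the loss falls toward zero, which is the downward trend a converging loss curve shows.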

Inference Example

from transformers import T5TokenizerFast, AutoModelForSeq2SeqLM

# Replace "{REPO_NAME}" with this model's repository ID on the Hugging Face Hub.
tokenizer = T5TokenizerFast.from_pretrained("{REPO_NAME}")
model = AutoModelForSeq2SeqLM.from_pretrained("{REPO_NAME}")

def summarize_news(text):
    # Prepend the "summarize: " task prefix.
    input_text = "summarize: " + text
    # Truncate long articles to at most 1024 input tokens.
    inputs = tokenizer(input_text, return_tensors="pt", max_length=1024, truncation=True)
    outputs = model.generate(
        **inputs,  # passes input_ids together with attention_mask
        max_length=150,
        num_beams=4,
        early_stopping=True,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

raw_text = "Insert news text here..."
print(summarize_news(raw_text))
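
When several articles need to be summarized at once, the same generation settings can be applied to a padded batch. The helper below is a sketch rather than part of the released code: `summarize_batch` and its parameter names are hypothetical, and it takes the tokenizer and model as arguments so that it does not depend on how they were loaded.

```python
def summarize_batch(texts, tokenizer, model, prefix="summarize: ",
                    max_input_tokens=1024, max_summary_tokens=150):
    """Summarize a list of articles in one padded batch (illustrative helper)."""
    # Add the task prefix to every article, then pad to the longest input
    # and truncate anything beyond max_input_tokens.
    inputs = tokenizer(
        [prefix + text for text in texts],
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_input_tokens,
    )
    # The attention mask matters here: it tells the model to ignore padding.
    outputs = model.generate(
        **inputs,
        max_length=max_summary_tokens,
        num_beams=4,
        early_stopping=True,
    )
    # Decode all generated sequences, one summary per input article.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With a tokenizer and model loaded as in the inference example, a call would look like `summarize_batch([article_a, article_b], tokenizer, model)`. Larger batches use more memory during beam search, so the batch size may need tuning for the available hardware.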

License and Terms of Use

This model is released under the CC-BY-NC 4.0 license for research and development purposes. For commercial applications, the rights of the original data owners must be respected.
