Llama-Krikri-8B for Ancient Greek to Modern Greek (QLoRA)

This model is a fine-tuned version of ilsp/Llama-Krikri-8B-Instruct for translating Ancient Greek to Modern Greek.

It was fine-tuned using QLoRA on the sentence-level AG-MG Parallel Corpus.

This model was trained by Spyridon Mavromatis at the Institute for Language and Speech Processing (ILSP), "Athena" RC, and the National and Kapodistrian University of Athens (NKUA) as part of an M.Sc. thesis.

Built with Llama. This model is a derivative of Llama-Krikri-8B-Instruct, which is itself built on Llama-3.1-8B. Use of this model is governed by the Llama 3.1 Community License Agreement.

Model Details

Base Model: ilsp/Llama-Krikri-8B-Instruct (Llama 3.1 architecture)
Method: QLoRA (Rank=32, Alpha=32)
Training Data: ~130k sentence pairs from the AG-MG Corpus.
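
To make the rank and alpha values concrete, the sketch below works through the standard LoRA arithmetic (not the rank-stabilized variant): the low-rank update is scaled by alpha / rank, and adapting one weight matrix of shape d_out x d_in adds rank x (d_in + d_out) trainable parameters. The 4096 x 4096 shape is an illustrative assumption (4096 is the hidden size of Llama-3.1-8B); which modules the adapter actually targets is not stated here, and the function names are hypothetical.

```python
RANK = 32   # LoRA rank from the model details above
ALPHA = 32  # LoRA alpha from the model details above


def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one adapted matrix: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


# Assumption: a single 4096 x 4096 projection, used only for illustration.
scaling = lora_scaling(RANK, ALPHA)
extra_params = lora_trainable_params(4096, 4096, RANK)
full_params = 4096 * 4096

print(scaling)                     # 1.0
print(extra_params)                # 262144
print(extra_params / full_params)  # 0.015625
```

With rank equal to alpha, the adapter update is applied unscaled, and each adapted square projection of this size trains about 1.6% as many parameters as the full matrix.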

Usage

Load the base model first, then attach the PEFT adapter on top of it. For best results, use the exact system prompt that was used during training.


import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 1. Set up the model identifiers
base_model_id = "ilsp/Llama-Krikri-8B-Instruct"
adapter_id = "ilsp/llama-krikri-8b-ag-mg-qlora"

# 2. Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(adapter_id, use_fast=True)

# 3. Load the base model in 4-bit (NF4 with double quantization)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation="eager",  # or "sdpa" if available
)

# 4. Attach the adapter and switch to inference mode
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# 5. Define the prompt and run inference
sys_prompt = "Είσαι ακριβές σύστημα μεταφράσεων. Μεταφράζεις από Αρχαία Ελληνικά (πολυτονικό) σε Νέα Ελληνικά. Δώσε μόνο τη μετάφραση."
text = "Ὦ ξεῖν', ἀγγέλλειν Λακεδαιμονίοις ὅτι τῇδε κείμεθα."

messages = [
    {"role": "system", "content": sys_prompt},
    {"role": "user", "content": f"Μετάφρασε στα Νέα Ελληνικά:\n{text}"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,  # Greedy decoding; temperature is ignored in this mode, so it is not set
        repetition_penalty=1.05,
        eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")],
    )

# Decode only the newly generated tokens (skip the prompt)
generated_text = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(generated_text.strip())
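
When translating more than one sentence, it can help to wrap prompt construction in a small helper so that the training-time system prompt is reused verbatim. The sketch below is a minimal illustration: `SYSTEM_PROMPT`, `USER_PREFIX`, and `build_messages` are hypothetical names that are not part of the released code, and the prompt strings are copied from the snippet above.

```python
# Prompt strings copied verbatim from the usage snippet; the names are hypothetical.
SYSTEM_PROMPT = (
    "Είσαι ακριβές σύστημα μεταφράσεων. "
    "Μεταφράζεις από Αρχαία Ελληνικά (πολυτονικό) σε Νέα Ελληνικά. "
    "Δώσε μόνο τη μετάφραση."
)
USER_PREFIX = "Μετάφρασε στα Νέα Ελληνικά:\n"


def build_messages(text: str) -> list[dict[str, str]]:
    """Return the chat messages for one Ancient Greek sentence."""
    text = text.strip()
    if not text:
        raise ValueError("text must not be empty")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{USER_PREFIX}{text}"},
    ]


sentences = [
    "Ὦ ξεῖν', ἀγγέλλειν Λακεδαιμονίοις ὅτι τῇδε κείμεθα.",
    "Γνῶθι σεαυτόν.",
]
batch = [build_messages(s) for s in sentences]
```

Each element of `batch` can then be passed to `tokenizer.apply_chat_template(..., tokenize=False, add_generation_prompt=True)` in the same way as `messages` in the snippet above.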

Performance

Main Test Set Results

Evaluated on the 2,000-sentence-pair test set (Attic and Koine Hellenistic dialects).

Model	Method	BLEU ↑	chrF++ ↑	TER ↓	BERTScore F1 ↑	COMET ↑	ΔBLEU
NLLB-600M	Base	1.55	16.86	106.80	0.880	0.539	-
NLLB-600M	LoRA	7.43	29.31	88.32	0.903	0.667	+5.88
NLLB-1.3B	Base	2.15	17.78	106.41	0.885	0.573	-
NLLB-1.3B	LoRA	8.01	30.02	87.74	0.905	0.687	+5.86
M2M100-1.2B	Base	0.62	10.70	100.50	0.858	0.475	-
M2M100-1.2B	QLoRA	10.96	33.09	82.99	0.911	0.710	+10.34
M2M100-1.2B	Full FT	9.60	31.16	83.43	0.908	0.692	+8.98
Krikri-8B-Instruct	Base	8.29	29.87	88.13	0.895	0.695	-
Krikri-8B-Instruct (this model)	QLoRA	11.90	34.07	84.16	0.906	0.713	+3.60
Krikri-8B-Instruct	Full FT	13.16	34.71	83.68	0.848	0.702	+4.45
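
Some base-model TER values in the table above exceed 100. This is possible because TER divides the number of word-level edits by the reference length, so a hypothesis that is longer than, or unrelated to, the reference can need more edits than the reference has words. The sketch below illustrates this with a simplified TER that omits the block-shift operation of the full metric; it is not the scorer used to produce the reported numbers, and the function names and toy sentences are hypothetical.

```python
def word_edit_distance(hyp: list[str], ref: list[str]) -> int:
    """Levenshtein distance over tokens (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis token
                            curr[j - 1] + 1,      # add a reference token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis: str, reference: str) -> float:
    """Edits per reference word, as a percentage (no shift operation)."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * word_edit_distance(hypothesis.split(), ref) / len(ref)


reference = "ο ξένος ήρθε"
exact = simplified_ter("ο ξένος ήρθε", reference)
unrelated = simplified_ter("a b c d e", reference)

print(exact)                # 0.0
print(round(unrelated, 2))  # 166.67
```

The second hypothesis shares no words with the three-word reference and is five words long, so it needs five edits, which gives a score well above 100.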

Stress Set Results (Rare Dialects)

Evaluated on the 250-sentence-pair stress set (Ionic, Doric, and Homeric dialects).

Model	Method	BLEU ↑	chrF++ ↑	TER ↓	BERTScore F1 ↑	COMET ↑	ΔBLEU
NLLB-600M	Base	0.77	14.40	118.13	0.866	0.484	-
NLLB-600M	LoRA	5.65	28.74	88.01	0.900	0.638	+4.89
NLLB-1.3B	Base	1.25	16.15	107.03	0.873	0.525	-
NLLB-1.3B	LoRA	5.68	28.94	88.24	0.900	0.656	+4.43
M2M100-1.2B	Base	0.07	9.37	100.34	0.840	0.427	-
M2M100-1.2B	QLoRA	9.52	33.30	81.95	0.911	0.691	+9.45
M2M100-1.2B	Full FT	8.16	31.12	83.11	0.907	0.664	+8.09
Krikri-8B-Instruct	Base	6.55	28.98	87.38	0.900	0.675	-
Krikri-8B-Instruct (this model)	QLoRA	10.37	34.09	82.28	0.911	0.717	+3.82
Krikri-8B-Instruct	Full FT	12.80	35.90	81.40	0.884	0.716	+6.11

Citation

If you use this model, please cite our LREC 2026 paper:

Mavromatis, S., Sofianopoulos, S., Prokopidis, P., & Giagkou, M. (2026). Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models. In Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026) (pp. 8685–8698). European Language Resources Association (ELRA). https://doi.org/10.63317/4cdk64dgm2w9

@inproceedings{mavromatis-etal-2026-ancient,
  title     = {Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models},
  author    = {Mavromatis, Spyridon and Sofianopoulos, Sokratis and Prokopidis, Prokopis and Giagkou, Maria},
  booktitle = {Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)},
  month     = {May},
  year      = {2026},
  pages     = {8685--8698},
  address   = {Palma, Mallorca, Spain},
  publisher = {European Language Resources Association (ELRA)},
  editor    = {Piperidis, Stelios and Bel, Núria and van den Heuvel, Henk and Ide, Nancy and Krek, Simon and Toral, Antonio},
  doi       = {10.63317/4cdk64dgm2w9}
}

Note on resources: The fine-tuned models are publicly released. The accompanying AG-MG Parallel Corpus is not publicly distributed due to the complex and uncertain copyright status of the source materials.
