# Africa v1 Fused Model
This is the fully merged (fused) version of the Africa v1 translation model, combining the Qwen3-4B-Instruct-2507 base model with LoRA adapters.
## Model Description
This model is a fused version of the Africa v1 translation model, where the LoRA adapters have been merged into the base model weights. This provides:
- Standalone model - No need to load separate adapters
- Full precision - Model weights in safetensors format (not quantized)
- Direct deployment - Ready to use with standard inference frameworks
The model supports translation between English and 29 African languages.
## Format
- File Format: SafeTensors
- Size: ~2.1 GB
- Quantization: None (full precision from MLX 4-bit base)
- Architecture: Qwen3-4B with merged LoRA weights
## Supported Languages (29)
African Languages:
- Afrikaans (af), Akan (ak), Amharic (am), Bambara (bm), Ewe (ee)
- Fula (ff), Hausa (ha), Igbo (ig), Kinyarwanda (rw), Kirundi (rn)
- Kongo (kg), Lingala (ln), Luganda (lg), Ndebele (nd), Northern Sotho (nso)
- Chichewa/Nyanja (ny), Oromo (om), Shona (sn), Somali (so), Swahili (sw)
- Tigrinya (ti), Tsonga (ts), Tswana (tn), Twi (tw), Venda (ve)
- Wolof (wo), Xhosa (xh), Yoruba (yo), Zulu (zu)
Plus English (en) for bidirectional translation.
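The language list above can be wrapped in a small helper that builds the "Translate from X to Y:" prompt format used throughout this card. The `LANG_CODES` map and `build_prompt` function below are illustrative sketches, not part of the released model or any package:

```python
# Hypothetical helper: map ISO 639-1 codes to language names and build
# the translation prompt format shown in this card. Only a subset of
# the 29 supported codes is included here for brevity.
LANG_CODES = {
    "en": "English", "af": "Afrikaans", "am": "Amharic", "ha": "Hausa",
    "ig": "Igbo", "rw": "Kinyarwanda", "sn": "Shona", "so": "Somali",
    "sw": "Swahili", "xh": "Xhosa", "yo": "Yoruba", "zu": "Zulu",
}

def build_prompt(text: str, src: str = "en", tgt: str = "sw") -> str:
    """Build a translation prompt from source/target language codes."""
    if src not in LANG_CODES or tgt not in LANG_CODES:
        raise ValueError(f"Unsupported language pair: {src}->{tgt}")
    return f"Translate from {LANG_CODES[src]} to {LANG_CODES[tgt]}:\n\n{text}"

print(build_prompt("Hello, how are you?"))
# → Translate from English to Swahili:
#
#   Hello, how are you?
```

The same prompt string can then be passed to any of the inference frameworks shown under Usage.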
## Training Details
### Base Model
- Model: Qwen3-4B-Instruct-2507
- Parameters: 4 billion
- Architecture: Transformer-based language model
### LoRA Fine-tuning (Merged)
- LoRA Rank: 8
- LoRA Alpha: 20
- Target Layers: 16 layers
- Training Iterations: 10,000
- Learning Rate: 5e-5
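With rank 8 and alpha 20, the LoRA update is applied with a scaling factor of alpha/rank = 2.5 before being folded into the base weights. The toy matrices below are illustrative; only the rank and alpha values come from this card:

```python
# Sketch of how a LoRA update merges into a base weight matrix:
# W_merged = W + (alpha / rank) * (B @ A). Shapes here are toy-sized;
# real layers use rank-8 factors over much larger matrices.
rank, alpha = 8, 20
scaling = alpha / rank  # 2.5

def matmul(B, A):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def merge(W, B, A, scaling):
    """Fold the scaled low-rank update into the base weights."""
    BA = matmul(B, A)
    return [[W[i][j] + scaling * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # 2x2 base weight
B = [[1.0], [0.0]]            # 2x1 low-rank factor
A = [[0.0, 2.0]]              # 1x2 low-rank factor
print(merge(W, B, A, scaling))  # → [[1.0, 5.0], [0.0, 1.0]]
```

After merging, the low-rank factors are discarded, which is why the fused model needs no separate adapter files at inference time.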
### Fusion Process
The model was created by:
- Training LoRA adapters on the MLX 4-bit quantized base model
- Merging the adapters into the base weights using `mlx_lm.fuse`
- Exporting the result as a standalone model with config and tokenizer
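The merge step above corresponds to the `mlx_lm.fuse` command line tool. The paths below are placeholders, and the exact flags depend on the installed `mlx-lm` version; check `python -m mlx_lm.fuse --help` before running:

```shell
# Sketch: fuse LoRA adapters into the base model with mlx-lm.
# Paths are placeholders, not the ones used for this release.
python -m mlx_lm.fuse \
    --model Qwen/Qwen3-4B-Instruct-2507 \
    --adapter-path ./adapters \
    --save-path ./africa-v1-fused-model \
    --de-quantize  # export full-precision weights from the 4-bit base
```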
## Usage
### HuggingFace Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("aoiandroid/africa-v1-fused-model")
tokenizer = AutoTokenizer.from_pretrained("aoiandroid/africa-v1-fused-model")

# Prepare translation prompt
prompt = "Translate from English to Swahili:\n\nHello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate translation (do_sample=True so temperature takes effect)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
```
### MLX (Apple Silicon)
```python
from mlx_lm import load, generate

# Load model
model, tokenizer = load("aoiandroid/africa-v1-fused-model")

# Generate translation
prompt = "Translate from English to Swahili:\n\nHello, how are you?"
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, temp=0.1)
print(response)
```
### vLLM (Fast Inference)
```python
from vllm import LLM, SamplingParams

# Initialize model
llm = LLM(model="aoiandroid/africa-v1-fused-model")

# Generate translation
sampling_params = SamplingParams(temperature=0.1, max_tokens=256)
outputs = llm.generate(
    ["Translate from English to Swahili:\n\nHello, how are you?"],
    sampling_params,
)
for output in outputs:
    print(output.outputs[0].text)
```
## Advantages Over Other Formats
| Format | Size | Standalone | Speed | Precision |
|---|---|---|---|---|
| Fused Model | 2.1 GB | Yes | Fast | Full (from 4-bit base) |
| LoRA Adapters | 29 MB | No | Fast | N/A |
| GGUF Q4_K_M | 2.3 GB | Yes | Very Fast | 4-bit |
| MLX 4-bit | 2.1 GB | Yes | Very Fast | 4-bit |
Use this format when:
- You want a standalone model without separate adapters
- You're using standard inference frameworks (Transformers, vLLM)
- You need compatibility with cloud deployment services
## Evaluation
Same evaluation results as Africa v1:
| Metric | Score | Interpretation |
|---|---|---|
| Non-empty outputs | 30/30 (100%) | All samples generate output |
| BLEU | 0.71 | Very low (experimental model) |
| chrF | 9.24 | Low character-level overlap |
| TER | 362.19 | High edit distance |
See Africa v1 model card for detailed evaluation.
## Limitations and Biases
- Experimental Model: This is v1 with known quality issues
- Repetition: Model may get stuck in repetition loops
- Hallucination: May generate fluent but incorrect translations
- Low-Resource Languages: Limited training data for some African languages
- English-Centric: Best performance on English↔African pairs
## Intended Use
- Research: Exploring multilingual translation for African languages
- Experimentation: Testing deployment with standard frameworks
- Development: Building translation applications (with quality caveats)
Not recommended for production use. Use v2 or specialized translation models for production.
## Model Variants
This repository contains the fused model. For other formats:
- GGUF + LoRA + MLX: africa-v1-translation-model
- Improved v2: africa-v2-translation-model
## Citation
```bibtex
@software{africa_v1_fused_model,
  title  = {Africa v1 Fused Translation Model},
  author = {TranslateBlue Project},
  year   = {2026},
  url    = {https://huggingface.co/aoiandroid/africa-v1-fused-model}
}
```
## License
Apache 2.0
## Model Card Authors
TranslateBlue Project
## Model Card Contact
For questions or issues, please open an issue in the model repository.