A newer version of this model is available: Simonc-44/Cygnis-Alpha-2-8B-v0.3

Cygnis-Alpha-2 8B v0.2

The Sovereign Reasoning Engine by Simonc-44

License: Apache 2.0

Model Card for Cygnis-Alpha-2 8B v0.2

Cygnis-Alpha-2 8B v0.2 is the full, standalone release of the Cygnis Alpha model. Unlike v0.1, which shipped as a LoRA adapter, this release contains merged full weights, so it runs on its own without a separate base model.

Optimized by Simonc-44, the model is tuned to work through a systematic reasoning step (Chain-of-Thought, CoT) before generating its final output, with the goal of improving logical consistency and performance in both French and English.

Model Architecture

Cygnis-Alpha-2 8B v0.2 is based on a Llama 3.1 architecture, featuring:

Merged Weights: Independent execution without base model dependency.
Reasoning Capabilities: Integrated CoT processing, intended to reduce hallucinations.
Native ChatML Support: Optimized for structured role-based interactions.

Parameter	Value
Architecture	Llama 3.1
Weight Size	16.1 GB (16-bit)
Format	Safetensors
Context Window	8192 tokens
Developer	Simonc-44
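
The weight size in the table can be sanity-checked with simple arithmetic: parameter count times bytes per parameter. The snippet below is a minimal sketch, assuming roughly 8.03 billion parameters (the published size of Llama 3.1 8B; the exact count for this fine-tune is not stated in this card) and decimal gigabytes. The helper name `weight_size_gb` is illustrative only.

```python
# Rough weight size: parameters x bytes per parameter.
# ASSUMPTION: ~8.03e9 parameters (Llama 3.1 8B); this card does not state the exact count.
BYTES_PER_PARAM = {"F32": 4, "BF16": 2, "F16": 2}

ASSUMED_PARAMS = 8.03e9


def weight_size_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_size_gb(ASSUMED_PARAMS, dtype):.1f} GB")
```

Under these assumptions, a checkpoint of about 16.1 GB corresponds to 16-bit storage; a full 32-bit copy of the same weights would be roughly twice that.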

Performance Benchmarks

Estimated comparison against reference models in the 3B-parameter class

Dataset	Cygnis v0.2	Llama 3.2 (Base)	Gemma 2 2B
GSM8K	45.8*	43.5	38.0
IFEval	61.2*	58.0	50.4
MMLU	52.4*	49.3	42.1

* Scores pending validation on the Open LLM Leaderboard.

Instruction Format

Cygnis-Alpha-2 8B v0.2 uses the ChatML format. For best results, use the following structure:

<|im_start|>system
You are Cygnis Alpha 2, a sovereign AI created by Simonc-44. You are concise, clear, and helpful.<|im_end|>
<|im_start|>user
[Your question here]<|im_end|>
<|im_start|>assistant
<|im_thought|>
[Model's internal reasoning...]
<|im_end|>
[Final response]
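
For cases where the prompt has to be assembled by hand (for example, when calling a raw completion endpoint), the structure above can be sketched as a small helper. This is an illustration under assumptions, not the model's official template: `build_chatml_prompt` is a hypothetical name, and the tokenizer's own chat template should be preferred when it is available, since this card does not confirm that the two match exactly. The sketch does not pre-fill `<|im_thought|>`; it leaves the reasoning section for the model to produce.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render role/content messages as a ChatML-style string (hypothetical helper)."""
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from this point.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "system",
     "content": "You are Cygnis Alpha 2, a sovereign AI created by Simonc-44. "
                "You are concise, clear, and helpful."},
    {"role": "user", "content": "Explain the concept of digital sovereignty."},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```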

Quickstart

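from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Simonc-44/Cygnis-Alpha-2-8B-v0.2"

# Load the tokenizer and the merged weights; device_map="auto" places layers on the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [{"role": "user", "content": "Explain the concept of digital sovereignty."}]

# Apply the chat template and move the tensors to the model's device instead of hard-coding "cuda".
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=500)

# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))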
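
If the reasoning section needs to be separated from the final answer, the output can be split on the markers shown in the Instruction Format section. The following is a minimal sketch under assumptions: `split_reasoning` is a hypothetical helper, the sample string is hand-written rather than real model output, and it presumes the text was decoded with `skip_special_tokens=False` so that `<|im_thought|>` and `<|im_end|>` are still present (this card does not state whether they are registered as special tokens).

```python
from typing import Tuple

THOUGHT_START = "<|im_thought|>"
IM_END = "<|im_end|>"


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, final_response) from raw assistant output.

    Without a reasoning marker, reasoning is empty and the whole text is the response.
    If the reasoning section is never closed, everything after the marker is
    treated as reasoning and the response is empty.
    """
    if THOUGHT_START not in text:
        return "", text.replace(IM_END, "").strip()
    _, _, after = text.partition(THOUGHT_START)
    reasoning, _, final = after.partition(IM_END)
    return reasoning.strip(), final.replace(IM_END, "").strip()


# Hand-written sample in the documented layout (not actual model output).
raw = (
    "<|im_thought|>\n"
    "Define the term, then give one example.\n"
    "<|im_end|>\n"
    "Digital sovereignty is control over one's own data.<|im_end|>"
)

reasoning, answer = split_reasoning(raw)
print("Reasoning:", reasoning)
print("Answer:", answer)
```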

GGUF Versions

GGUF builds are available in quantizations ranging from Q2_K to Q8_0, along with the full FP16 weights.

These files were quantized with llama.cpp, with the aim of preserving the chain-of-thought behavior of the Reasoning Engine (<|im_thought|>) even at lower bit widths.

Available Quantizations

File Name	Quant Method	Size	Best For
`cygnis-alpha-8b-v2.5.fp16.gguf`	F16	16.1 GB	Original weights, no quality loss.
`cygnis-alpha-8b-v2.5.Q8_0.gguf`	Q8_0	8.54 GB	Near-lossless precision (High-end PC).
`cygnis-alpha-8b-v2.5.Q6_K.gguf`	Q6_K	6.60 GB	Excellent quality, significant space saving.
`cygnis-alpha-8b-v2.5.Q5_K_M.gguf`	Q5_K_M	5.73 GB	High accuracy, slightly more demanding.
`cygnis-alpha-8b-v2.5.Q4_K_M.gguf`	Q4_K_M	4.92 GB	Recommended - Best balance for most users.
`cygnis-alpha-8b-v2.5.Q3_K_L.gguf`	Q3_K_L	4.32 GB	Good for older hardware or lower RAM.
`cygnis-alpha-8b-v2.5.Q2_K.gguf`	Q2_K	3.18 GB	Extreme compression for Mobile / Edge.
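
As a rough way to use the table, the snippet below picks the largest quantization whose file fits a given memory budget. It is a sketch under assumptions: `pick_quant` is a hypothetical helper, and the 1 GB `headroom_gb` default is a placeholder for runtime overhead such as the KV cache, not a measured figure. The sizes are copied from the table above.

```python
from typing import Optional

# File sizes in GB, copied from the quantization table.
QUANT_SIZES_GB = {
    "F16": 16.1,
    "Q8_0": 8.54,
    "Q6_K": 6.60,
    "Q5_K_M": 5.73,
    "Q4_K_M": 4.92,
    "Q3_K_L": 4.32,
    "Q2_K": 3.18,
}


def pick_quant(memory_budget_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the largest quantization that fits in the budget minus headroom.

    ASSUMPTION: headroom_gb is an illustrative allowance for runtime overhead.
    Returns None when no file fits.
    """
    usable = memory_budget_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= usable}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


for budget in (4.0, 6.0, 8.0, 24.0):
    print(f"{budget:>5.1f} GB budget -> {pick_quant(budget)}")
```

Actual memory use depends on context length and the runtime, so the result is best treated as a starting point rather than a guarantee.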

👉 Access the GGUF Repository here: Simonc-44/Cygnis-Alpha-2-7B-v0.2-GGUF

Notice

Cygnis-Alpha-2 8B v0.2 is a fine-tuned model and does not have built-in moderation mechanisms. Users should be aware that the model may reflect biases present in the training data or base architecture.