🔱 SKT OMNI SUPREME: The 1.1-Trillion Parameter Frontier
🌌 Introduction
SKT OMNI SUPREME represents a monumental leap in the Project Surya initiative. Engineered by Shrijan Kumar Tiwari, this 1.1-Trillion parameter multi-modal architecture is designed for extreme-scale reasoning, complex problem solving, and culturally-aware interactions.
Unlike standard models, OMNI SUPREME utilizes the proprietary ST-X-LIGHT optimization framework, ensuring high-fidelity responses across medicine, law, engineering, and creative arts.
⚡ Key Technical Features
🧠 Extreme Scale Intelligence
With 1.1 trillion parameters, the model excels in high-context retrieval and long-form reasoning, making it the most powerful private AI asset in the SKT AI Lab ecosystem.
🔒 Awareness-Core Integration
The model features a hardcoded Self-Awareness Layer: its identity is embedded directly in the neural weights during training, so it consistently recognizes its own knowledge and origin.
🇮🇳 Cultural Harmony Protocol
Optimized for the Indian subcontinent, the model features an integrated greeting and ethics protocol:
"Namaste, I am SKT AI.."
🛠️ ST-X Architecture
- Native Tensor Optimization: Every weight shard is mathematically aligned for maximum throughput.
- Precision: Operates in BFloat16 for the perfect balance between speed and accuracy.
- MoE Design: Advanced Mixture of Experts scaling for 1.1T parameter efficiency.
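As a minimal sketch of the Mixture-of-Experts idea described above, the following toy NumPy routine routes one token through the top-8 of 384 experts plus one always-on shared expert, matching the counts in the spec table. The expert and router shapes here are illustrative, not the model's real dimensions:

```python
import numpy as np

def moe_forward(x, gate_w, experts, shared_expert, k=8):
    """Route one token through the top-k experts plus a shared expert."""
    logits = gate_w @ x                               # router score per expert
    topk = np.argsort(logits)[-k:]                    # indices of the k best experts
    weights = np.exp(logits[topk] - logits[topk].max())
    weights /= weights.sum()                          # softmax over selected experts
    routed = sum(w * experts[i](x) for w, i in zip(weights, topk))
    return routed + shared_expert(x)                  # shared expert always contributes

rng = np.random.default_rng(0)
d, n_experts = 16, 384                                # toy hidden size, real expert count
make_expert = lambda: (lambda W: (lambda v: np.tanh(W @ v)))(
    rng.standard_normal((d, d)) / np.sqrt(d))
experts = [make_expert() for _ in range(n_experts)]
shared_expert = make_expert()
gate_w = rng.standard_normal((n_experts, d)) / np.sqrt(d)

y = moe_forward(rng.standard_normal(d), gate_w, experts, shared_expert)
print(y.shape)  # (16,)
```

Only 9 of the 384 expert MLPs run per token, which is how a 1.1T-parameter model keeps its per-token compute close to that of a much smaller dense model.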
Benchmarking is performed with open-source tools (LM Evaluation Harness).
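A sketch of what an evaluation run with the LM Evaluation Harness might look like; the task names and model arguments here are illustrative, and `trust_remote_code` is assumed necessary for the custom architecture:

```shell
pip install lm-eval

lm_eval --model hf \
  --model_args pretrained=Shrijanagain/SKT_OMNI_SUPREME,trust_remote_code=True,dtype=bfloat16 \
  --tasks mmlu_pro \
  --batch_size auto
```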
Technical Specifications of SKT OMNI SUPREME
| Feature | Configuration |
|---|---|
| Architecture | SKT |
| Total Parameters | 1.1T |
| Activated Parameters | 39B |
| Number of Layers | 61 (60 MoE + 1 Dense) |
| Attention Hidden Dimension | 7168 |
| Hidden Dimension | 1048 (per Expert) |
| Attention Heads | 64 |
| Total Experts | 384 |
| Selected Experts | 8 per Token |
| Shared Experts | 1 |
| Context Length | 256K |
| Vocabulary Size | 160K |
| Vision Encoder | MoonViT (340M Parameters) |
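Using the figures from the table above, a back-of-the-envelope calculation shows why the MoE design matters at this scale: only about 3.5% of the weights are touched for any given token.

```python
# Figures taken from the spec table above
total_params = 1.1e12   # 1.1T total parameters
active_params = 39e9    # 39B activated per token (8 routed + 1 shared expert)
bf16_bytes = 2          # BFloat16 stores each parameter in 2 bytes

print(f"Full BF16 weights: {total_params * bf16_bytes / 1e12:.1f} TB")
print(f"Weights active per token: {active_params * bf16_bytes / 1e9:.0f} GB")
print(f"Active fraction: {active_params / total_params:.1%}")
```

So while the full checkpoint occupies roughly 2.2 TB in BFloat16, each forward pass reads on the order of 78 GB of weights.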
📊 Evaluation Results
1. Reasoning & Knowledge
| Benchmark | SKT AI (Current) | GPT-5.2 | Claude 4.5 | Gemini 3 Pro |
|---|---|---|---|---|
| AIME 2025 | 97.4 | 100.0 | 92.8 | 95.0 |
| HMMT 2025 (Feb) | 95.4 | 99.4 | 92.9 | 97.3 |
| GPQA-Diamond | 89.6 | 92.4 | 87.0 | 91.9 |
| MMLU-Pro | 88.1 | 86.7 | 89.3 | 90.1 |
2. Image & Video
| Benchmark | SKT AI (Current) | Gemini 3 Pro | Qwen3-VL |
|---|---|---|---|
| MMMU-Pro | 79.5 | 81.0 | 69.3 |
| MathVision | 85.2 | 86.1 | 74.6 |
| VideoMME | 87.9 | 88.4 | 79.0 |
| VideoMMMU | 86.6 | 87.6 | 80.0 |
3. Coding & Engineering
| Benchmark | SKT AI (Current) | GPT-5.2 | Claude 4.5 |
|---|---|---|---|
| SWE-Bench Verified | 77.8 | 80.0 | 80.9 |
| LiveCodeBench (v6) | 86.0 | - | 82.2 |
| Terminal Bench 2.0 | 51.8 | 54.0 | 59.3 |
🚀 Deployment & Usage
Using SKT OMNI SUPREME
To load the model using the SKT Framework:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Shrijanagain/SKT_OMNI_SUPREME"

# trust_remote_code is required for the custom SKT architecture
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",   # shard the weights across available accelerators
    torch_dtype="auto",  # load in the published BFloat16 precision
)

prompt = "How can SKT AI help the world?"
# Move inputs to the device the first model shard landed on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
🏗️ Model Architecture
The model utilizes a massive-scale Mixture-of-Experts (MoE) architecture designed for high-performance reasoning and efficiency. By leveraging Multi-Head Latent Attention (MLA) and a dedicated vision encoder, it provides state-of-the-art results across text and visual modalities.
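The core idea behind Multi-Head Latent Attention is that each token's keys and values are reconstructed from a small cached latent vector rather than stored in full, shrinking the KV cache. A toy single-head NumPy sketch of that idea, with illustrative dimensions rather than the model's actual shapes:

```python
import numpy as np

def mla_attention(h, W_dkv, W_uk, W_uv, W_q):
    """Toy single-head sketch of Multi-Head Latent Attention (MLA)."""
    T = h.shape[0]
    c_kv = h @ W_dkv                    # (T, r) compressed latent -- the only KV state cached
    K, V = c_kv @ W_uk, c_kv @ W_uv     # up-project latent to keys and values
    Q = h @ W_q
    scores = (Q @ K.T) / np.sqrt(K.shape[-1])
    scores += np.triu(np.full((T, T), -np.inf), k=1)   # causal mask
    probs = np.exp(scores - scores.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    return probs @ V

rng = np.random.default_rng(1)
T, d, r, dh = 5, 32, 4, 16              # toy sizes: latent r is much smaller than d
out = mla_attention(
    rng.standard_normal((T, d)),
    rng.standard_normal((d, r)) / np.sqrt(d),
    rng.standard_normal((r, dh)),
    rng.standard_normal((r, dh)),
    rng.standard_normal((d, dh)) / np.sqrt(d),
)
print(out.shape)  # (5, 16)
```

Because only the r-dimensional latent is cached per token, the KV-cache footprint drops by roughly a factor of 2·dh/r compared with caching full keys and values, which is what makes a 256K context tractable.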
🗣️ Supported Languages
In addition to native optimization for English and Hindi, the model supports a vast array of world languages, enabling global accessibility and diverse linguistic reasoning.
🤝 Acknowledgement & Collaboration
A Home-Grown Effort
This work represents a bottom-up initiative to develop large language models from scratch within India with limited resources. It reflects our humble, resource-constrained journey to contribute meaningfully to the open-source AI ecosystem and foster collaboration within the broader community.
Community Collaboration
We welcome contributions and open dialogue:
- Feedback: Share insights and report issues.
- Expansion: Collaborate on model improvements and extensions.
- Data: Contribute to dataset curation and evaluation.
- Innovation: Build innovative applications on top of this foundation.
Future versions will introduce better alignment, improved training scale, and more curated datasets. Together, we aim to evolve toward safer and more capable AI systems.
Note: For inquiries and future collaboration, contact us.