Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees • Paper • 2506.14606 • Published Jun 17, 2025 • 11
Filling the Gap for Uzbek: Creating Translation Resources for Southern Uzbek • Paper • 2508.14586 • Published Aug 20, 2025 • 3
Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters • Paper • 2507.13618 • Published Jul 18, 2025 • 16
Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh • Paper • 2503.01493 • Published Mar 3, 2025 • 1
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study • Paper • 2502.02481 • Published Feb 4, 2025 • 16
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference • Paper • 2412.13663 • Published Dec 18, 2024 • 158
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation • Paper • 2412.03304 • Published Dec 4, 2024 • 21
Open Language Data Initiative: Advancing Low-Resource Machine Translation for Karakalpak • Paper • 2409.04269 • Published Sep 6, 2024 • 11
dilmash release • Collection • Dilmash: Karakalpak Machine Translation • 5 items • Updated Sep 10, 2024 • 3
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset • Paper • 2309.04662 • Published Sep 9, 2023 • 24