ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development Paper • 2601.11077 • Published 7 days ago • 62
Post: More lightweight multimodal models are coming 👀 StepFun has been focused on multimodal AI from the very beginning. Their latest release is a new foundation model: STEP3-VL 🔥 https://huggingface.co/collections/stepfun-ai/step3-vl-10b
✨ 10B - Apache2.0
✨ Leads in the 10B class and competes with models 10–20× larger
Post: MOSS Transcribe Diarize 🔊 A multimodal model for Speaker-Attributed, Time-Stamped Transcription (SATS) from OpenMOSS. Paper: MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization (2601.01554). Repository: OpenMOSS-Team/MOSS-transcribe-diarize
✨ Single-pass end-to-end SATS
✨ 128k context, ~90 min audio
✨ Robust to overlap & noise
Klear: Unified Multi-Task Audio-Video Joint Generation Paper • 2601.04151 • Published 15 days ago • 15
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization Paper • 2601.01554 • Published 18 days ago • 54
MOSS Transcribe Diarize 🏢 Space • Running • Featured • 46: Transcribe audio/video files with speaker identification
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Paper • 2512.07525 • Published Dec 8, 2025 • 59
Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction Paper • 2512.04987 • Published Dec 4, 2025 • 80