JMMMU

non-profit

https://mmmu-japanese-benchmark.github.io/JMMMU/

AI & ML interests

None defined yet.

Recent Activity

AtsuMiyai

authored a paper 17 days ago

JMMMU-Pro: Image-based Japanese Multi-discipline Multimodal Understanding Benchmark via Vibe Benchmark Construction

Paper • 2512.14620 • Published 18 days ago • 1

AtsuMiyai

submitted a paper to Daily Papers 17 days ago

JMMMU-Pro: Image-based Japanese Multi-discipline Multimodal Understanding Benchmark via Vibe Benchmark Construction

Paper • 2512.14620 • Published 18 days ago • 1

AtsuMiyai

updated a Space 17 days ago

JMMMU-Pro Leaderboard

🥇

Evaluating LMMs on Image-based Japanese VQA

AtsuMiyai

updated a dataset 17 days ago

JMMMU/JMMMU-Pro

Viewer • Updated 17 days ago • 2.64k • 438 • 7

yuexiang96

authored 4 papers 17 days ago

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Paper • 2510.24702 • Published Oct 28, 2025 • 28

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Paper • 2510.25726 • Published Oct 29, 2025 • 45

Simulating Environments with Reasoning Models for Agent Training

Paper • 2511.01824 • Published Nov 3, 2025 • 2

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Paper • 2512.07783 • Published 26 days ago • 36

AtsuMiyai

published a Space 18 days ago

JMMMU-Pro Leaderboard

🥇

Evaluating LMMs on Image-based Japanese VQA

AtsuMiyai

published a dataset 27 days ago

JMMMU/JMMMU-Pro

Viewer • Updated 17 days ago • 2.64k • 438 • 7

AtsuMiyai

authored a paper about 2 months ago

Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper

Paper • 2511.04583 • Published Nov 6, 2025 • 2

yuki-imajuku

authored a paper 3 months ago

ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution

Paper • 2509.19349 • Published Sep 17, 2025 • 2

yuexiang96

authored 8 papers 6 months ago

Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time

Paper • 2504.12329 • Published Apr 12, 2025

Overtrained Language Models Are Harder to Fine-Tune

Paper • 2503.19206 • Published Mar 24, 2025 • 2

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

Paper • 2505.10185 • Published May 15, 2025 • 26

Team members 6
