
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

[🏠 Homepage] [📖 arXiv Paper] [🤗 Models & Datasets] [💻 Code (coming soon)]

Introduction

We introduce Bee-8B, a new state-of-the-art, fully open 8B Multimodal Large Language Model (MLLM) designed to close the performance gap with proprietary models by focusing on data quality.

Bee-8B is trained on our new Honey-Data-15M corpus, a high-quality supervised fine-tuning (SFT) dataset of approximately 15 million samples. This dataset was meticulously created with our transparent, adaptable, and open-source data curation pipeline, HoneyPipe, which systematically cleans noisy data and enriches it with a novel dual-level (short and long) Chain-of-Thought (CoT) strategy.

This dataset enables Bee-8B to achieve exceptional performance, particularly in complex reasoning, establishing a new standard for fully open MLLMs.

Key Features

  • High-Quality, Large-Scale Dataset: We release Honey-Data-15M, a new 15M-sample SFT corpus. It has undergone extensive cleaning to remove widespread noise and has been enriched with dual-level CoT reasoning to enhance advanced problem-solving capabilities.
  • Fully Open-Source Data Curation Suite: We provide not just the data, but the entire methodology. HoneyPipe and its underlying framework DataStudio offer the community a transparent and reproducible pipeline, moving beyond static dataset releases.
  • State-of-the-Art Open Model: Our model, Bee-8B, achieves state-of-the-art performance among fully open MLLMs and is highly competitive with recent semi-open models like InternVL3.5-8B, demonstrating the power of high-quality data.

News

  • [2025.12.17] 🔥 We have released all data and model weights across different stages. For the final stage (RL data), you can directly merge ViRL39K and MMK12 and use the VeRL framework for training.

  • [2025.11.03] 📊 Honey-Data-15M & Honey-Data-1M are Released! You can download the 15M full version and the 1M efficient version from Hugging Face.

  • [2025.10.20] 🚀 vLLM Support is Here! Bee-8B now supports high-performance inference with vLLM, enabling faster and more efficient deployment for production use cases (see the sketch after this list).

  • [2025.10.13] 🐝 Bee-8B is Released! Our model is now publicly available. You can download it from Hugging Face.
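
Below is a minimal sketch of what offline inference with vLLM could look like for the full Bee-8B model. The repository id `Open-Bee/Bee-8B`, the `trust_remote_code` flag, and the exact chat message format are assumptions not confirmed by this card; note that this Stage1 repository itself cannot be served (see the section below).

```python
# Minimal vLLM inference sketch (model id and prompt format are assumptions).
from vllm import LLM, SamplingParams

# Hypothetical repo id for the full, released Bee-8B checkpoint.
llm = LLM(model="Open-Bee/Bee-8B", trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.2, max_tokens=512)

# OpenAI-style chat messages; vLLM resolves image URLs for multimodal models.
messages = [{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
        {"type": "text", "text": "Describe the trend shown in this chart."},
    ],
}]

outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```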

Bee-8B-Stage1

This is NOT a complete model and cannot be used for inference directly.

This repository contains the MLP projector weights that bridge the vision encoder (SigLIP2) and the language model (Qwen3-8B).

Weights:

| Key | Shape | Description |
| --- | --- | --- |
| `model.multi_modal_projector.pre_norm.weight` | [1152] | Pre-normalization weight |
| `model.multi_modal_projector.pre_norm.bias` | [1152] | Pre-normalization bias |
| `model.multi_modal_projector.linear_1.weight` | [4096, 1152] | First linear layer |
| `model.multi_modal_projector.linear_1.bias` | [4096] | First linear bias |
| `model.multi_modal_projector.linear_2.weight` | [4096, 4096] | Second linear layer |
| `model.multi_modal_projector.linear_2.bias` | [4096] | Second linear bias |
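
The following is a minimal sketch of a projector module matching these shapes and how the released weights could be loaded into it. The checkpoint file name, the GELU activation, and the choice of LayerNorm for `pre_norm` (inferred from the weight/bias pair of size 1152) are assumptions; check the model configuration for the exact definitions.

```python
import torch
import torch.nn as nn
from safetensors.torch import load_file


class MultiModalProjector(nn.Module):
    """MLP projector mapping SigLIP2 vision features (1152-d) to the
    Qwen3-8B hidden size (4096-d), matching the shapes in the table above."""

    def __init__(self, vision_dim: int = 1152, text_dim: int = 4096):
        super().__init__()
        self.pre_norm = nn.LayerNorm(vision_dim)      # [1152] weight + bias
        self.linear_1 = nn.Linear(vision_dim, text_dim)  # [4096, 1152] weight, [4096] bias
        self.act = nn.GELU()                           # activation is an assumption
        self.linear_2 = nn.Linear(text_dim, text_dim)  # [4096, 4096] weight, [4096] bias

    def forward(self, vision_features: torch.Tensor) -> torch.Tensor:
        x = self.pre_norm(vision_features)
        x = self.linear_1(x)
        x = self.act(x)
        return self.linear_2(x)


# Load the released weights (file name is hypothetical) and strip the
# "model.multi_modal_projector." prefix so keys match the module above.
state_dict = load_file("model.safetensors")
prefix = "model.multi_modal_projector."
projector_weights = {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}

projector = MultiModalProjector()
projector.load_state_dict(projector_weights)

# Project a batch of dummy vision tokens into the language model's embedding space.
dummy = torch.randn(1, 729, 1152)   # (batch, num_vision_tokens, vision_dim)
print(projector(dummy).shape)        # torch.Size([1, 729, 4096])
```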

Acknowledgements

Bee-8B builds on the architectures and codebases of the following projects: R-4B, LLaVA-OneVision, SigLIP2, and Qwen3, and is evaluated with VLMEvalKit. We sincerely thank these projects for their outstanding contributions to the open-source community.
