ViDoRe V3 is our latest benchmark, designed for evaluating multimodal document retrieval in enterprise settings.
AI & ML interests
Retrieval, Computer Vision, LLM
Pre-trained checkpoints for the ColPali model.
Pre-trained checkpoints for the ColVision models with a ColSmolVLM backbone.
Benchmark for document retrieval using visual features, introduced in the ColPali paper. The datasets use the QA format.
Each page of the ViDoRe benchmark was passed to Unstructured, which partitioned it into text chunks. Detected figures and tables were captioned with Claude 3 Sonnet.
- vidore/arxivqa_test_subsampled_ocr_chunk
- vidore/docvqa_test_subsampled_ocr_chunk
- vidore/infovqa_test_subsampled_ocr_chunk
- vidore/tabfquad_test_subsampled_ocr_chunk
ViDoRe benchmark with the full OCR text of each page. ⚠️ This dataset serves as an intermediate step → use "ViDoRe Chunk OCR (baseline)" for evaluation!
Pre-trained checkpoints for the ColQwen2 model.
Models that can be used with the native transformers 🤗 implementation instead of colpali-engine.
- vidore/colqwen2-v1.0-hf (Visual Document Retrieval, 2B)
- vidore/colpali-v1.3-hf (Visual Document Retrieval, 3B)
- vidore/colpali-v1.2-hf (Visual Document Retrieval, 3B)
- Sahil-Kabir/colqwen2.5-v0.2-hf (4B)
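As a rough illustration of what using the native transformers implementation can look like, here is a minimal sketch that scores page images against text queries with one of the checkpoints listed above. It assumes a recent transformers release that ships `ColQwen2ForRetrieval` and `ColQwen2Processor`, plus `torch` and network access to download the weights. The function name `rank_pages` is hypothetical and not part of any library.

```python
from typing import List

MODEL_ID = "vidore/colqwen2-v1.0-hf"  # one of the checkpoints listed above


def rank_pages(queries: List[str], images: list, model_id: str = MODEL_ID):
    """Return a (num_queries, num_images) tensor of similarity scores.

    `images` is expected to be a list of PIL images, one per document page.
    """
    # Heavy dependencies are imported lazily so that importing this file stays cheap.
    import torch
    from transformers import ColQwen2ForRetrieval, ColQwen2Processor

    model = ColQwen2ForRetrieval.from_pretrained(model_id).eval()
    processor = ColQwen2Processor.from_pretrained(model_id)

    # Pages and queries are preprocessed separately and embedded by the same model.
    image_inputs = processor(images=images).to(model.device)
    query_inputs = processor(text=queries).to(model.device)

    with torch.no_grad():
        image_embeddings = model(**image_inputs).embeddings
        query_embeddings = model(**query_inputs).embeddings

    # Multi-vector scoring between every query and every page.
    return processor.score_retrieval(query_embeddings, image_embeddings)
```

Calling `rank_pages(["What does the chart show?"], [page_1, page_2])` with two PIL images should return one row of two scores, where a higher score means a better match. For the ColPali checkpoints, the analogous classes would presumably be `ColPaliForRetrieval` and `ColPaliProcessor`.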
Benchmark for document retrieval using visual features, introduced in the ColPali paper. The datasets use the BEIR format.
- vidore/arxivqa_test_subsampled_beir
- vidore/docvqa_test_subsampled_beir
- vidore/infovqa_test_subsampled_beir
- vidore/tabfquad_test_subsampled_beir
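To make the BEIR layout concrete, the sketch below builds a toy corpus, query set, and relevance judgments (qrels) in that shape and computes nDCG@5 over a made-up retrieval run. All IDs, texts, and scores are invented for illustration. The actual column and subset names in the datasets above may differ, and in ViDoRe the corpus entries are page images rather than strings.

```python
import math

# Toy data in the BEIR layout. Every ID, text, and score here is made up.
corpus = {"page-1": "<page image 1>", "page-2": "<page image 2>", "page-3": "<page image 3>"}
queries = {"q1": "What accuracy is reported?", "q2": "Which table lists the costs?"}
qrels = {"q1": {"page-2": 1}, "q2": {"page-1": 1, "page-3": 1}}  # query -> relevant docs

# Hypothetical retriever output: query_id -> {doc_id: similarity score}.
run = {
    "q1": {"page-1": 0.2, "page-2": 0.9, "page-3": 0.1},
    "q2": {"page-1": 0.3, "page-2": 0.8, "page-3": 0.5},
}


def dcg(relevances):
    """Discounted cumulative gain with linear gains (rank 1 has discount log2(2))."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg_at_k(qrels_for_query, scores_for_query, k=5):
    """nDCG@k for one query, given its relevance labels and retrieval scores."""
    ranked = sorted(scores_for_query, key=scores_for_query.get, reverse=True)[:k]
    gains = [qrels_for_query.get(doc_id, 0) for doc_id in ranked]
    ideal_dcg = dcg(sorted(qrels_for_query.values(), reverse=True)[:k])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(all_qrels, all_scores, k=5):
    """Average nDCG@k over every query that has relevance judgments."""
    return sum(ndcg_at_k(all_qrels[q], all_scores.get(q, {}), k) for q in all_qrels) / len(all_qrels)


print(f"nDCG@5 = {mean_ndcg(qrels, run):.4f}")
```

Keeping the corpus, queries, and qrels separate means a page is stored once even when several queries point to it, and a query can have more than one relevant page. With binary relevance labels, as in this toy example, linear and exponential gain formulations of nDCG give the same value.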
Main resources for the paper: "ColPali: Efficient Document Retrieval with Vision Language Models"
- ColPali: Efficient Document Retrieval with Vision Language Models (paper, arXiv:2407.01449)
- vidore/colpali (Visual Document Retrieval)
- vidore/colpali_train_set
- Vidore Leaderboard: browse and compare visual document retrieval models