VideoMamba: State Space Model for Efficient Video Understanding Paper • 2403.06977 • Published Mar 11, 2024 • 30
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning Paper • 2410.19702 • Published Oct 25, 2024 • 1
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling Paper • 2501.00574 • Published Dec 31, 2024 • 6
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling Paper • 2501.12386 • Published Jan 21, 2025 • 1
Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method Paper • 2501.00584 • Published Dec 31, 2024
Fine-grained Video-Text Retrieval: A New Benchmark and Method Paper • 2501.00513 • Published Dec 31, 2024
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model Paper • 2407.06491 • Published Jul 9, 2024
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Paper • 2403.15377 • Published Mar 22, 2024 • 26