BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published Mar 13 • 55
Efficient Feature Distillation for Zero-shot Annotation Object Detection Paper • 2303.12145 • Published Mar 21, 2023
ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation Paper • 2308.03793 • Published Aug 4, 2023 • 12
Implicit Neural Representation Facilitates Unified Universal Vision Encoding Paper • 2601.14256 • Published Jan 20 • 7
ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations Paper • 2606.11188 • Published 22 days ago • 26
SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization Paper • 2009.00726 • Published Sep 1, 2020
MixNorm: Test-Time Adaptation Through Online Normalization Estimation Paper • 2110.11478 • Published Oct 21, 2021
Large Language Models are Good Prompt Learners for Low-Shot Image Classification Paper • 2312.04076 • Published Dec 7, 2023
BitLM: Unlocking Multi-Token Language Generation with Bitwise Continuous Diffusion Paper • 2605.11577 • Published May 12
SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification Paper • 2103.16725 • Published Mar 30, 2021
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners Paper • 2504.14239 • Published Apr 19, 2025 • 14
Infi-MMR: Curriculum-based Unlocking Multimodal Reasoning via Phased Reinforcement Learning in Multimodal Small Language Models Paper • 2505.23091 • Published May 29, 2025
Infi-Med: Low-Resource Medical MLLMs with Robust Reasoning Evaluation Paper • 2505.23867 • Published May 29, 2025 • 1
InfiAlign: A Scalable and Sample-Efficient Framework for Aligning LLMs to Enhance Reasoning Capabilities Paper • 2508.05496 • Published Aug 7, 2025 • 9
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization Paper • 2508.05731 • Published Aug 7, 2025 • 27
DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling Paper • 2505.11196 • Published May 16, 2025 • 14