Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization Paper • 2509.09307 • Published Sep 11, 2025 • 6
BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement Paper • 2412.14203 • Published Dec 16, 2024 • 1
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information Paper • 2503.05085 • Published Mar 7, 2025 • 47
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols Paper • 2508.18240 • Published Aug 22, 2025
EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs Paper • 2509.09174 • Published Sep 11, 2025 • 60
On the Compositional Generalization of Multimodal LLMs for Medical Imaging Paper • 2412.20070 • Published Dec 28, 2024 • 45
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 104
StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal Paper • 2406.16864 • Published Jun 24, 2024 • 3
LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset Paper • 2312.12418 • Published Dec 19, 2023 • 2
HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale Paper • 2406.19280 • Published Jun 27, 2024 • 63
VDC: Versatile Data Cleanser for Detecting Dirty Samples via Visual-Linguistic Inconsistency Paper • 2309.16211 • Published Sep 28, 2023