Peng "Richard" Xia (夏鹏)
I am a Ph.D. student in the Department of Computer Science at The University of North Carolina at Chapel Hill (UNC-Chapel Hill), advised by Prof. Huaxiu Yao.
Before that, I was briefly enrolled (2023-2024) as a Ph.D. student at Monash University, advised by A/Prof. Zongyuan Ge.
Previously, I worked as a research intern at Tongyi Lab (Alibaba) and at Microsoft Research.
My research focuses on developing intelligent multimodal agents that can perceive, comprehend, and reason over diverse, dynamic data. In particular, my work uses retrieval-based methods and multi-step tool interaction to improve these systems' reasoning capabilities and decision-making efficiency, primarily in medical scenarios.
I am always open to collaboration. Feel free to drop me an e-mail. :-)
I’m open to summer 2026 internships. Feel free to reach out!
Email: richard.peng.xia AT gmail DOT com; pxia AT cs DOT unc DOT edu
|
|
News
- May 2025: One paper was accepted by ICML 2025.
- Jan. 2025: Three papers were accepted by ICLR 2025, and MMIE was selected for an oral presentation.
- Dec. 2024: Gave an invited talk at Cohere For AI; one paper was accepted by COLING 2025 and two papers were accepted by AAAI 2025.
- Sep. 2024: One paper was accepted by NeurIPS 2024 and one paper was accepted by EMNLP 2024.
- Jul. 2024: One paper was accepted by ECCV 2024.
- Jun. 2024: Two papers were accepted by MICCAI 2024, one of which was accepted early.
- Sep. 2023: One paper was accepted by NeurIPS 2023.
- Aug. 2022: Shared a paper list on multimodal learning in medical imaging.
|
WebWatcher: Breaking New Frontiers of Vision-Language Deep Research Agent
Xinyu Geng*, Peng Xia*, Zhen Zhang*, Xinyu Wang, Qiuchen Wang, Ruixue Ding, Chenxi Wang, Jialong Wu, Yida Zhao, Kuan Li, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou
arXiv preprint, 2025.
[Paper]
[Code]
|
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
Xiangru Tang, Tianrui Qin, Tianhao Peng, Ziyang Zhou, Daniel Shao, Tingting Du, Xinming Wei, Peng Xia, Fang Wu, He Zhu, Ge Zhang, Jiaheng Liu, Xingyao Wang, Sirui Hong, Chenglin Wu, Hao Cheng, Chi Wang, Wangchunshu Zhou
arXiv preprint, 2025.
[Paper]
[Code]
|
MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning
Peng Xia, Jinglu Wang, Yibo Peng, Kaide Zeng, Xian Wu, Xiangru Tang, Hongtu Zhu, Yun Li, Yan Lu, Huaxiu Yao
arXiv preprint, 2025.
[Paper]
[Code]
|
MMIE: Massive Multimodal Interleaved Comprehension Benchmark For Large Vision-Language Models
Peng Xia*, Siwei Han*, Shi Qiu*, Yiyang Zhou, Zhaoyang Wang, Wenhao Zheng, Zhaorun Chen, Chenhang Cui, Mingyu Ding, Linjie Li, Lijuan Wang, Huaxiu Yao
International Conference on Learning Representations (ICLR), 2025. (Oral)
[Paper]
[Code]
[Project Page]
|
AnyPrefer: An Automatic Framework for Preference Data Synthesis
Yiyang Zhou*, Zhaoyang Wang*, Tianle Wang*, Shangyu Xing, Peng Xia, Bo Li, Kaiyuan Zheng, Zijian Zhang, Zhaorun Chen, Wenhao Zheng, Xuchao Zhang, Chetan Bansal, Weitong Zhang, Ying Wei, Mohit Bansal, Huaxiu Yao
International Conference on Learning Representations (ICLR), 2025.
[Paper]
|
CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
Peng Xia, Ze Chen, Juanxi Tian, Yangrui Gong, Ruibo Hou, Yue Xu, Zhenbang Wu, Zhiyuan Fan, Yiyang Zhou, Kangyu Zhu, Wenhao Zheng, Zhaoyang Wang, Xiao Wang, Xuchao Zhang, Chetan Bansal, Marc Niethammer, Junzhou Huang, Hongtu Zhu, Yun Li, Jimeng Sun, Zongyuan Ge, Gang Li, James Zou, Huaxiu Yao
The Conference on Neural Information Processing Systems (NeurIPS), 2024.
[Paper]
[Code]
[Project Page]
|
|
Patents
- Model Training Method, Device, Electronic Device and Storage Medium
  Peng Xia, Tong Ma, Ming Hu, Lie Ju, Bin Wang, Zongyuan Ge, Dalei Zhang
  CN Patent, CN116994100A, 2023.
- Model Training Method, Fundus Image Prediction Method and Device
  Peng Xia, Lie Ju, Ming Hu, Tong Ma, Bin Wang, Kaimin Song, Zongyuan Ge, Dalei Zhang
  CN Patent, CN115620384A, 2023.
- Apparatus and Computer-Readable Storage Medium for Predicting Cognitive Impairment
  Peng Xia, Tong Ma, Ming Hu, Lie Ju, Bin Wang, Zongyuan Ge, Dalei Zhang
  CN Patent, CN115590481A, 2022.
|
|
Selected Honors & Awards
- KDD 2025 Health Day Distinguished Vision Award, 2025
- ICLR Travel Award, 2025
- ICLR Oral Presentation (Top 1.8%), 2025
- Third Place, Shanghai-HK Interdisciplinary Shared Tasks (Task 1), 2022
- Second Prize, The 3rd Huawei DIGIX AI Algorithm Contest, 2021
- Honorable Mention, Mathematics Contest in Modeling, 2021
|
|
Academic Services
- Area Chair: ACL Rolling Review (ARR) (2025)
- Journal/Conference Reviewer: NeurIPS (2024-2025), NeurIPS D&B Track (2024-2025), ICML (2024-2025), ICLR (2025), CVPR (2025), ICCV (2025), AAAI (2026), MICCAI (2024-2025), WACV (2025-2026), ACL Rolling Review (ARR) (2024-2025), International Journal of Computer Vision (IJCV), IEEE Transactions on Medical Imaging (TMI), Knowledge-Based Systems (KBS), Expert Systems with Applications (ESWA)
- Student Volunteer: EMNLP (2024)
- Workshop Co-Organizer: ICML 2025 Workshop on Reliable and Responsible Foundation Models
|
© Peng Xia
|