Yuansheng Ni
Verified email at uwaterloo.ca
Title
Cited by
Year
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
X Yue, Y Ni, K Zhang, T Zheng, R Liu, G Zhang, S Stevens, D Jiang, ...
🏆 CVPR 2024 (Best Paper Finalist), 2023
Cited by: 511 | Year: 2023
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Y Wang, X Ma, G Zhang, Y Ni, A Chandra, S Guo, W Ren, A Arulraj, X He, ...
🏆 NeurIPS D&B 2024 (Spotlight), 2024
Cited by: 111* | Year: 2024
A Comprehensive Study of Knowledge Editing for Large Language Models
N Zhang, Y Yao, B Tian, P Wang, S Deng, M Wang, Z Xi, S Mao, J Zhang, ...
arXiv preprint arXiv:2401.01286, 2024
Cited by: 97* | Year: 2024
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
P Wang, N Zhang, B Tian, Z Xi, Y Yao, Z Xu, M Wang, S Mao, X Wang, ...
ACL SDT 2024, 2023
Cited by: 79* | Year: 2023
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
X Yue, T Zheng, Y Ni, Y Wang, K Zhang, S Tong, Y Sun, M Yin, B Yu, ...
arXiv preprint arXiv:2409.02813, 2024
Cited by: 26 | Year: 2024
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
X He, D Jiang, G Zhang, M Ku, A Soni, S Siu, H Chen, A Chandra, Z Jiang, ...
EMNLP Main 2024, 2024
Cited by: 19 | Year: 2024
GenAI Arena: An Open Evaluation Platform for Generative Models
D Jiang, M Ku, T Li, Y Ni, S Sun, R Fan, W Chen
NeurIPS D&B 2024, 2024
Cited by: 9 | Year: 2024
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
J Chen, T Liang, S Siu, Z Wang, K Wang, Y Wang, Y Ni, W Zhu, Z Jiang, ...
arXiv preprint arXiv:2410.10563, 2024
Cited by: 3 | Year: 2024
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models
Z Liu, F Fang, X Feng, X Du, C Zhang, Z Wang, Y Bai, Q Zhao, L Fan, ...
NeurIPS D&B 2024, 2024
Cited by: 3 | Year: 2024