The rise and potential of large language model based agents: A survey Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ... arXiv preprint arXiv:2309.07864, 2023 | 280 | 2023 |
Delve into PPO: Implementation matters for stable RLHF R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Y Zhou, ... NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023 | 33* | 2023 |
Self-polish: Enhance reasoning in large language models via problem refinement Z Xi, S Jin, Y Zhou, R Zheng, S Gao, T Gui, Q Zhang, X Huang arXiv preprint arXiv:2305.14497, 2023 | 15 | 2023 |
Safety and Ethical Concerns of Large Language Models Z Xi, R Zheng, T Gui Proceedings of the 22nd Chinese National Conference on Computational …, 2023 | 11 | 2023 |
Secrets of RLHF in large language models part II: Reward modeling B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ... arXiv preprint arXiv:2401.06080, 2024 | 10 | 2024 |
Efficient Adversarial Training with Robust Early-bird Tickets Z Xi, R Zheng, T Gui, Q Zhang, X Huang The 2022 Conference on Empirical Methods in Natural Language Processing, 2022 | 9 | 2022 |
LoRAMoE: Revolutionizing mixture of experts for maintaining world knowledge in language model alignment S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou, Z Xi, X Wang, ... arXiv preprint arXiv:2312.09979, 2023 | 8 | 2023 |
Towards understanding the capability of large language models on code clone detection: a survey S Dou, J Shan, H Jia, W Deng, Z Xi, W He, Y Wu, T Gui, Y Liu, X Huang arXiv preprint arXiv:2308.01191, 2023 | 5 | 2023 |
Characterizing the impacts of instances on robustness R Zheng, Z Xi, Q Liu, W Lai, T Gui, Q Zhang, XJ Huang, J Ma, Y Shan, ... Findings of the Association for Computational Linguistics: ACL 2023, 2314-2332, 2023 | 3 | 2023 |
Improving generalization of alignment with human preferences through group invariant learning R Zheng, W Shen, Y Hua, W Lai, S Dou, Y Zhou, Z Xi, X Wang, H Huang, ... arXiv preprint arXiv:2310.11971, 2023 | 2 | 2023 |
Connectivity Patterns are Task Embeddings Z Xi, R Zheng, Y Zhang, XJ Huang, Z Wei, M Peng, M Sun, Q Zhang, T Gui Findings of the Association for Computational Linguistics: ACL 2023, 11993-12013, 2023 | 2 | 2023 |
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning Z Xi, W Chen, B Hong, S Jin, R Zheng, W He, Y Ding, S Liu, X Guo, ... arXiv preprint arXiv:2402.05808, 2024 | 1 | 2024 |
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback S Dou, Y Liu, H Jia, L Xiong, E Zhou, J Shan, C Huang, W Shen, X Fan, ... arXiv preprint arXiv:2402.01391, 2024 | 1 | 2024 |
TRACE: A comprehensive benchmark for continual learning in large language models X Wang, Y Zhang, T Chen, S Gao, S Jin, X Yang, Z Xi, R Zheng, Y Zou, ... arXiv preprint arXiv:2310.06762, 2023 | 1 | 2023 |
ORTicket: Let One Robust BERT Ticket Transfer across Different Tasks Y Zhou, W Chen, R Zheng, Z Xi, T Gui, Q Zhang, XJ Huang Proceedings of the 2024 Joint International Conference on Computational …, 2024 | | 2024 |
Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models W He, S Liu, J Zhao, Y Ding, Y Lu, Z Xi, T Gui, Q Zhang, X Huang arXiv preprint arXiv:2404.00884, 2024 | | 2024 |
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals R Zheng, Y Zhou, Z Xi, T Gui, Q Zhang, X Huang arXiv preprint arXiv:2403.16176, 2024 | | 2024 |
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models W Zhou, X Wang, L Xiong, H Xia, Y Gu, M Chai, F Zhu, C Huang, S Dou, ... arXiv preprint arXiv:2403.12171, 2024 | | 2024 |
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions Y Zhang, X Wang, Z Xi, H Xia, T Gui, Q Zhang, X Huang arXiv preprint arXiv:2402.16431, 2024 | | 2024 |
MouSi: Poly-Visual-Expert Vision-Language Models X Fan, T Ji, C Jiang, S Li, S Jin, S Song, J Wang, B Hong, L Chen, ... arXiv preprint arXiv:2401.17221, 2024 | | 2024 |