Title | Authors | Venue | Cited by | Year |
Pre-trained models: Past, present and future | X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu, Y Yao, A Zhang, ... | AI Open 2, 225-250, 2021 | 868 | 2021 |
PPT: Pre-trained prompt tuning for few-shot learning | Y Gu, X Han, Z Liu, M Huang | arXiv preprint arXiv:2109.04332, 2021 | 455 | 2021 |
Knowledge distillation of large language models | Y Gu, L Dong, F Wei, M Huang | arXiv preprint arXiv:2306.08543, 2023 | 224* | 2023 |
CPM: A large-scale generative Chinese pre-trained language model | Z Zhang, X Han, H Zhou, P Ke, Y Gu, D Ye, Y Qin, Y Su, H Ji, J Guan, F Qi, ... | AI Open 2, 93-99, 2021 | 120 | 2021 |
Adapting meta knowledge graph information for multi-hop reasoning over few-shot relations | X Lv, Y Gu, X Han, L Hou, J Li, Z Liu | arXiv preprint arXiv:1908.11513, 2019 | 106 | 2019 |
CPM-2: Large-scale cost-effective pre-trained language models | Z Zhang, Y Gu, X Han, S Chen, C Xiao, Z Sun, Y Yao, F Qi, J Guan, P Ke, ... | AI Open 2, 216-224, 2021 | 93 | 2021 |
Train no evil: Selective masking for task-guided pre-training | Y Gu, Z Zhang, X Wang, Z Liu, M Sun | arXiv preprint arXiv:2004.09733, 2020 | 67 | 2020 |
EVA: An open-domain Chinese dialogue system with large-scale generative pre-training | H Zhou, P Ke, Z Zhang, Y Gu, Y Zheng, C Zheng, Y Wang, CH Wu, H Sun, ... | arXiv preprint arXiv:2108.01547, 2021 | 48 | 2021 |
EVA2.0: Investigating open-domain Chinese dialogue systems with large-scale pre-training | Y Gu, J Wen, H Sun, Y Song, P Ke, C Zheng, Z Zhang, J Yao, L Liu, X Zhu, ... | Machine Intelligence Research 20 (2), 207-219, 2023 | 41 | 2023 |
Structured prompting: Scaling in-context learning to 1,000 examples | Y Hao, Y Sun, L Dong, Z Han, Y Gu, F Wei | arXiv preprint arXiv:2212.06713, 2022 | 41 | 2022 |
Pre-training to learn in context | Y Gu, L Dong, F Wei, M Huang | arXiv preprint arXiv:2305.09137, 2023 | 34 | 2023 |
Synthetic data (almost) from scratch: Generalized instruction tuning for language models | H Li, Q Dong, Z Tang, C Wang, X Zhang, H Huang, S Huang, X Huang, ... | arXiv preprint arXiv:2402.13064, 2024 | 23 | 2024 |
When does further pre-training MLM help? An empirical study on task-oriented dialog pre-training | Q Zhu, Y Gu, L Luo, B Li, C Li, W Peng, M Huang, X Zhu | Proceedings of the Second Workshop on Insights from Negative Results in NLP …, 2021 | 20 | 2021 |
CUGE: A Chinese language understanding and generation evaluation benchmark | Y Yao, Q Dong, J Guan, B Cao, Z Zhang, C Xiao, X Wang, F Qi, J Bao, ... | arXiv preprint arXiv:2112.13610, 2021 | 15 | 2021 |
Learning instructions with unlabeled data for zero-shot cross-task generalization | Y Gu, P Ke, X Zhu, M Huang | arXiv preprint arXiv:2210.09175, 2022 | 12 | 2022 |
Instruction Pre-Training: Language Models are Supervised Multitask Learners | D Cheng, Y Gu, S Huang, J Bi, M Huang, F Wei | arXiv preprint arXiv:2406.14491, 2024 | 10 | 2024 |
Direct preference knowledge distillation for large language models | Y Li, Y Gu, L Dong, D Wang, Y Cheng, F Wei | arXiv preprint arXiv:2406.19774, 2024 | 1 | 2024 |
Many-Class Text Classification with Matching | Y Song, Y Gu, M Huang | arXiv preprint arXiv:2205.11409, 2022 | 1 | 2022 |
MiniPLM: Knowledge Distillation for Pre-Training Language Models | Y Gu, H Zhou, F Meng, J Zhou, M Huang | arXiv preprint arXiv:2410.17215, 2024 | | 2024 |
Data Selection via Optimal Control for Language Models | Y Gu, L Dong, H Wang, Y Hao, Q Dong, F Wei, M Huang | arXiv preprint arXiv:2410.07064, 2024 | | 2024 |