Advancing transformer architecture in long-context large language models: A comprehensive survey

Published in arXiv preprint arXiv:2311.12351, 2023

Recommended citation: Huang, Yunpeng; Xu, Jingwei; Lai, Junyu; Jiang, Zixu; Chen, Taolue; Li, Zenan; Yao, Yuan; Ma, Xiaoxing; Yang, Lijuan; Chen, Hao. (2023). Advancing transformer architecture in long-context large language models: A comprehensive survey. arXiv preprint arXiv:2311.12351.