M. Shoeybi, M. Patwary, R. Puri, P. LeGresley, J. Casper, and B. Catanzaro, “Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism,” Mar. 13, 2020, _arXiv_: arXiv:1909.08053. doi: [10.48550/arXiv.1909.08053](https://doi.org/10.48550/arXiv.1909.08053).
- NLP development: consider reading WeChat official-account articles
- http://arxiv.org/abs/1606.08415: GELU (GPT-2, BERT)
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (an improvement on BERT)
- weak scaling
S. Choi, I. Koo, J. Ahn, M. Jeon, and Y. Kwon, “EnvPipe: Performance-preserving DNN Training Framework for Saving Energy,” presented at the 2023 USENIX Annual Technical Conference (USENIX ATC 23), 2023, pp. 851–864. Accessed: Oct. 24, 2024. [Online]. Available: https://www.usenix.org/conference/atc23/presentation/choi
A. Faiz et al., “LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models,” Jan. 19, 2024, _arXiv_: arXiv:2309.14393. doi: [10.48550/arXiv.2309.14393](https://doi.org/10.48550/arXiv.2309.14393).
A. K. Kakolyris, D. Masouros, P. Vavaroutsos, S. Xydis, and D. Soudris, “SLO-aware GPU Frequency Scaling for Energy Efficient LLM Inference Serving,” Aug. 05, 2024, _arXiv_: arXiv:2408.05235. doi: [10.48550/arXiv.2408.05235](https://doi.org/10.48550/arXiv.2408.05235).
B. Li, S. Samsi, V. Gadepally, and D. Tiwari, “Clover: Toward Sustainable AI with Carbon-Aware Machine Learning Inference Service,” in _Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis_, in SC ’23. New York, NY, USA: Association for Computing Machinery, Nov. 2023, pp. 1–15. doi: [10.1145/3581784.3607034](https://doi.org/10.1145/3581784.3607034).
J. Stojkovic, C. Zhang, Í. Goiri, J. Torrellas, and E. Choukse, “DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency,” Aug. 01, 2024, _arXiv_: arXiv:2408.00741. doi: [10.48550/arXiv.2408.00741](https://doi.org/10.48550/arXiv.2408.00741).
- InfiniGen