2024-5
week2-3(5.6-5.19)
- Surveyed large language models: BERT, GPT, GPT-2, GPT-3
- Surveyed the parallelism part of systems for LLMs:
  - data parallelism
  - tensor parallelism
  - pipeline parallelism
  - expert parallelism
  - hybrid parallelism
- Watched some of Mu Li's (李沐) AI lecture videos on the Transformer and GNNs
week4(5.20-5.26)
- Survey memory optimization:
  - memory swap
  - zero redundancy
  - mixed precision training
  - checkpoint and recomputation
- Try deploying Llama 2 and Llama 3 on the server
- Read the GPT-4 technical report
- Survey LLM alignment techniques