Category: Machine Learning

2023

12-31

Megatron-LM Source Code Series (6): Distributed-Optimizer Implementation, Part 1

12-23

The FP16 Data Format Explained

12-21

Megatron-LM Source Code Series (5): Using FP16

10-17

Causal Attention Paper Explained

09-25

Megatron-LM Source Code Series (4): Recomputation (recompute)

08-15

PyTorch LayerNorm Source Code Explained

08-06

Grouped Query Attention Paper Reading Notes

07-29

LLaMA-2 Paper Reading Notes

07-28

Megatron-LM Source Code Series (3): Pipeline Model Parallel Training Implementation Explained

07-23

Megatron-LM Source Code Series (2): Tensor Model Parallel and Sequence Model Parallel Training