Internal structure of LlamaDecoderLayer

(self_attn): LlamaAttention(
  (q_proj): Linear(in_features=2048, out_features=2048, bias=False)
  (k_proj): Linear(in_features=2048, out_features=512, bias=False)
  (v_proj): Linear(in_features=2048, out_features=512, bias=False)
  (o_proj): Linear(in_features=2048, out_features=2048, bias=False)
  (rotary_emb): LlamaRotaryEmbedding()
)
(mlp): LlamaMLP(
  (gate_proj): Linear(in_features=2048, out_features=8192, bias=False)
  (up_proj): Linear(in_features=2048, out_features=8192, bias=False)
  (down_proj): Linear(in_features=8192, out_features=2048, bias=False)
  (act_fn): SiLU()
)
(input_layernorm): LlamaRMSNorm((2048,), eps=1e-05)
(post_attention_layernorm): LlamaRMSNorm((2048,), eps=1e-05)
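
The printed shapes are enough to work out how many parameters one decoder layer holds and how they split between attention and the MLP. The sketch below is only an illustration: the dictionary names and the helper `count_layer_params` are hypothetical, the shapes are copied from the printout, and it assumes each LlamaRMSNorm carries a single learnable scale vector of size 2048 while LlamaRotaryEmbedding contributes no learned weights.

```python
# Shapes copied from the printed module: (in_features, out_features).
# Every Linear has bias=False, so each contributes in_features * out_features weights.
LINEAR_SHAPES = {
    "self_attn.q_proj": (2048, 2048),
    "self_attn.k_proj": (2048, 512),
    "self_attn.v_proj": (2048, 512),
    "self_attn.o_proj": (2048, 2048),
    "mlp.gate_proj": (2048, 8192),
    "mlp.up_proj": (2048, 8192),
    "mlp.down_proj": (8192, 2048),
}

# Assumption: each RMSNorm holds one scale vector of the hidden size.
NORM_SIZES = {
    "input_layernorm": 2048,
    "post_attention_layernorm": 2048,
}


def count_layer_params():
    """Return a dict mapping each submodule name to its parameter count."""
    counts = {name: n_in * n_out for name, (n_in, n_out) in LINEAR_SHAPES.items()}
    counts.update(NORM_SIZES)
    return counts


counts = count_layer_params()
attn_params = sum(v for k, v in counts.items() if k.startswith("self_attn."))
mlp_params = sum(v for k, v in counts.items() if k.startswith("mlp."))
total_params = sum(counts.values())

# Ratio of the q_proj output width to the k_proj output width.
q_to_kv_ratio = LINEAR_SHAPES["self_attn.q_proj"][1] // LINEAR_SHAPES["self_attn.k_proj"][1]

print(f"attention: {attn_params:,}")
print(f"mlp:       {mlp_params:,}")
print(f"total:     {total_params:,}")
print(f"q/kv width ratio: {q_to_kv_ratio}")
```

Under these assumptions the layer has 60,821,504 parameters, of which 50,331,648 sit in the MLP and 10,485,760 in attention. The k_proj and v_proj outputs are 4x narrower than q_proj, which is consistent with grouped-query attention, although the printout itself does not show the head counts.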

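LLM Intelligent Application Development

Learning path for LLM architecture

The classic Transformer architecture

LLaMA model architecture

HF LLaMA model architecture

Internal structure of LlamaDecoderLayer

Focus of this lecture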

Input embedding

How input embedding works

Building a vocabulary for LLMs

Tokenization methods

Trying out the LLaMA 3 tokenizer

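Positional encoding (positional embeddings)

Motivation for positional encoding

Absolute positional encoding

How positional encoding relates to ordinal encoding

Periodicity of ordinal numbers

Sinusoidal PE

Rotary positional encoding (Rotary PE)

Understanding Rotary PE in 2D

RoPE implementation

Visualizing Rotary PE

How RoPE is built in LLaMA

Further reading and references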