OneRec Decoder

- point-wise generation paradigm
- Decoder input #card
- a learnable beginning-of-sequence token concatenated with the video’s semantic identifiers
- Emb_lookup
- The sequence is processed by L_dec Transformer layers #card
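
A minimal sketch of the decoder-input card above, in plain Python. It assumes that Emb_lookup is an ordinary table lookup and that the beginning-of-sequence vector is simply prepended; the names `build_decoder_input`, `emb_table`, and `bos_embedding`, the sizes, and the single shared table are illustrative assumptions, not taken from the source.

```python
import random

def build_decoder_input(semantic_ids, emb_table, bos_embedding):
    """Return the [BOS] embedding followed by one embedding per semantic ID."""
    # Emb_lookup: each semantic identifier indexes a row of the embedding table.
    # Assumption: one shared table; a real model may use one table per ID level.
    return [bos_embedding] + [emb_table[s] for s in semantic_ids]

random.seed(0)
d_model, codebook_size = 4, 8  # hypothetical sizes
emb_table = [[random.gauss(0, 1) for _ in range(d_model)]
             for _ in range(codebook_size)]
bos_embedding = [0.0] * d_model  # learnable in practice; zeros as a placeholder

# Three semantic identifiers for one video (made-up values).
decoder_input = build_decoder_input([5, 2, 7], emb_table, bos_embedding)
```

The resulting sequence has one more position than the number of semantic identifiers, which is what the L_dec Transformer layers would then consume.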

- Mixture of Experts (MoE) feed-forward network #card
- top-k routing strategy
- loss-free load balancing strategy
- Training objective #card
- NTP cross-entropy loss
- cross-entropy loss for next-token prediction on the semantic identifiers of target video m
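
The routing and training-objective cards above can be sketched in plain Python. The notes do not spell out how the loss-free balancing works, so this sketch assumes the bias-adjustment variant: a per-expert bias affects only which experts are selected, and is nudged after each step according to observed load. The function names, the step size, and the choice to average the loss over positions are all assumptions made for illustration.

```python
import math

def route_top_k(scores, bias, k):
    """Pick k experts by (score + bias); gate weights use the unbiased scores."""
    ranked = sorted(range(len(scores)),
                    key=lambda e: scores[e] + bias[e], reverse=True)
    chosen = ranked[:k]
    total = sum(scores[e] for e in chosen)
    return {e: scores[e] / total for e in chosen}

def update_bias(bias, load, step=0.01):
    """Lower the bias of overloaded experts, raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    return [b - step if n > mean_load else b + step if n < mean_load else b
            for b, n in zip(bias, load)]

def ntp_cross_entropy(logits_per_step, target_ids):
    """Mean negative log-likelihood of each target semantic ID."""
    losses = []
    for logits, target in zip(logits_per_step, target_ids):
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_z - logits[target])
    return sum(losses) / len(losses)

# Expert 0 has the highest raw score, but its negative bias demotes it.
scores = [0.5, 0.3, 0.2]  # positive affinities, e.g. sigmoid outputs
gates = route_top_k(scores, bias=[-0.4, 0.0, 0.0], k=2)

# Expert 0 handled 6 tokens against a mean of 3, so its bias goes down.
new_bias = update_bias([0.0, 0.0, 0.0], load=[6, 2, 1])

# Uniform logits over a 2-entry codebook give a loss of log(2) per position.
loss = ntp_cross_entropy([[0.0, 0.0], [0.0, 0.0]], [0, 1])
```

Because the bias only changes which experts are selected and never enters the gate weights or the loss, balancing adds no auxiliary term to the NTP objective.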