Learning from the Self-future: On-policy Self-distillation for dLLMs

Yifu Luo et al.
