
Question about Self-Forcing-Plus

yjhong89 opened this issue 1 month ago

Hi!

I am looking into the Self-Forcing-Plus repository, which implements Self-Forcing with DMD:

  • https://github.com/GoatWu/Self-Forcing-Plus/tree/wan22

@GoatWu

  • There are two training pipelines supported in SelfForcingModel, and I wonder how SelfForcingTrainingPipeline and BidirectionalTrainingPipeline differ. (I included a rough sketch of my current understanding after this list.)
    • https://github.com/GoatWu/Self-Forcing-Plus/blob/f947063ef694ade8f212c1806aed2686bdfa721a/model/base.py#L229
    • Why is generator_type set differently in these two configs?
      • https://github.com/GoatWu/Self-Forcing-Plus/blob/f947063ef694ade8f212c1806aed2686bdfa721a/configs/self_forcing_14b_i2v_dmd.yaml#L8
      • https://github.com/GoatWu/Self-Forcing-Plus/blob/f947063ef694ade8f212c1806aed2686bdfa721a/configs/self_forcing_dmd.yaml#L7
  • Why is the boundary between the high-noise and low-noise experts set to 0.5? The original Wan2.2 uses 0.9 for this value. (My understanding of how the boundary is used is in the second sketch after this list.)
    • https://github.com/GoatWu/Self-Forcing-Plus/blob/f947063ef694ade8f212c1806aed2686bdfa721a/configs/wan22_high_i2v.yaml#L16
  • Why do recent approaches use Self-Forcing together with DMD, rather than using DMD alone without Self-Forcing or any causality?
    • From what I read in the docs, DMD was originally designed for image generation, so Self-Forcing is adopted on top of it, perhaps for training stability?
    • I still don't understand why Self-Forcing (or CausVid) combines so naturally with DMD (or other distillation methods). Any advice?
  • If I apply the Self-Forcing (with DMD) training algorithm to a bidirectional model (e.g. Wan2.1/Wan2.2), can the distilled model then act as an autoregressive video generator?
  • What is the difference between step distillation and autoregressive distillation as described in the docs?
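
For the first question, here is a rough sketch of how I currently picture the two rollouts differing. This is not the repo's code; all names and signatures below are made up for illustration, so please correct me if my mental model is wrong:

```python
import torch

# Conceptual sketch only, not the actual Self-Forcing-Plus code.

def self_forcing_rollout(generator, noise_chunks, denoising_steps):
    # Causal / autoregressive: generate the video chunk by chunk, with each
    # chunk conditioned only on chunks the generator itself already produced.
    generated = []
    for chunk in noise_chunks:                      # temporal order
        x = chunk
        for t in denoising_steps:                   # few-step denoising per chunk
            x = generator(x, t, context=generated)  # attends only to past chunks
        generated.append(x)
    return torch.cat(generated, dim=1)

def bidirectional_rollout(generator, noise, denoising_steps):
    # Bidirectional: denoise all frames of the clip jointly; every frame can
    # attend to every other frame, so there is no causality and no KV cache.
    x = noise
    for t in denoising_steps:
        x = generator(x, t)
    return x
```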
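
And for the boundary question, this is how I understand the timestep-based routing between the two experts (again only an illustrative sketch with made-up names, where boundary corresponds to the value set in wan22_high_i2v.yaml):

```python
def pick_expert(timestep, num_train_timesteps=1000, boundary=0.5):
    # My understanding of the Wan2.2 two-expert routing: the high-noise expert
    # handles the noisy early part of the trajectory, the low-noise expert the rest.
    # With boundary=0.9 (original Wan2.2) the high-noise expert covers t >= 900;
    # with boundary=0.5 it covers t >= 500.
    if timestep >= boundary * num_train_timesteps:
        return "high_noise_expert"
    return "low_noise_expert"
```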

Thanks.

yjhong89 · Nov 24 '25 08:11