Rerender_A_Video
DDIM sampling does not use x0_strength to skip steps and always iterates for all ddim_steps=20, which is time-consuming.
The sampling loop starts at https://github.com/williamyang1991/Rerender_A_Video/blob/dfaf9d8825f226a2f0a0b731ab2adc84a3f2ebd2/src/ddim_v_hacked.py#L300
When x0_strength is small, a lot of time is wasted before the loop reaches https://github.com/williamyang1991/Rerender_A_Video/blob/dfaf9d8825f226a2f0a0b731ab2adc84a3f2ebd2/src/ddim_v_hacked.py#L306
The img blended in the earlier iterations is useless, since it gets replaced at https://github.com/williamyang1991/Rerender_A_Video/blob/dfaf9d8825f226a2f0a0b731ab2adc84a3f2ebd2/src/ddim_v_hacked.py#L308
Maybe we can directly do:
    for i, step in enumerate(time_range):
        index = total_steps - i - 1
        if strength >= 0 and i == int(total_steps * strength) and x0 is not None:
            ts = torch.full((b, ), step, device=device, dtype=torch.long)
            break
    img = self.model.q_sample(x0, ts)
to get x_t, and then denoise starting from this timestep. For the controller, we would always fetch the last items:
x0 = F.instance_norm(x0) * self.step_store['first_ada'][-1] + self.step_store['first_ada'][-2]
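To make the idea concrete, here is a rough, framework-free sketch of the skip-ahead logic. It is not the repository's code: `split_time_range` and this `q_sample` are hypothetical stand-ins, the 20-step `time_range` is made up for illustration, and `q_sample` just uses the standard forward-diffusion formula on plain Python lists instead of `self.model.q_sample` on tensors.

```python
import math


def split_time_range(time_range, strength):
    """Hypothetical helper: find where x0 would be injected.

    Mirrors the condition `i == int(total_steps * strength)` and returns
    (start_step, remaining_steps). If that index is never hit, no injection
    happens and every step still has to run.
    """
    total_steps = len(time_range)
    start = int(total_steps * strength) if strength >= 0 else -1
    if not 0 <= start < total_steps:
        return None, list(time_range)
    return time_range[start], list(time_range[start:])


def q_sample(x0, alpha_cumprod_t, noise):
    """Stand-in for the forward diffusion step:
    x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * noise.
    """
    a = math.sqrt(alpha_cumprod_t)
    s = math.sqrt(1.0 - alpha_cumprod_t)
    return [a * x + s * n for x, n in zip(x0, noise)]


# Assumed example schedule: 20 descending timesteps (951, 901, ..., 1).
time_range = list(range(951, 0, -50))
start_step, remaining = split_time_range(time_range, strength=0.75)
# Only `remaining` would need denoising, starting from q_sample(x0, ...) at start_step.
print(start_step, len(remaining), "of", len(time_range), "steps")
```

With these assumed values only 5 of the 20 steps are left to denoise, which is the saving this issue is asking about.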