Vid-ODE

Why has the ConvGRU only one layer?

williambittner1 opened this issue 8 months ago · 0 comments

From:

conv_odegru.py (line 113):

```python
first_point_mu, first_point_std = self.encoder_z0(
    input_tensor=e_truth, time_steps=truth_time_steps, mask=mask, tracker=self.tracker
)
```

base_conv_gru.py (line 127):

```python
last_yi, latent_ys = self.run_ode_conv_gru(
    input_tensor=input_tensor, mask=mask, time_steps=time_steps,
    run_backwards=self.run_backwards, tracker=tracker
)
```

and line 161

```python
inc = self.z0_diffeq_solver.ode_func(prev_t, prev_input_tensor) * (t_i - prev_t)
```
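For context, that line is a single explicit (forward) Euler step, h(t_i) ≈ h(prev_t) + f(prev_t, h) · (t_i − prev_t). A minimal standalone sketch (the `ode_func` below is a toy stand-in for `self.z0_diffeq_solver.ode_func`, not the repo's implementation):

```python
def ode_func(t, h):
    # Toy dynamics dh/dt = -h (exponential decay); a stand-in only.
    return -h

def euler_step(prev_t, t_i, h_prev):
    # Mirrors the quoted line: increment = f(prev_t, h) * (t_i - prev_t)
    inc = ode_func(prev_t, h_prev) * (t_i - prev_t)
    return h_prev + inc

print(euler_step(0.0, 0.1, 1.0))  # 0.9
```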

as well as line 180:

```python
# only 1 now
yi = self.cell_list[0](input_tensor=xi, h_cur=yi_ode, mask=mask[:, i])
```

and conv_odegru.py (line 59):

```python
self.encoder_z0 = Encoder_z0_ODE_ConvGRU(
    input_size=input_size,
    input_dim=base_dim,
    hidden_dim=base_dim,
    kernel_size=(3, 3),
    num_layers=1,
    dtype=torch.cuda.FloatTensor if self.device == 'cuda' else torch.FloatTensor,
    batch_first=True,
    bias=True,
    return_all_layers=True,
    z0_diffeq_solver=z0_diffeq_solver,
    run_backwards=self.opt.run_backwards,
).to(self.device)
```

where `num_layers=1`.

it seems that the ODE solver is only applied to a single latent "layer" from the ConvGRU. The ConvGRU is not stacked and only operates on a latent with a spatial resolution of 32×32, since `num_layers` in `Encoder_z0_ODE_ConvGRU` is 1.
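To make the observation concrete: with `num_layers=1`, the constructor builds a cell list containing exactly one cell, so the forward pass can only ever call `self.cell_list[0]`. A hypothetical, simplified sketch of that structure (the classes below are toy stand-ins, not the repo's code):

```python
class ToyConvGRUCell:
    """Stand-in for a ConvGRU cell; just records its layer index."""
    def __init__(self, layer_idx):
        self.layer_idx = layer_idx

    def __call__(self, input_tensor, h_cur):
        # A real cell would apply gated convolutions here.
        return f"layer{self.layer_idx}({input_tensor}, {h_cur})"

class ToyEncoder:
    """Stand-in for Encoder_z0_ODE_ConvGRU's cell-stack construction."""
    def __init__(self, num_layers):
        # With num_layers=1 this list holds a single cell, which is why
        # the forward pass only ever indexes cell_list[0].
        self.cell_list = [ToyConvGRUCell(i) for i in range(num_layers)]

encoder = ToyEncoder(num_layers=1)
print(len(encoder.cell_list))          # 1
print(encoder.cell_list[0]("x", "h"))  # layer0(x, h)
```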

Why are no further layers used? Isn't the whole point of stacking a ConvGRU to utilize latents at multiple resolutions? Or am I missing something?

williambittner1 · Jun 27 '24 18:06