ai-imu-dr unmatched iekfnets.p from dropbox

Hello, the iekfnets.p file downloaded with wget "https://www.dropbox.com/s/77kq4s7ziyvsrmi/temp.zip" does not match the MesNet structure. The fourth layer has a different size, and iekfnets.p apparently contains more layers than the 8-layer MesNet we see in utils_torch_filter.py.

Jul 28 '23 12:07 enguang2

I've run into the same problem. The error is as follows:

RuntimeError: Error(s) in loading state_dict for TORCHIEKF:
    Unexpected key(s) in state_dict: "mes_net.cov_net.8.weight", "mes_net.cov_net.8.bias", "mes_net.cov_net.12.weight", "mes_net.cov_net.12.bias", "mes_net.cov_net.16.weight", "mes_net.cov_net.16.bias".
    size mismatch for mes_net.cov_net.4.weight: copying a param with shape torch.Size([64, 32, 5]) from checkpoint, the shape in current model is torch.Size([32, 32, 5]).
    size mismatch for mes_net.cov_net.4.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).

/home/ccsmm/git_repository/ai-imu-dr/src/utils_torch_filter.py(466)load()
    465         mondict = torch.load(path_iekf)
--> 466         self.load_state_dict(mondict)
    467         cprint("IEKF nets loaded", 'green')

In the debugger, the checkpoint contains these keys:

ipdb> mondict.keys()
odict_keys(['initprocesscov_net.factor_initial_covariance.weight',
            'initprocesscov_net.factor_process_covariance.weight',
            'mes_net.cov_net.0.weight', 'mes_net.cov_net.0.bias',
            'mes_net.cov_net.4.weight', 'mes_net.cov_net.4.bias',
            'mes_net.cov_net.8.weight', 'mes_net.cov_net.8.bias',
            'mes_net.cov_net.12.weight', 'mes_net.cov_net.12.bias',
            'mes_net.cov_net.16.weight', 'mes_net.cov_net.16.bias',
            'mes_net.cov_lin.0.weight', 'mes_net.cov_lin.0.bias'])
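For what it's worth, the key list above is enough to count how many parametrised layers the checkpoint expects, without loading anything. Below is a minimal sketch in plain Python; the helper name `parametrised_indices` is made up for illustration and is not part of ai-imu-dr.

```python
import re
from collections import defaultdict

# Key names copied from the mondict.keys() output in this thread.
CHECKPOINT_KEYS = [
    "initprocesscov_net.factor_initial_covariance.weight",
    "initprocesscov_net.factor_process_covariance.weight",
    "mes_net.cov_net.0.weight", "mes_net.cov_net.0.bias",
    "mes_net.cov_net.4.weight", "mes_net.cov_net.4.bias",
    "mes_net.cov_net.8.weight", "mes_net.cov_net.8.bias",
    "mes_net.cov_net.12.weight", "mes_net.cov_net.12.bias",
    "mes_net.cov_net.16.weight", "mes_net.cov_net.16.bias",
    "mes_net.cov_lin.0.weight", "mes_net.cov_lin.0.bias",
]


def parametrised_indices(keys):
    """Hypothetical helper: map each Sequential prefix (e.g. 'mes_net.cov_net')
    to the sorted module indices that carry a weight or bias."""
    pattern = re.compile(r"^(?P<prefix>.+)\.(?P<index>\d+)\.(?:weight|bias)$")
    found = defaultdict(set)
    for key in keys:
        match = pattern.match(key)
        if match:  # keys without a numeric index are skipped
            found[match.group("prefix")].add(int(match.group("index")))
    return {prefix: sorted(indices) for prefix, indices in found.items()}


if __name__ == "__main__":
    for prefix, indices in parametrised_indices(CHECKPOINT_KEYS).items():
        print(prefix, indices)
```

The checkpoint has parameters at cov_net indices 0, 4, 8, 12 and 16, so it expects five parametrised layers in cov_net, while the current MesNet only has parameters at indices 0 and 4. If the extra layers follow the same four-module pattern as the existing ones (which is an assumption), the checkpoint's cov_net would have about 20 modules instead of 8.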

But the model itself (self) does not have the same structure as the checkpoint:

ipdb> self
TORCHIEKF(
  (initprocesscov_net): InitProcessCovNet(
    (factor_initial_covariance): Linear(in_features=1, out_features=6, bias=False)
    (factor_process_covariance): Linear(in_features=1, out_features=6, bias=False)
    (tanh): Tanh()
  )
  (mes_net): MesNet(
    (tanh): Tanh()
    (cov_net): Sequential(
      (0): Conv1d(6, 32, kernel_size=(5,), stride=(1,))
      (1): ReplicationPad1d((4, 4))
      (2): ReLU()
      (3): Dropout(p=0.5)
      (4): Conv1d(32, 32, kernel_size=(5,), stride=(1,), dilation=(3,))
      (5): ReplicationPad1d((4, 4))
      (6): ReLU()
      (7): Dropout(p=0.5)
    )
    (cov_lin): Sequential(
      (0): Linear(in_features=32, out_features=2, bias=True)
      (1): Tanh()
    )
  )
)
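To see every difference at once instead of reading it out of the RuntimeError, the checkpoint and the model can be compared as name-to-shape mappings. This is only a sketch: `diff_shapes` is a hypothetical helper, and the example data contains only the entries whose shapes are reported in this thread (the shapes of the unexpected keys are not shown, so they are left as None).

```python
def diff_shapes(checkpoint, model):
    """Hypothetical helper: compare two {parameter_name: shape} mappings.

    Returns (unexpected, missing, mismatched), where
    - unexpected: names only in the checkpoint,
    - missing: names only in the model,
    - mismatched: {name: (checkpoint_shape, model_shape)} for differing shapes.
    """
    unexpected = sorted(set(checkpoint) - set(model))
    missing = sorted(set(model) - set(checkpoint))
    mismatched = {
        name: (checkpoint[name], model[name])
        for name in checkpoint
        if name in model and checkpoint[name] != model[name]
    }
    return unexpected, missing, mismatched


# Shapes taken from the error message; None means "not reported in the thread".
CHECKPOINT_SHAPES = {
    "mes_net.cov_net.4.weight": (64, 32, 5),
    "mes_net.cov_net.4.bias": (64,),
    "mes_net.cov_net.8.weight": None,
    "mes_net.cov_net.8.bias": None,
    "mes_net.cov_net.12.weight": None,
    "mes_net.cov_net.12.bias": None,
    "mes_net.cov_net.16.weight": None,
    "mes_net.cov_net.16.bias": None,
}
MODEL_SHAPES = {
    "mes_net.cov_net.4.weight": (32, 32, 5),
    "mes_net.cov_net.4.bias": (32,),
}

if __name__ == "__main__":
    unexpected, missing, mismatched = diff_shapes(CHECKPOINT_SHAPES, MODEL_SHAPES)
    print("unexpected:", unexpected)
    print("missing:", missing)
    print("mismatched:", mismatched)
```

In the debugger the two mappings could be built from the real objects with `{k: tuple(v.shape) for k, v in mondict.items()}` and `{k: tuple(v.shape) for k, v in self.state_dict().items()}`. Note that passing `strict=False` to `load_state_dict` would only skip the unexpected keys; as far as I know the size mismatch on cov_net.4 would still raise.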

Aug 31 '23 11:08 ccsmm78

@ccsmm78 I would guess that the URL pointing to the parameter file hasn't been updated, and that the parameter file has actually been altered since it was first published. Nevertheless, I am trying to train the model locally, but the training stops at an early epoch (epoch 7). Have you encountered the same issue? :)

Sep 04 '23 00:09 enguang2

@enguang2 I have the same problem. I will report back on my progress later.