Inconsistent sizes between training data and validation data

Open • hoangdzung opened this issue 9 months ago • 0 comments

Hi, thank you so much for the great work and for sharing the very well-structured codebase!

I encountered the following error when running scoresbi task=lotka_volterra method=score_transformer (and likewise with method=score_transformer_undirected):

Traceback (most recent call last):
  File ".../simformer/src/scoresbibm/scoresbibm/scripts/hydra_script.py", line 95, in score_sbi
    model = method_run(task,data, cfg.method, rng=rng_train)
  File ".../simformer/src/scoresbibm/scoresbibm/methods/method_base.py", line 95, in run_score_transformer
    model = train_transformer_model(task, data, method_cfg, rng)
  File ".../simformer/src/scoresbibm/scoresbibm/methods/score_transformer.py", line 305, in train_transformer_model
    params, opt_state = run_train_transformer_model(
  File ".../simformer/src/scoresbibm/scoresbibm/methods/score_transformer.py", line 136, in run_train_transformer_model
    l_val_batch = loss_fn(
  File ".../simformer/src/scoresbibm/scoresbibm/methods/score_transformer.py", line 271, in loss_fn
    edge_mask = jax.vmap(edge_mask_fn, in_axes=(None, 0, 0))(node_id, condition_mask, meta_data)
ValueError: vmap got inconsistent sizes for array axes to be mapped:

  • one axis had size 1000: axis 0 of argument condition_mask of type bool[1000,44];
  • one axis had size 304: axis 0 of argument args[0] of type float32[304,1]
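
For anyone trying to reproduce this outside the full pipeline, the following is a minimal sketch that triggers the same kind of vmap error. It uses only the shapes from the message above. The body of edge_mask_fn is a hypothetical stand-in (the real function in score_transformer.py is not shown here), and try_vmap is a helper introduced purely for illustration.

```python
import jax
import jax.numpy as jnp


def edge_mask_fn(node_id, condition_mask, meta_data):
    # Hypothetical stand-in for the real edge_mask_fn: a dense all-True
    # mask over the nodes. The body is irrelevant to the error, because
    # vmap compares the mapped axis sizes before calling the function.
    n = node_id.shape[0]
    return jnp.ones((n, n), dtype=bool)


node_id = jnp.arange(44)                             # not mapped (in_axes=None)
condition_mask = jnp.zeros((1000, 44), dtype=bool)   # mapped axis 0 has size 1000
meta_data = jnp.zeros((304, 1), dtype=jnp.float32)   # mapped axis 0 has size 304


def try_vmap(meta):
    # Returns the output shape on success, or the first line of the
    # error message if vmap rejects the inputs.
    try:
        out = jax.vmap(edge_mask_fn, in_axes=(None, 0, 0))(
            node_id, condition_mask, meta
        )
        return out.shape
    except ValueError as err:
        return str(err).splitlines()[0]


# Mismatched leading axes (1000 vs 304) reproduce the failure.
mismatch_result = try_vmap(meta_data)
# A leading axis of 1000 on both mapped arguments goes through.
matched_result = try_vmap(jnp.zeros((1000, 1), dtype=jnp.float32))

print(mismatch_result)
print(matched_result)
```

With in_axes=(None, 0, 0), both condition_mask and meta_data are mapped over their leading axis, so those two axes must have the same length.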

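This might be caused by subsampling being applied during training but not during evaluation. Specifically, condition_mask is computed from x_dim = 40 (i.e., after subsampling), whereas the evaluation data has a dimensionality of 300. The sizes in the error message are consistent with this: 44 and 304 would be 40 and 300 observation dimensions plus, presumably, 4 parameter dimensions.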

Would you recommend also applying subsampling to the validation data, or should x_dim (and the corresponding masks, such as edge_mask) adapt to the actual data shape during evaluation?
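
To make the two options concrete, here is a rough sketch of what I have in mind. All names (subsample_observations, make_condition_mask, theta_val, x_val) are hypothetical and not taken from the codebase, the arrays are dummy data, and the parameter dimension of 4 is my assumption based on the shapes above. The mask convention (parameters False, observations True) is also just for illustration.

```python
import jax
import jax.numpy as jnp


def subsample_observations(x, num_keep, key):
    # Option 1 helper (hypothetical): keep num_keep observation columns
    # of x, where x has shape [batch, x_dim_full]. Indices are drawn
    # without replacement and sorted to preserve the original ordering.
    idx = jax.random.choice(key, x.shape[1], shape=(num_keep,), replace=False)
    idx = jnp.sort(idx)
    return x[:, idx], idx


def make_condition_mask(batch_size, theta_dim, x_dim):
    # Option 2 helper (hypothetical): build the mask from the shapes of
    # the data actually passed in, instead of a fixed configured x_dim.
    mask = jnp.concatenate(
        [jnp.zeros(theta_dim, dtype=bool), jnp.ones(x_dim, dtype=bool)]
    )
    return jnp.broadcast_to(mask, (batch_size, theta_dim + x_dim))


key = jax.random.PRNGKey(0)
theta_val = jnp.zeros((1000, 4))   # dummy validation parameters (4 is assumed)
x_val = jnp.zeros((1000, 300))     # dummy validation observations, not subsampled

# Option 1: subsample the validation observations down to x_dim = 40.
x_val_sub, kept_idx = subsample_observations(x_val, 40, key)
mask_sub = make_condition_mask(
    x_val_sub.shape[0], theta_val.shape[1], x_val_sub.shape[1]
)

# Option 2: keep all 300 observations and size the mask from the data.
mask_full = make_condition_mask(
    x_val.shape[0], theta_val.shape[1], x_val.shape[1]
)

print(x_val_sub.shape, mask_sub.shape, mask_full.shape)
```

In either case, the point is that the number of nodes implied by the masks should match the number of columns in the data passed to loss_fn. For option 1, the subsampling would presumably have to mirror whatever scheme is used for the training data.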

Thanks again for your time and the excellent work!

hoangdzung, Apr 07 '25 21:04