CombineLayer mismatches feature dims
I have a case where the inputs to the CombineLayer are of shape (B, F, T) and (1, F). In this case, finding the matching dims does not currently work: the 1 axis is mapped to the F axis of input 1, and then the F axis of input 2 cannot be mapped anymore.
I added a test case in draft PR #720.
You probably wouldn't want to have this 1/broadcast axis in the first place. Which layer creates this for you? You shouldn't need these kinds of dims anywhere; RETURNN will just broadcast automatically.
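To illustrate what that means (a minimal sketch, not from the thread; the extern_data shapes and key names here are assumptions): with automatic broadcasting, the second input can simply be declared without the extra 1 axis, e.g.:

  # Hypothetical sketch: add a per-feature bias to a (B, T, F) input without any
  # explicit 1/broadcast axis in the second input; RETURNN broadcasts over B and T.
  net_dict = {
    "output": {"class": "combine", "kind": "add", "from": ["data:in1", "data:in2"]},
  }
  extern_data = {
    "in1": {"shape": (None, n_features)},  # (B, T, F)
    "in2": {"shape": (n_features,), "batch_dim_axis": None, "time_dim_axis": None},  # just (F,)
  }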
That's true. The broadcast axis is created in pytorch-to-returnn.
@albertz if we fix the logic in _unify_tensor_axes_returnn_meta(), this case should not occur anymore, and without the broadcast dim, the CombineLayer works. So I guess we could move the discussion to pytorch-to-returnn.
Can you link the corresponding issue on pytorch-to-returnn?
But despite the recommendation to not manually/explicitly add broadcast axes, I think it should still work.
Or in general: we have the basic principle in RETURNN that the order of axes should never matter. However, in this example, if the input is (B, T, F) + (1, F), it works correctly. So there is a difference depending on the order of axes, which should not be the case.
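To make the asymmetry concrete (a sketch based on the shapes described above and the test case below; the dict names are just for illustration):

  # Same CombineLayer "add"; only the axis order of in1 differs.
  extern_data_working = {  # (B, T, F) + (1, F): matching dims are found
    "in1": {"shape": (None, n_features)},
    "in2": {"shape": (1, n_features), "batch_dim_axis": None, "time_dim_axis": None, "feature_dim_axis": 1},
  }
  extern_data_failing = {  # (B, F, T) + (1, F): the 1 axis is mapped to F, then F cannot be mapped
    "in1": {"shape": (n_features, None), "batch_dim_axis": 0, "time_dim_axis": 2},
    "in2": {"shape": (1, n_features), "batch_dim_axis": None, "time_dim_axis": None, "feature_dim_axis": 1},
  }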
I just created an issue and a PR with a test case in pytorch-to-returnn, see https://github.com/rwth-i6/pytorch-to-returnn/issues/59
My issue was resolved with https://github.com/rwth-i6/pytorch-to-returnn/pull/58. Should we close the issue or do you want to keep it open since it should still work with RETURNN?
No, this problem is not solved in RETURNN. You just avoid the RETURNN problem (which still exists) in pytorch-to-returnn now, so your linked issue is not relevant here anymore.
As mentioned above, I'm not facing this issue in pytorch-to-returnn anymore, so I'm not looking into this further and have closed the draft PR with the test case, #720. The test case to reproduce the issue from there looked like this:
def test_CombineLayer_feature_broadcast_swapped():
  with make_scope() as session:
    n_batch, n_time, n_features = 3, 7, 5
    net_dict = {
      "output": {"class": "combine", "kind": "add", "from": ["data:in1", "data:in2"]},
    }
    config = Config({
      "debug_print_layer_output_template": True,
      "extern_data": {
        # in1 is (B, F, T): feature axis before time axis.
        "in1": {"shape": (n_features, None), "batch_dim_axis": 0, "time_dim_axis": 2},
        # in2 is (1, F): explicit broadcast axis, no batch and no time dim.
        "in2": {"shape": (1, n_features), "batch_dim_axis": None, "time_dim_axis": None, "feature_dim_axis": 1},
      }
    })
    network = TFNetwork(config=config, train_flag=True)
    network.construct_from_dict(net_dict)
    out = network.get_default_output_layer()
    assert_equal(out.output.batch_shape, (None, n_features, None))
    feed_dict = make_feed_dict(network.extern_data, n_batch=n_batch, n_time=n_time)
    session.run(tf_compat.v1.global_variables_initializer())
    out_v = session.run(out.output.placeholder, feed_dict=feed_dict)
    assert out_v.shape == (n_batch, n_features, n_time)
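For completeness (not part of the original comment): the test is written against RETURNN's test suite, so it relies on helpers defined there. The surrounding context would look roughly like the sketch below; the exact import paths and helper locations are assumptions.

  # Assumed context, modeled on RETURNN's tests/test_TFNetworkLayer.py:
  from returnn.config import Config
  from returnn.tf.network import TFNetwork
  from returnn.tf import compat as tf_compat
  # make_scope() and make_feed_dict() are helpers defined in the test file itself:
  # make_scope() opens a fresh TF graph and session, and make_feed_dict() builds
  # random placeholder data for extern_data with the given n_batch/n_time.
  # assert_equal is a plain equality assertion (e.g. numpy.testing.assert_equal).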