Enforce dim tags to be unique in tensor
The order of axes should never matter.
But when a single dim tag can occur multiple times in a tensor (`Data`), it does matter, e.g. for operations like `SoftmaxOverSpatialLayer` on some input `[B,T,T]`.
You can get such a tensor e.g. via `DotLayer`.
We should disallow this, so that the order of axes will not matter.
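To make the ambiguity concrete, here is a minimal pure-Python sketch. It does not use RETURNN: the `resolve_axis` helper and the plain-string tags are hypothetical stand-ins for dim tags. It only illustrates that looking up an axis by its tag is well-defined when the tag occurs once, and undefined otherwise.

```python
from typing import List


def resolve_axis(dim_tags: List[str], tag: str) -> int:
    """Return the index of the single axis carrying `tag`.

    Hypothetical helper for illustration, not a RETURNN function.
    """
    matches = [i for i, t in enumerate(dim_tags) if t == tag]
    if not matches:
        raise KeyError(f"tag {tag!r} not in {dim_tags}")
    if len(matches) > 1:
        raise ValueError(f"tag {tag!r} is ambiguous in {dim_tags}: axes {matches}")
    return matches[0]


# Unique tags: "T" always resolves to exactly one axis, whatever the layout.
assert resolve_axis(["B", "T", "F"], "T") == 1
assert resolve_axis(["T", "B", "F"], "T") == 0

# Duplicate tags, as in a [B,T,T] tensor: "softmax over T" has two candidates.
try:
    resolve_axis(["B", "T", "T"], "T")
except ValueError as exc:
    print(exc)
```

With duplicates, any rule that picks one of the two `T` axes (first, last, ...) depends on the axis order, which is exactly what should not matter.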
The user would need to explicitly replace one of the duplicate dim tags with a new dim tag beforehand (e.g. via #633 or #589).
This would be helpful for #391.
This obviously introduces new behavior. So it could be introduced via a new behavior version (#508).
However, the question is whether this behavior should be enforced on everyone. A new behavior version usually implies an easy transition from old code to new code. But here the transition might be non-trivial in some cases.
Instead, we could introduce this as a behavior flag, i.e. some config option like `behavior_unique_dim_tags = True` or so.
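As a rough sketch of how such an opt-in flag could gate the check: the code below is plain Python, not RETURNN code. The `check_dim_tags` function and the dict-based config are assumptions made for illustration; only the option name `behavior_unique_dim_tags` comes from the proposal above.

```python
from typing import Any, Dict, List


def check_dim_tags(dim_tags: List[str], config: Dict[str, Any]) -> None:
    """Raise if a tag repeats and the opt-in flag is enabled.

    Hypothetical helper; the dict stands in for the user config.
    """
    if not config.get("behavior_unique_dim_tags", False):
        return  # flag unset: old behavior, duplicates are tolerated
    seen = set()
    for tag in dim_tags:
        if tag in seen:
            raise ValueError(f"dim tag {tag!r} occurs multiple times in {dim_tags}")
        seen.add(tag)


# Without the flag, existing configs keep working.
check_dim_tags(["B", "T", "T"], {})

# With the flag, the same tensor is rejected.
try:
    check_dim_tags(["B", "T", "T"], {"behavior_unique_dim_tags": True})
except ValueError as exc:
    print(exc)
```

Defaulting the flag to off is what distinguishes this from a new behavior version: nobody is forced through the transition.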
We have now stumbled upon a case where there is no obvious way to avoid ambiguous tags in a tensor, namely the weight matrix of a linear transformation with `in_dim == out_dim`. In this case, it would be a `(dim, dim)` weight matrix.
This is also shown here: https://github.com/rwth-i6/returnn_common/issues/17#issuecomment-997312167
One potential solution was to introduce some `match_priority`: #871
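One possible reading of the `match_priority` idea, sketched in plain Python: each axis carries a priority next to its tag, and a lookup by tag picks the matching axis with the highest priority. The `Axis` class and `resolve_axis_with_priority` are hypothetical names for illustration; how #871 actually defines the semantics is not covered here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Axis:
    tag: str
    match_priority: int = 0  # hypothetical tie-breaker among equal tags


def resolve_axis_with_priority(axes: List[Axis], tag: str) -> int:
    """Return the index of the matching axis with the highest priority."""
    candidates = [(a.match_priority, i) for i, a in enumerate(axes) if a.tag == tag]
    if not candidates:
        raise KeyError(f"tag {tag!r} not found")
    best = max(priority for priority, _ in candidates)
    winners = [i for priority, i in candidates if priority == best]
    if len(winners) > 1:
        raise ValueError(f"tag {tag!r} is still ambiguous: axes {winners}")
    return winners[0]


# (dim, dim) weight matrix: mark the input axis with a higher priority.
weight_axes = [Axis("dim", match_priority=1), Axis("dim")]
assert resolve_axis_with_priority(weight_axes, "dim") == 0

# Storing the matrix transposed resolves to the same logical (input) axis.
transposed = list(reversed(weight_axes))
assert resolve_axis_with_priority(transposed, "dim") == 1
```

The lookup no longer depends on the storage order of the axes, only on which axis was marked, so the order-independence goal is kept even though the tag itself repeats.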