RangeInAxisLayer: beam search bug with SliceNdLayer
The network is:
network = {
    "output": {"class": "rec", "from": "data", "unit": {
        "start": {"class": "copy", "from": "prev:output"},
        "slices": {"class": "slice_nd", "from": "base:data", "start": "start", "size": None},  # [B,T[B],slice[B,T],D]
        "slices_red": {"class": "reduce", "from": "slices", "axis": "dyn:-1", "mode": "max"},  # [B,T[B],D]
        "slice_range": {"class": "range_in_axis", "from": "slices", "axis": "dyn:-1", "is_output_layer": True},  # [T[B]]
        "output_prob": {"class": "linear", "from": "slices_red", "activation": "softmax", "n_out": dim},
        "output": {
            "class": "choice", "from": "output_prob", "beam_size": 3, "input_type": "prob", "target": "classes",
            "initial_output": 0}
    }}
}
The error is:
layer <network via test_SliceNdLayer_RangeInAxisLayer>/'data' output: Data{'data', [B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}
layer <network via test_SliceNdLayer_RangeInAxisLayer>/'output' output: Data{'output_output', [T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5}
Rec layer 'output' (search True, train False) sub net:
Input layers moved out of loop: (#: 0)
None
Output layers moved out of loop: (#: 1)
slice_range
Layers in loop: (#: 5)
slices
start
output
output_prob
slices_red
Unused layers: (#: 0)
None
layer <network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)/'start' output: Data{'start_output', [B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])}
layer <network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)/'slices' output: Data{'slices_gather_output', [B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[?]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}
layer <network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)/':dyn-tag-accum:1:slices' output: Data{'sliced-time:slices:dyn_size', [B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}
ERROR: Got exception during in-loop construction of layer ':dyn-tag-accum:1:slices':
AssertionError: ("Layer <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}> has buggy search choices resolution.", 'see search choices debug output')
output: <_TemplateLayer(ChoiceLayer)(:template:choice) output/'output' out_type=Data{[B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
output_prob: <_TemplateLayer(LinearLayer)(:template:linear) output/'output_prob' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'output_prob:feature-dense'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output')>
slices_red: <_TemplateLayer(ReduceLayer)(:template:reduce) output/'slices_red' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'feature:slices_red_output'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output_prob')>
slices: <_TemplateLayer(SliceNdLayer)(:template:slice_nd) output/'slices' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[?]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'slices_red')>
start: <_TemplateLayer(CopyLayer)(:template:copy) output/'start' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack 'slices')>
slice_range: <_TemplateLayer(RangeInAxisLayer)(:template:range_in_axis) output/'slice_range' out_type=Data{[T|'sliced-time:slices'[?]{ctx=loop('time:var:extern_data:data'[B])}], dtype='int32', ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
:dyn-tag-accum:1:slices: <_TemplateLayer(LengthLayer)(:template:length) output/':dyn-tag-accum:1:slices' out_type=Data{[B], dtype='int32', ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
Collected (unique) exceptions during template construction:
(Note that many of these can be ignored, or are expected.)
<returnn.tf.layers.rec._SubnetworkRecCell._construct_template.<locals>.CollectedException object at 0x7efd5849f700>
<returnn.tf.layers.rec._SubnetworkRecCell._construct_template.<locals>.CollectedException object at 0x7efd58820a60>
debug search choices:
base: <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>
network:
layer: <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>
layer: <RecStepInfoLayer output/':i' out_type=Data{[], dtype='int32'}>
layer: <_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
layer: <SliceNdLayer output/'slices' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[B&Beam{'output/prev:output'}(3)]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}>
layer: <CopyLayer output/'start' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])}>
visit: <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>, search choices None
sources: 'output/slices' search choices None
visit: <SliceNdLayer output/'slices' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[B&Beam{'output/prev:output'}(3)]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}>, search choices None
sources: 'data' search choices None, 'output/start' search choices None
visit: <SelectSearchSourcesLayer 'data' <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)> out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}>, search choices None
sources: 'data' search choices None
visit: <SourceLayer 'data' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}>, search choices None
sources: None
visit: <CopyLayer output/'start' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])}>, search choices None
sources: 'output/prev:output' search choices <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)>
Relevant layers:
[<_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>]
Full dependency map:
{'data': [],
'output/:dyn-tag-accum:1:slices': ['output/prev:output'],
'output/slices': ['output/prev:output'],
'output/start': ['output/prev:output']}
-> search choices: <_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
Template network (check out types / shapes):
Exception creating layer <network via test_SliceNdLayer_RangeInAxisLayer>/'output' of class RecLayer with opts:
{'_name': 'output',
'_network': <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>' train=False search>,
'_time_dim_tag': DimensionTag{'time:var:extern_data:data'[B]},
'n_out': <class 'returnn.util.basic.NotSpecified'>,
'name': 'output',
'network': <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>' train=False search>,
'output': Data{'output_output', [T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5},
'sources': [<SourceLayer 'data' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}>],
'unit': <_SubnetworkRecCell '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)'>}
EXCEPTION
...
File "/mnt/projects/i6/returnn/returnn/tf/layers/rec.py", line 1691, in _SubnetworkRecCell._construct
line: layer = get_layer(layer_name)
locals:
layer = <not found>
get_layer = <local> <function _SubnetworkRecCell._construct.<locals>.get_layer at 0x7efd58591dc0>
layer_name = <local> ':dyn-tag-accum:1:slices', len = 23
File "/mnt/projects/i6/returnn/returnn/tf/layers/rec.py", line 1677, in _SubnetworkRecCell._construct.<locals>.get_layer
line: assert (layer.output.beam == layer_template.output.beam and
layer_choices.beam_size == layer.output.beam.beam_size == layer_template.output.beam.beam_size), (
"Layer %r has buggy search choices resolution." % layer,
self.net.debug_search_choices(layer) or "see search choices debug output")
locals:
layer = <local> <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>
layer.output = <local> Data{'sliced-time:slices:dyn_size', [B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}
layer.output.beam = <local> SearchBeam(name='output/prev:output', beam_size=3)
layer_template = <local> <_TemplateLayer(LengthLayer)(:template:length) output/':dyn-tag-accum:1:slices' out_type=Data{[B], dtype='int32', ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
layer_template.output = <local> Data{':dyn-tag-accum:1:slices_length', [B], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}
layer_template.output.beam = <local> None
layer_choices = <local> <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)>
layer_choices.beam_size = <local> 3
layer.output.beam.beam_size = <local> 3
layer_template.output.beam.beam_size = <local> !AttributeError: 'NoneType' object has no attribute 'beam_size'
self = <local> <_SubnetworkRecCell '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)'>
self.net = <local> <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)' parent_layer=<RecLayer 'output' out_type=Data{[T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5}> train=False search>
self.net.debug_search_choices = <local> <bound method TFNetwork.debug_search_choices of <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)' parent_layer=<RecLayer 'output' out_type=Data{[T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5}...
AssertionError: ("Layer <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}> has buggy search choices resolution.", 'see search choices debug output')
The full traceback is here: https://gist.github.com/robin-p-schmitt/fac5efca3d838f2fb265fabb6f3a78a9.
The error also occurs if I choose the static feature axis as the range axis. It does not occur if I set a static slice size. The above network works fine during training.
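For reference, a sketch of the static-size variant that avoids the error, assuming only the slice_nd layer changes; the concrete size 2 is my own arbitrary choice, and the issue does not spell out how the downstream axis specs were adapted:

# Sketch only: static slice size instead of size=None.
# The sliced axis then has a fixed length, so no dynamic 'sliced-time' dim and no
# ':dyn-tag-accum:...' LengthLayer has to be created for it.
"slices": {"class": "slice_nd", "from": "base:data", "start": "start", "size": 2},  # [B,2,D]
# "slices_red" / "slice_range" would then have to address this static dim instead of
# "dyn:-1"; the exact axis spec used for that test is not given in the issue.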
So I think the error was caused by marking slice_range as an output layer, which I don't actually need for my config. I already had this issue some time ago, and back then it was just not implemented. See https://github.com/rwth-i6/returnn/pull/635#discussion_r709287770:
Btw, this problem only occurs because you have optimize_move_layers_out=False and also "is_output_layer": True on the segments layer. Why do you have the latter actually?
For your test, you could just put another reduce layer afterwards, which reduces over the new dim. The problem is really that this is specifically an output layer. In principle, this should also work, but this is not implemented yet, but also not needed for your use case.
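A minimal sketch of that suggestion, applied to the unit above; the layer name slice_range_red and the reduce mode are my own choice, not taken from the linked discussion:

# Do not mark the range layer itself as an output layer; instead reduce over its
# dynamic dim and mark the reduced layer, so the layer that leaves the loop no
# longer carries the per-step 'sliced-time' dim.
"slice_range": {"class": "range_in_axis", "from": "slices", "axis": "dyn:-1"},
"slice_range_red": {"class": "reduce", "from": "slice_range", "axis": "dyn:-1", "mode": "max",
                    "is_output_layer": True},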
The segments layer was the same as my slices layer now. This then seems to be the same error, as the loop isn't optimized because we are in search mode and the slice_range layer is marked as an output layer.
For me, the issue is solved then, because I don't need slice_range to be an output layer.
But then the bug (issue) itself is not gone; it's just that you do not actually need it fixed for your case. You should not close an issue when it is not solved.