RangeInAxisLayer: beam search bug with SliceNdLayer

Open robin-p-schmitt opened this issue 3 years ago • 2 comments

The network is:

network = {
  "output": {"class": "rec", "from": "data", "unit": {
    "start": {"class": "copy", "from": "prev:output"},
    "slices": {"class": "slice_nd", "from": "base:data", "start": "start", "size": None},  # [B,T[B],slice[B,T],D]
    "slices_red": {"class": "reduce", "from": "slices", "axis": "dyn:-1", "mode": "max"},  # [B,T[B],D]
    "slice_range": {"class": "range_in_axis", "from": "slices", "axis": "dyn:-1", "is_output_layer": True},  # [T[B]]
    "output_prob": {"class": "linear", "from": "slices_red", "activation": "softmax", "n_out": dim},
    "output": {
      "class": "choice", "from": "output_prob", "beam_size": 3, "input_type": "prob", "target": "classes",
      "initial_output": 0},
  }},
}
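
To make the intended shapes concrete, here is a minimal plain-Python sketch of what the three slice-related layers are expected to compute in one decoder step. It is not RETURNN code: the helper names `slice_nd_dynamic` and `reduce_max` are hypothetical, the toy batch is made up, and it assumes that `"size": None` means "slice from `start` to the end of each sequence", which gives a per-sequence (dynamic) slice length.

```python
def slice_nd_dynamic(data, seq_lens, start):
    """data: nested lists [B][T][D]. Returns (zero-padded slices [B][S][D], slice_lens [B])."""
    # Assumption: with size=None the slice runs to the end of each sequence.
    slice_lens = [max(n - s, 0) for n, s in zip(seq_lens, start)]
    max_len = max(slice_lens)
    dim = len(data[0][0])
    slices = []
    for seq, s, n in zip(data, start, slice_lens):
        frames = [list(f) for f in seq[s:s + n]]
        frames += [[0.0] * dim for _ in range(max_len - n)]  # pad to the longest slice
        slices.append(frames)
    return slices, slice_lens


def reduce_max(slices, slice_lens):
    """Max over the slice axis, ignoring padded frames. Assumes every slice length >= 1."""
    return [[max(col) for col in zip(*frames[:n])] for frames, n in zip(slices, slice_lens)]


# Toy batch: B=2, T=4, D=3; the second sequence has only 3 valid frames.
data = [[[float(b * 12 + t * 3 + d) for d in range(3)] for t in range(4)] for b in range(2)]
seq_lens = [4, 3]
start = [1, 2]  # plays the role of "prev:output" in this step

slices, slice_lens = slice_nd_dynamic(data, seq_lens, start)  # like "slices"
slices_red = reduce_max(slices, slice_lens)                   # like "slices_red"
slice_range = list(range(len(slices[0])))                     # like "slice_range"

print(slice_lens)   # [3, 1]
print(slices_red)   # [[9.0, 10.0, 11.0], [18.0, 19.0, 20.0]]
print(slice_range)  # [0, 1, 2]
```

The point of the sketch is that the slice axis has its own per-sequence length (`slice_lens`), which depends on `start` and therefore on the previous choice. That length is what the `:dyn-tag-accum:1:slices` layer in the log below carries.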

The error is:

layer <network via test_SliceNdLayer_RangeInAxisLayer>/'data' output: Data{'data', [B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}
layer <network via test_SliceNdLayer_RangeInAxisLayer>/'output' output: Data{'output_output', [T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5}
Rec layer 'output' (search True, train False) sub net:
  Input layers moved out of loop: (#: 0)
    None
  Output layers moved out of loop: (#: 1)
    slice_range
  Layers in loop: (#: 5)
    slices
    start
    output
    output_prob
    slices_red
  Unused layers: (#: 0)
    None
layer <network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)/'start' output: Data{'start_output', [B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])}
layer <network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)/'slices' output: Data{'slices_gather_output', [B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[?]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}
layer <network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)/':dyn-tag-accum:1:slices' output: Data{'sliced-time:slices:dyn_size', [B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}

ERROR: Got exception during in-loop construction of layer ':dyn-tag-accum:1:slices':
AssertionError: ("Layer <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}> has buggy search choices resolution.", 'see search choices debug output')

output: <_TemplateLayer(ChoiceLayer)(:template:choice) output/'output' out_type=Data{[B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
output_prob: <_TemplateLayer(LinearLayer)(:template:linear) output/'output_prob' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'output_prob:feature-dense'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output')>
slices_red: <_TemplateLayer(ReduceLayer)(:template:reduce) output/'slices_red' out_type=Data{[B&Beam{'output/prev:output'}(3),F|F'feature:slices_red_output'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'output_prob')>
slices: <_TemplateLayer(SliceNdLayer)(:template:slice_nd) output/'slices' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[?]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])} (construction stack 'slices_red')>
start: <_TemplateLayer(CopyLayer)(:template:copy) output/'start' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack 'slices')>
slice_range: <_TemplateLayer(RangeInAxisLayer)(:template:range_in_axis) output/'slice_range' out_type=Data{[T|'sliced-time:slices'[?]{ctx=loop('time:var:extern_data:data'[B])}], dtype='int32', ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
:dyn-tag-accum:1:slices: <_TemplateLayer(LengthLayer)(:template:length) output/':dyn-tag-accum:1:slices' out_type=Data{[B], dtype='int32', ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>

Collected (unique) exceptions during template construction:
(Note that many of these can be ignored, or are expected.)

<returnn.tf.layers.rec._SubnetworkRecCell._construct_template.<locals>.CollectedException object at 0x7efd5849f700>
<returnn.tf.layers.rec._SubnetworkRecCell._construct_template.<locals>.CollectedException object at 0x7efd58820a60>
debug search choices:
  base: <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>
  network:
    layer: <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>
    layer: <RecStepInfoLayer output/':i' out_type=Data{[], dtype='int32'}>
    layer: <_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
    layer: <SliceNdLayer output/'slices' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[B&Beam{'output/prev:output'}(3)]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}>
    layer: <CopyLayer output/'start' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])}>
  visit: <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>, search choices None
    sources: 'output/slices' search choices None
  visit: <SliceNdLayer output/'slices' out_type=Data{[B&Beam{'output/prev:output'}(3),T|'sliced-time:slices'[B&Beam{'output/prev:output'}(3)]{ctx=loop('time:var:extern_data:data'[B])},F|F'feature:data'(5)], ctx=loop('time:var:extern_data:data'[B])}>, search choices None
    sources: 'data' search choices None, 'output/start' search choices None
  visit: <SelectSearchSourcesLayer 'data' <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)> out_type=Data{[B&Beam{'output/prev:output'}(3),T|'time:var:extern_data:data'[B&Beam{'output/prev:output'}(3)],F|F'feature:data'(5)]}>, search choices None
    sources: 'data' search choices None
  visit: <SourceLayer 'data' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}>, search choices None
    sources: None
  visit: <CopyLayer output/'start' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])}>, search choices None
    sources: 'output/prev:output' search choices <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)>
Relevant layers:
[<_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>]
Full dependency map:
{'data': [],
 'output/:dyn-tag-accum:1:slices': ['output/prev:output'],
 'output/slices': ['output/prev:output'],
 'output/start': ['output/prev:output']}
-> search choices: <_TemplateLayer(ChoiceLayer)(:prev:choice) output/'prev:output' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', sparse=True, dim=5, ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
Template network (check out types / shapes):
Exception creating layer <network via test_SliceNdLayer_RangeInAxisLayer>/'output' of class RecLayer with opts:
{'_name': 'output',
 '_network': <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>' train=False search>,
 '_time_dim_tag': DimensionTag{'time:var:extern_data:data'[B]},
 'n_out': <class 'returnn.util.basic.NotSpecified'>,
 'name': 'output',
 'network': <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>' train=False search>,
 'output': Data{'output_output', [T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5},
 'sources': [<SourceLayer 'data' out_type=Data{[B,T|'time:var:extern_data:data'[B],F|F'feature:data'(5)]}>],
 'unit': <_SubnetworkRecCell '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)'>}
EXCEPTION
...
  File "/mnt/projects/i6/returnn/returnn/tf/layers/rec.py", line 1691, in _SubnetworkRecCell._construct
    line: layer = get_layer(layer_name)
    locals:
      layer = <not found>
      get_layer = <local> <function _SubnetworkRecCell._construct.<locals>.get_layer at 0x7efd58591dc0>
      layer_name = <local> ':dyn-tag-accum:1:slices', len = 23
  File "/mnt/projects/i6/returnn/returnn/tf/layers/rec.py", line 1677, in _SubnetworkRecCell._construct.<locals>.get_layer
    line: assert (layer.output.beam == layer_template.output.beam and
                  layer_choices.beam_size == layer.output.beam.beam_size == layer_template.output.beam.beam_size), (
            "Layer %r has buggy search choices resolution." % layer,
            self.net.debug_search_choices(layer) or "see search choices debug output")
    locals:
      layer = <local> <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}>
      layer.output = <local> Data{'sliced-time:slices:dyn_size', [B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}
      layer.output.beam = <local> SearchBeam(name='output/prev:output', beam_size=3)
      layer_template = <local> <_TemplateLayer(LengthLayer)(:template:length) output/':dyn-tag-accum:1:slices' out_type=Data{[B], dtype='int32', ctx=loop('time:var:extern_data:data'[B])} (construction stack None)>
      layer_template.output = <local> Data{':dyn-tag-accum:1:slices_length', [B], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}
      layer_template.output.beam = <local> None
      layer_choices = <local> <SearchChoices owner='prev:output' beam_size=3 beam_scores=shaped:(None,None)>
      layer_choices.beam_size = <local> 3
      layer.output.beam.beam_size = <local> 3
      layer_template.output.beam.beam_size = <local> !AttributeError: 'NoneType' object has no attribute 'beam_size'
      self = <local> <_SubnetworkRecCell '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)'>
      self.net = <local> <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)' parent_layer=<RecLayer 'output' out_type=Data{[T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5}> train=False search>
      self.net.debug_search_choices = <local> <bound method TFNetwork.debug_search_choices of <TFNetwork '<network via test_SliceNdLayer_RangeInAxisLayer>/output(rec-subnet)' parent_layer=<RecLayer 'output' out_type=Data{[T|'time:var:extern_data:data'[B&Beam{'output/output'}(3)],B&Beam{'output/output'}(3)], dtype='int32', sparse=True, dim=5}...
AssertionError: ("Layer <LengthLayer output/':dyn-tag-accum:1:slices' out_type=Data{[B&Beam{'output/prev:output'}(3)], dtype='int32', ctx=loop('time:var:extern_data:data'[B])}> has buggy search choices resolution.", 'see search choices debug output')

The full traceback is here: https://gist.github.com/robin-p-schmitt/fac5efca3d838f2fb265fabb6f3a78a9.
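
The locals in the traceback above show why the assertion fails: the constructed layer carries a beam (`SearchBeam(name='output/prev:output', beam_size=3)`), while its template has `beam = None`. The following is a small stand-alone sketch of that comparison. `Beam` and `beams_consistent` are hypothetical stand-ins that mirror only the expression shown in the traceback, not the actual RETURNN classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Beam:
    """Hypothetical stand-in for SearchBeam, with only the fields visible in the log."""
    name: str
    beam_size: int


def beams_consistent(layer_beam: Optional[Beam], template_beam: Optional[Beam],
                     choices_beam_size: int) -> bool:
    # Mirrors the asserted expression. `and` short-circuits, so when the beams
    # differ, `template_beam.beam_size` is never evaluated on a None template.
    # (Like the original expression, this assumes layer_beam is not None.)
    return (layer_beam == template_beam
            and choices_beam_size == layer_beam.beam_size == template_beam.beam_size)


real_beam = Beam(name="output/prev:output", beam_size=3)

print(beams_consistent(real_beam, None, 3))       # False: the situation from the log
print(beams_consistent(real_beam, real_beam, 3))  # True: template and layer agree
```

So the check itself behaves as intended. The mismatch is that the template for the dyn-size layer was created without a beam, while the real layer inherits one from `prev:output`.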

The error also occurs if I choose the static feature axis as the range axis. It does not occur if I set a static slice size. The above network works fine during training; the error only shows up in search.

robin-p-schmitt, Nov 23 '21 07:11

I think the error was caused by marking slice_range as an output layer, which I don't actually need for my config. I ran into this issue some time ago, and back then this case was simply not implemented. See https://github.com/rwth-i6/returnn/pull/635#discussion_r709287770:

> Btw, this problem only occurs because you have optimize_move_layers_out=False and also "is_output_layer": True on the segments layer. Why do you have the latter actually?
>
> For your test, you could just put another reduce layer afterwards, which reduces over the new dim. The problem is really that this is specifically an output layer. In principle, this should also work, but this is not implemented yet, but also not needed for your use case.

The segments layer in that discussion corresponds to my slices layer here. So this seems to be the same error: the loop isn't optimized because we are in search mode, and the slice_range layer is marked as an output layer. For me, the issue is solved then, because I don't need slice_range to be an output layer.
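
As a sketch of that workaround, the search config can be derived from the original network dict by removing the flag. The helper `drop_output_flag` is hypothetical, `dim = 5` is taken from the feature dimension shown in the log, and this only illustrates the config change described above. It does not fix the underlying bug, and it has not been run against RETURNN here.

```python
import copy


def drop_output_flag(net_dict, rec_layer, sub_layer):
    """Return a copy of net_dict in which sub_layer (inside rec_layer) is not an output layer."""
    new_net = copy.deepcopy(net_dict)  # leave the original config untouched
    new_net[rec_layer]["unit"][sub_layer].pop("is_output_layer", None)
    return new_net


dim = 5  # assumed: the log shows a feature dimension of 5

network = {
    "output": {"class": "rec", "from": "data", "unit": {
        "start": {"class": "copy", "from": "prev:output"},
        "slices": {"class": "slice_nd", "from": "base:data", "start": "start", "size": None},
        "slices_red": {"class": "reduce", "from": "slices", "axis": "dyn:-1", "mode": "max"},
        "slice_range": {"class": "range_in_axis", "from": "slices", "axis": "dyn:-1",
                        "is_output_layer": True},
        "output_prob": {"class": "linear", "from": "slices_red", "activation": "softmax", "n_out": dim},
        "output": {"class": "choice", "from": "output_prob", "beam_size": 3, "input_type": "prob",
                   "target": "classes", "initial_output": 0},
    }},
}

search_network = drop_output_flag(network, "output", "slice_range")
print(search_network["output"]["unit"]["slice_range"])
# {'class': 'range_in_axis', 'from': 'slices', 'axis': 'dyn:-1'}
```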

robin-p-schmitt, Nov 24 '21 16:11

But then the bug itself is not gone. You just do not need it to be fixed for your case.

You should not close an issue when it is not solved.

albertz, Nov 24 '21 18:11