
Add test for optimize_out_slice_nd

Open mikel-zhobro opened this issue 4 years ago • 5 comments

This is an example case which shows that the slice_nd layer doesn't get properly optimized out of the loop.

This pull request is meant to fix that.

For clarification, we want to generalize SliceNdLayer so that it not only works with a start of shape (batch,) but also covers the cases with several time axes.
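To make the intended behavior concrete, here is a minimal sketch in plain NumPy (not RETURNN code; the helper name slice_nd_reference and the exact shapes are only illustrative assumptions) of what slicing with a start tensor that carries an extra time axis should compute:

import numpy as np

def slice_nd_reference(x, start, size):
  """
  :param np.ndarray x: shape (B, T, D)
  :param np.ndarray start: shape (B, T'), one start index per (batch, frame)
  :param int size: slice length
  :rtype: np.ndarray, shape (B, T', size, D)
  """
  B, Tp = start.shape
  out = np.zeros((B, Tp, size) + x.shape[2:], dtype=x.dtype)
  for b in range(B):
    for t in range(Tp):
      s = start[b, t]
      out[b, t] = x[b, s:s + size]  # assumes s + size <= T (no padding handled here)
  return out

x = np.arange(24, dtype=np.float32).reshape(2, 4, 3)  # (B=2, T=4, D=3)
start = np.array([[0, 1], [1, 2]])                    # (B=2, T'=2)
print(slice_nd_reference(x, start, size=2).shape)     # (2, 2, 2, 3)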

mikel-zhobro avatar Jun 13 '21 11:06 mikel-zhobro

I don't think GatherLayer is a good base.

On Mon, Jun 14, 2021 at 11:00 AM Mikel Zhobro @.***> wrote:

https://github.com/rwth-i6/returnn/blob/6410df5eeb26cd9d77635a6e2af5a8330083198b/returnn/tf/layers/basic.py#L940

already offers the logic of slice_nd but is a little more general.

I plan to use GatherLayer as a base class and add the required constraints/checks for slice_nd in its __init__().


albertz avatar Jun 14 '21 14:06 albertz

I've added the TF implementation of the new slice_nd2 in util/basic and some tests. I haven't cleaned out the old ones yet.

Before implementing the RETURNN layer, I wanted to clarify whether we want to allow a slice_axis as an input parameter to the function, similar to GatherLayer. If we don't, the user has to make sure that the axes of input and start are in the same order.

I.e.

"""
  :param tf.Tensor input: shape (B, T1, ..., Tn, D)
  :param tf.Tensor start: shape (B, T1, ..., Tn-1), int32, which automatically indicates n as the slice axis
"""

mikel-zhobro avatar Jun 20 '21 14:06 mikel-zhobro

I don't really understand why you would want that. I would assume this layer (like GatherNdLayer) only makes sense like this here. Do not care about other cases, especially not if they don't really make sense.

Mikel Zhobro @.***> wrote on Fri, Jun 25, 2021, 15:17:


In tests/test_TFNetworkRecLayer.py https://github.com/rwth-i6/returnn/pull/543#discussion_r658756106:

  •  "window": {"class": "slice_nd2",    # no_opt: [B,4,D], opt: [B,T,4,D]
    
  •             "from": "base:data", "start": "data:source", "size": 4, "is_output_layer": True},
    

Is there any way to make it dependent? For example, by copying it into an intermediate layer or so?

Otherwise, I don't know how to make it work for all cases. Maybe by making sure that the input has one more time axis than start, or by tf.tile-ing it otherwise?


albertz avatar Jun 25 '21 14:06 albertz

Can you post here what the actual problem/error is when you use these test cases with the original slice_nd? It is not really clear to me why there is actually a problem. (Just post the error with stack trace.)

albertz avatar Jun 26 '21 21:06 albertz

(Just post the error with stack trace.)

Here is the error stack trace you asked for. The thing is that the slice_nd we had expects start to have only one batch axis [B], but if slice_nd gets pulled out of the loop, so does start, and instead of shape [B], start then has shape [B,T]. So it requires a loop over T. That's what I added with this commit.
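To illustrate the shape change with made-up numbers (hypothetical shapes, not the actual test config):

import numpy as np

B, T = 2, 5

# inside the loop, the layer sees one start index per sequence -> shape (B,)
start_inside = np.array([0, 1])
print(start_inside.shape)   # (2,)   -> what the old slice_nd expects

# once the layer is pulled out of the loop, start holds the whole sequence of
# per-frame start indices -> shape (B, T)
start_outside = np.tile(start_inside[:, None], (1, T)) + np.arange(T)[None, :]
print(start_outside.shape)  # (2, 5) -> needs an extra gather/loop over T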

mikel-zhobro avatar Jun 28 '21 09:06 mikel-zhobro