`get_out_data_from_opts` covers too much logic
The idea was simple: based on the inputs/kwargs, determine the output Data type (without actually computing the tensor).
This was mostly about dtype, shape and dim.
Over time, Data was extended by more and more logic such as handling beam search and special logic of the batch dim.
Much of this extra logic is the same for every layer, and still we need to duplicate the code/logic in every layer.
Examples:
- `out.beam = SearchBeam.get_combined_beam(...)`
- (`batch` is already post-processed in `TFNetwork._create_layer`)
- Explicit `out_type` or `n_out` might overwrite or explicitly set some attribs. (#542) This can be needed in recurrent constructions such as `x: {class: eval, from: "prev:x", eval: "source(0) + 1"}` where you might want to set some custom `dtype` or so. Currently only `CopyLayer.get_out_data_from_opts` and the base `LayerBase.get_out_data_from_opts` handle this. Layers like `LinearLayer` do not have an own `get_out_data_from_opts` because the base logic covers this.
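To make the recurrent example concrete, here is a minimal sketch of how such a layer might be written as a Python net dict entry with an explicit `out_type`. The name `rec_unit` and the exact `out_type` keys chosen here are assumptions for illustration; which keys a layer honors is exactly the inconsistency discussed in this issue.

```python
# Sketch of a unit dict for a recurrent layer (illustrative only).
# "x" depends on its own previous value, so the output type cannot be
# inferred from the inputs alone; out_type supplies it explicitly.
rec_unit = {
    "x": {
        "class": "eval",
        "from": "prev:x",
        "eval": "source(0) + 1",
        # Assumption: we want a scalar integer counter per batch entry.
        "out_type": {"dtype": "int32", "shape": ()},
    },
}
```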
We might want to decouple this logic:
- One function which computes `dtype`, `shape` & `dim` & `size_placeholder` (or maybe `dtype` also separated). (In most cases, sizes (`size_placeholder`) would just be copied. In rarer cases, new sizes could be introduced, like in `ConvLayer` etc. I'm not sure if this needs yet another separate logic.)
- `beam` is almost always `SearchBeam.get_combined_beam` of all deps (inputs `Data`, layers, targets), except for layers like `ChoiceLayer`
- `batch` is almost always `BatchInfo.get_common_batch_info` of all deps
- One function which handles the logic of custom overwrites by `out_type` or `n_out` (#542)
I'm not exactly sure what it would look like. Maybe the function for dtype/shape/sizes could also just return a `Data` but not care about beam/batch.
Maybe there could then be separate functions `LayerBase.get_out_beam` and `LayerBase.get_out_batch`, and only those layers which do something non-standard would overwrite them.
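A rough sketch of how this split could look. Everything here is a simplified stand-in, not the actual RETURNN API: the `Data` dataclass, `first_not_none` (a toy replacement for `SearchBeam.get_combined_beam` / `BatchInfo.get_common_batch_info`), `get_out_template` and `ChoiceLikeLayer` are hypothetical names; only `get_out_beam` and `get_out_batch` are the function names proposed above.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Data:
    """Simplified stand-in for the real Data class (hypothetical)."""
    dtype: str
    shape: Tuple[Optional[int], ...]
    beam: Optional[str] = None   # stand-in for a SearchBeam
    batch: Optional[str] = None  # stand-in for a BatchInfo


def first_not_none(values):
    """Toy combine rule: return the first value that is not None."""
    for value in values:
        if value is not None:
            return value
    return None


class LayerBase:
    @classmethod
    def get_out_template(cls, sources: Sequence[Data], **kwargs) -> Data:
        """Layer-specific part: dtype/shape only. Default copies the first source."""
        return Data(dtype=sources[0].dtype, shape=sources[0].shape)

    @classmethod
    def get_out_beam(cls, sources: Sequence[Data], **kwargs):
        """Default: combine the beams of all deps."""
        return first_not_none(src.beam for src in sources)

    @classmethod
    def get_out_batch(cls, sources: Sequence[Data], **kwargs):
        """Default: combine the batch info of all deps."""
        return first_not_none(src.batch for src in sources)

    @classmethod
    def get_out_data_from_opts(cls, sources, out_type=None, **kwargs) -> Data:
        """Generic part, written once: template, then beam/batch, then overwrites."""
        out = cls.get_out_template(sources, **kwargs)
        out = replace(
            out,
            beam=cls.get_out_beam(sources, **kwargs),
            batch=cls.get_out_batch(sources, **kwargs),
        )
        if out_type:
            out = replace(out, **out_type)  # explicit user overwrites come last
        return out


class ChoiceLikeLayer(LayerBase):
    """A layer with non-standard beam logic overrides only get_out_beam."""

    @classmethod
    def get_out_beam(cls, sources, beam_name="new-beam", **kwargs):
        return beam_name


src = Data(dtype="float32", shape=(None, 5), beam="beam-a", batch="batch-0")
default_out = LayerBase.get_out_data_from_opts([src], out_type={"dtype": "int32"})
choice_out = ChoiceLikeLayer.get_out_data_from_opts([src])
```

With this structure, `default_out` keeps the beam and batch of its source while `dtype` is overwritten, and `choice_out` differs only in its beam. Most layers would implement just the template function.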
This is a bit open for discussion. The main purpose is to simplify the code, to make it more straight-forward, and to make it more consistent for edge cases.
E.g. currently when you specify `out_type`, some layers would just ignore it, some layers would at least check it, some layers would use the information to overwrite the output. (#542)
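One possible uniform policy, sketched under assumptions: a single helper (here called `apply_out_type`, a hypothetical name) that every layer goes through, which overwrites a small set of attribs and checks all others against the inferred values. Which attribs should be overwritable is an open design question; `("dtype",)` is only a placeholder.

```python
def apply_out_type(inferred: dict, out_type: dict, overridable=("dtype",)) -> dict:
    """Merge a user-given out_type into the inferred output attribs.

    Attribs listed in `overridable` are overwritten; all others are only
    checked, so a mismatch is reported instead of being silently ignored.
    """
    result = dict(inferred)
    for key, value in out_type.items():
        if key not in inferred:
            raise KeyError("unknown out_type attrib %r" % (key,))
        if key in overridable:
            result[key] = value
        elif inferred[key] != value:
            raise ValueError(
                "out_type %s=%r does not match inferred %r" % (key, value, inferred[key])
            )
    return result


inferred = {"dtype": "float32", "dim": 5}
merged = apply_out_type(inferred, {"dtype": "int32", "dim": 5})
```

Here `dtype` is overwritten and `dim` is only verified; passing `{"dim": 7}` would raise a `ValueError`.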