dml

Results 3 issues of dml

### Description

System and software:

- fastertransformer version: v5.0
- GPU: T4
- Swin-Transformer: e0486b2cf8c63b6314570a43007569c8aa9b4578
- CUDA: 11.0

### Error Message

1. Got `nan` in FP16 inference of swintransformer_op: `FP16_torch_traced_output...

bug

`window_reverse` does not support a dynamic batch size because it casts the first dimension of `windows` to an integer.
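One possible way around the cast described above is to let the reshape infer the batch dimension instead of computing it as a Python integer. The following is only a sketch, assuming `windows` has shape `(num_windows * B, window_size, window_size, C)` as in the reference Swin Transformer code. `window_reverse_dynamic` is a hypothetical name, and `window_partition` is included only to check the round trip.

```python
import torch


def window_partition(x, window_size):
    # Split (B, H, W, C) into non-overlapping windows:
    # result shape is (num_windows * B, window_size, window_size, C).
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)


def window_reverse_dynamic(windows, window_size, H, W):
    # Hypothetical variant: the batch size is never computed with int(...).
    # Passing -1 lets view() infer it from the tensor's size at run time.
    C = windows.shape[-1]
    x = windows.view(-1, H // window_size, W // window_size, window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, H, W, C)


if __name__ == "__main__":
    for batch in (1, 3):
        x = torch.arange(batch * 8 * 8 * 2, dtype=torch.float32).view(batch, 8, 8, 2)
        restored = window_reverse_dynamic(window_partition(x, 4), 4, 8, 8)
        print(batch, tuple(restored.shape), torch.equal(restored, x))
```

Because no batch size is fixed as a constant inside the function, the same code handles any batch size in eager mode. Whether this is enough for a particular tracing or export path has not been verified here.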

### Motivation

Our business model (Internvl 2-26B) outputs very few tokens (1-2 tokens) after prompt optimization, so inference is effectively only the prefill stage. Therefore, we hope to use...