TensorRT
What is the input shape of bertQKVToContextPlugin?
The comment at https://github.com/NVIDIA/TensorRT/blob/main/plugin/bertQKVToContextPlugin/qkvToContextPlugin.cpp#L155
says that the input shape is [B, S, 3*N*H] or [B, S, 3*E]; but the README.md file says the input shape is [S, B, 3*E, 1, 1].
Do the two descriptions conflict with each other, or have I misunderstood something here? Which should we put at the outermost dimension, batch size or max sequence length?
@ttyio ^^
@yuc8939, the input shape is [S, B, 3*E, 1, 1] because we use conv1x1 in the builder. We also plan to rework this to use an MM layer, so that the plugin accepts [S, B, 3*E]. Until then, please use [S, B, 3*E, 1, 1], thanks!
@ttyio Thanks! And which should we put at the outermost dimension, B or S?
@yuc8939, use S as the outermost dimension. Here is sample code to follow:
https://github.com/NVIDIA/TensorRT/blob/main/demo/BERT/builder.py#L385
thanks!
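For readers who want to sanity-check the layout, here is a minimal sketch in plain Python of the shape described above. It assumes E = N * H, as the plugin comment implies. The helper names `qkv_plugin_input_shape` and `to_sequence_major` are hypothetical and are not part of the TensorRT API; they only illustrate the [S, B, 3*E, 1, 1] layout and the swap from batch-major to sequence-major order.

```python
def qkv_plugin_input_shape(seq_len, batch_size, num_heads, head_size):
    """Return the [S, B, 3*E, 1, 1] shape described in this thread.

    Hypothetical helper for illustration only, assuming E = N * H.
    """
    hidden_size = num_heads * head_size  # E = N * H
    return (seq_len, batch_size, 3 * hidden_size, 1, 1)


def to_sequence_major(batch_major):
    """Swap the two outer axes of nested lists: [B, S, ...] -> [S, B, ...]."""
    return [list(step) for step in zip(*batch_major)]


if __name__ == "__main__":
    # S=128, B=8, N=12, H=64 gives E=768 and 3*E=2304.
    print(qkv_plugin_input_shape(128, 8, 12, 64))  # (128, 8, 2304, 1, 1)

    # Two sequences (B=2) of three tokens each (S=3), labelled for clarity.
    batch_major = [["b0s0", "b0s1", "b0s2"], ["b1s0", "b1s1", "b1s2"]]
    print(to_sequence_major(batch_major))
```

The second helper only shows which axis ends up outermost; in a real network the transpose and the trailing 1x1 dimensions would be produced by layers in the builder, as in the linked demo code.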
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!