bhsueh_NV
Hi, dontloo. Thank you for the feedback, and sorry for the confusion. We have removed this feature but did not update the documentation. We will fix it in the next update.
> [bert_guide.md](https://github.com/NVIDIA/FasterTransformer/blob/main/docs/bert_guide.md) section 1.1.2 mentioned this allow_gemm_test flag, but it seems this flag is not effective and not used in the bertExample method in [bert_example.cc](https://github.com/NVIDIA/FasterTransformer/blob/6fddeac5f59ce4df380002aa945da57a0c8e878c/examples/cpp/bert/bert_example.cc#L71). > > Output shows "using...
Closing this bug because it is inactive. Feel free to re-open this issue if you still have any problems.
Does it support a Python API? In a way similar to trtexec, can I transfer the engine model and do inference?
There is no pure Python API. If you want to run it from Python, you need to use the PyTorch/TensorFlow custom ops.
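For illustration, a minimal sketch of how the PyTorch route typically looks. The `.so` path and the class namespace are assumptions here, not the repo's confirmed API; check your build output and the PyTorch examples in the repository.

```python
# A minimal sketch, not the official API: load the compiled FasterTransformer
# PyTorch custom-op library so its TorchScript classes become callable from
# Python. The library path below is an assumption; use your actual build path.
import torch

# Registers the custom TorchScript classes/ops contained in the library.
torch.classes.load_library("build/lib/libth_transformer.so")

# After loading, the registered classes are exposed under
# torch.classes.<Namespace>.<Class> and can be constructed and called
# from ordinary Python/PyTorch code like any other TorchScript class.
```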
Closing this bug because it is inactive. Feel free to re-open this issue if you still have any problems.
Can you provide the CMake version and CUDA version you use? It seems some problems are caused by the compiler.
Did you try the PyTorch docker image we recommend?
I cannot compile the code successfully. Even if I fix the issue, I get wrong results when I run with hidden_dim > 1024. How do you verify the correctness?
[Here](https://github.com/NVIDIA/FasterTransformer/blob/main/sample/tensorflow/unit_test/bert_encoder_unit_test.py) is a simple unit test. You can add some cases with hidden_dimension > 1024 to the unit test; see the sketch below. The requests in #104 are supported in the [next beta version](https://github.com/NVIDIA/FasterTransformer/tree/dev/v5.0_beta).
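For what it's worth, the verification pattern in that unit test is to run the op under test and a reference implementation on the same inputs and assert the maximum absolute difference stays within a tolerance. Below is a minimal, self-contained sketch of that pattern with hidden_dim > 1024; the reference layernorm stands in for the real FasterTransformer op so the example runs standalone, and all names, shapes, and tolerances are illustrative, not the actual harness.

```python
# A minimal sketch of the correctness-check pattern: compare the op under
# test against a reference implementation within a tolerance. Here the
# reference layernorm doubles as the "op under test" so the sketch runs
# standalone; in the real test you would swap in the FasterTransformer op.
import unittest
import numpy as np

def reference_layernorm(x, gamma, beta, eps=1e-6):
    # Plain NumPy layer normalization over the last (hidden) dimension.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

class LargeHiddenDimTest(unittest.TestCase):
    def test_hidden_dim_1536(self):
        hidden_dim = 1536  # > 1024, the regime where wrong results were seen
        rng = np.random.default_rng(0)
        x = rng.standard_normal((4, 32, hidden_dim)).astype(np.float32)
        gamma = np.ones(hidden_dim, dtype=np.float32)
        beta = np.zeros(hidden_dim, dtype=np.float32)
        op_under_test = reference_layernorm  # stand-in for the real custom op
        out = op_under_test(x, gamma, beta)
        ref = reference_layernorm(x, gamma, beta)
        # Assert the max absolute difference is within a small tolerance.
        self.assertLess(float(np.abs(out - ref).max()), 5e-3)

if __name__ == "__main__":
    unittest.main()
```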
For your request and the BERT model, it should be stable. We release it as a beta version because: 1. We may still break the API in the near future. 2. We still have not...