Bharat Venkitesh
I see that there is full int8 support (both weights and activations) for BERT, but it's not clear to me what is supported for GPT models ([here](https://github.com/NVIDIA/FasterTransformer/blob/main/examples/pytorch/gpt/utils/parallel_gpt.py#L28)). Ideally if we can...
When calculating the log likelihood of the token at position i, we should consider the logits at step i-1; also, the log likelihood of the starting token is undefined (can be set...
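The shift described above can be sketched in a few lines. This is a minimal illustration, not code from FasterTransformer: `token_log_likelihoods` is a hypothetical helper, and it assumes `logits[i]` holds the model's prediction for the token at position i+1, so the score of `token_ids[i]` is read from `logits[i-1]` and the first token gets a placeholder value.

```python
import numpy as np

def token_log_likelihoods(logits, token_ids):
    """Per-token log likelihoods using the step i-1 logits.

    logits:    array of shape (seq_len, vocab_size); logits[i] is the
               model's prediction for the token at position i+1.
    token_ids: sequence of length seq_len with the actual token ids.

    The log likelihood of token_ids[i] comes from logits[i-1]; the
    starting token has no defined likelihood, so 0.0 is used here as
    an arbitrary placeholder.
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

    out = [0.0]  # placeholder for the undefined starting token
    for i in range(1, len(token_ids)):
        out.append(float(log_probs[i - 1, token_ids[i]]))
    return out

# Example: uniform logits over a 2-token vocabulary, so every
# predicted token has probability 0.5 (log likelihood log 0.5).
scores = token_log_likelihoods([[0.0, 0.0], [0.0, 0.0]], [0, 1])
```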
Noticed a small drop in performance (
I could not find this in the docs; adding token streaming support during generation for GPT models would be great.
Versions: TensorFlow 2.3.0-rc1, CUDA 10, TensorRT 6. I am trying to convert a GPT-2 model; the saved model size is about 1.9 GB. It causes an issue when I try to use TF...