Bharat Venkitesh comments

Results: 18 comments of Bharat Venkitesh

training (COCO)

Is the issue with glove.6B.zip? If that is the case: 1) delete the zip from .vector_cache/, 2) run the script again, and if the error persists, 3) download it manually and...
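A minimal sketch of step 1, assuming the error comes from an incomplete or corrupt download (the comment does not say this). The helper name `remove_if_corrupt` is hypothetical; it only uses the standard-library `zipfile` and `pathlib` modules and deletes the archive only when it fails an integrity check.

```python
import zipfile
from pathlib import Path


def remove_if_corrupt(zip_path):
    """Delete zip_path if it exists but is not a readable zip archive.

    Returns True if the file was deleted, False otherwise.
    (Hypothetical helper, not part of any library.)
    """
    path = Path(zip_path)
    if not path.exists():
        return False  # nothing cached, so the next run will download it
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the first corrupt member name, or None.
            bad_member = archive.testzip()
    except zipfile.BadZipFile:
        bad_member = path.name  # not a zip at all, e.g. a truncated download
    if bad_member is None:
        return False  # archive looks intact, keep it
    path.unlink()
    return True
```

Calling `remove_if_corrupt(".vector_cache/glove.6B.zip")` before re-running the script would then force a fresh download only when the cached file is actually damaged.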

Possible Bug in Context Likelihood

If the input is `N` tokens, the log likelihood will have `N-1` values, as explained below. The log likelihood of a token is defined as the log of the probability of predicting that...
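The `N-1` count can be illustrated with a small sketch. It assumes the usual autoregressive convention, in which the logits at position `t` score the token at position `t+1`, so the first token has no prediction and contributes no value. The function name and array layout are assumptions for illustration, not the code discussed in the issue.

```python
import numpy as np


def token_log_likelihoods(logits, token_ids):
    """Return log p(token[t+1] | tokens[0..t]) for t = 0 .. N-2.

    logits: array of shape (N, vocab_size); row t is the model output
            after reading tokens[0..t] (assumed layout).
    token_ids: sequence of N integer token ids.
    """
    logits = np.asarray(logits, dtype=np.float64)
    token_ids = np.asarray(token_ids)
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Row t predicts token t+1, so pair rows 0..N-2 with tokens 1..N-1.
    positions = np.arange(len(token_ids) - 1)
    return log_probs[positions, token_ids[1:]]
```

For an input of 4 tokens this returns 3 values; the last row of logits is unused because there is no following token to score.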

INT8 Support for GPT models

Thanks for getting back. 1. Can you elaborate on why there is an overhead in the GPT case while the BERT numbers suggest gains? 2. Even if there are overheads, it would...

INT8 Support for GPT models

> The cost of using QAT on GPT model is high, so this feature is still under discussion. When you say cost here, do you mean the training cost and not...

INT8 Support for GPT models

> > When you say cost here, do you mean the training cost and not inference cost? Is there any inference cost (overhead) that may occur for int8 GPT inference? >...

sampling using a single top_k_top_p kernel for all cases

These are the sampling steps we would like to implement:
```
sort logits -> sorted_logits, sorted_logits_id
if k=0, p>0 {
    softmax sorted_logits
    sample from distribution -> inverse transform sampling, U[0,1]
    return...
```
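The steps listed above are cut off, so the following is only a sketch of how a single routine could cover top-k, top-p, and combined sampling. It is a NumPy illustration, not the FasterTransformer kernel. The conventions that `k == 0` and `p == 0.0` disable the respective filters, and the nucleus cutoff rule, are assumptions.

```python
import numpy as np


def sample_top_k_top_p(logits, k=0, p=0.0, rng=None):
    """Sample one token id from logits, restricted by top-k and/or top-p.

    k == 0 disables the top-k filter and p == 0.0 disables the top-p
    filter (assumed conventions for this sketch).
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=np.float64)

    # sort logits -> sorted_logits, sorted_logits_id (descending order)
    sorted_ids = np.argsort(-logits, kind="stable")
    sorted_logits = logits[sorted_ids]

    # Top-k: keep only the k largest logits.
    if k > 0:
        sorted_logits = sorted_logits[:k]

    # Softmax over the surviving logits (max subtracted for stability).
    probs = np.exp(sorted_logits - sorted_logits[0])
    probs /= probs.sum()

    # Top-p: keep the shortest prefix whose cumulative mass reaches p.
    if p > 0.0:
        cutoff = int(np.searchsorted(np.cumsum(probs), p)) + 1
        probs = probs[:cutoff] / probs[:cutoff].sum()

    # Inverse transform sampling with u drawn from U[0, 1).
    u = rng.random()
    index = int(np.searchsorted(np.cumsum(probs), u, side="right"))
    index = min(index, len(probs) - 1)  # guard against rounding in cumsum
    return int(sorted_ids[index])
```

Sorting once up front is what lets both filters become simple prefix slices, which is the property that makes a single code path for all (k, p) combinations practical.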

sampling using a single top_k_top_p kernel for all cases

We were able to solve it; the issue can be closed. https://github.com/NVIDIA/FasterTransformer/issues/234#issuecomment-1195532887

Drop in Performance with TP>1, batch size and long generations

We also analysed batching; we find that for batch >8, the accuracy is a bit stochastic between tests, whereas for batch

Drop in Performance with TP>1, batch size and long generations

We built on top of the dev_5.0 branch. Are there any changes in main that differ from the dev_5.0 branch? Important changes we made are 1. We get the logits...

Drop in Performance with TP>1, batch size and long generations

Thanks for the quick reply! I will try merging with main and test it!