Bharat Venkitesh

Results 18 comments of Bharat Venkitesh

Is the issue with glove.6B.zip? If that is the case: 1) Delete the zip from .vector_cache/. 2) Run the script again. If the error persists, 3) download it manually and...

If the input is `N` tokens, the log likelihood will have `N-1` values, as explained below. The log likelihood of a token is defined as the log of the probability of predicting that...
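
A minimal sketch of the `N-1` point, assuming a causal language model where row `i` of the logits scores the token at position `i+1` given everything before it. The function name `token_log_likelihoods` and the toy uniform logits are made up for illustration and do not come from any particular library. The first token has no preceding context, so it gets no score.

```python
import math

def token_log_likelihoods(token_ids, logits):
    """Return log p(token[i+1] | token[0..i]) for i = 0 .. N-2.

    Assumption: logits[i] is the list of vocabulary scores the model
    produces after reading token_ids[0..i].
    """
    lls = []
    for i in range(len(token_ids) - 1):
        row = logits[i]
        # log-softmax, computed stably via the max trick
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        lls.append(row[token_ids[i + 1]] - log_z)
    return lls

# Toy input: N = 4 tokens, vocabulary of 3, uniform (all-zero) logits.
tokens = [2, 0, 1, 2]
toy_logits = [[0.0, 0.0, 0.0] for _ in tokens]
lls = token_log_likelihoods(tokens, toy_logits)
print(len(tokens), len(lls))
```

The last row of logits is unused here, since it would score a token that is not part of the input.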

Thanks for getting back. 1. Can you elaborate on why there is an overhead in the GPT case while the BERT numbers suggest gains? 2. Even if there are overheads, it would...

> The cost of using QAT on GPT model is high, so this feature is still under discussion. When you say cost here, you mean the training cost and not...

> > When you say cost here, you mean the training cost and not inference cost? Is there any inference cost (overhead) that may occur for int8 GPT inference? >...

These are the sampling steps we would like to implement ``` sort logits-> sorted_logits, sorted_logits_id if k=0, p>0 { softmax sorted_logits sample from distribution -> inverse transform sampling, U[0,1] return...
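
A rough Python sketch of the visible part of these steps (sort, softmax, inverse transform sampling with a uniform draw) for the `k=0, p>0` branch. The nucleus cutoff (keeping the smallest sorted prefix whose probability mass reaches `p`) is an assumption, since the pseudocode above is cut off before that point, and `sample_top_p` is a hypothetical name, not a FasterTransformer function.

```python
import math
import random

def sample_top_p(logits, p, rng=random):
    """Sample a token id from the top-p nucleus of `logits`."""
    # sort logits -> sorted_logits, sorted_logits_id (descending)
    sorted_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    sorted_logits = [logits[i] for i in sorted_ids]

    # softmax over the sorted logits, stabilised by subtracting the max
    m = sorted_logits[0]
    exps = [math.exp(x - m) for x in sorted_logits]
    z = sum(exps)
    probs = [e / z for e in exps]

    # Assumed top-p rule: keep the smallest prefix with mass >= p.
    cutoff, cum = len(probs), 0.0
    for j, q in enumerate(probs):
        cum += q
        if cum >= p:
            cutoff = j + 1
            break

    # Inverse transform sampling: draw u ~ U[0, 1), scale it to the kept
    # mass, and return the first id whose cumulative probability exceeds u.
    kept_mass = sum(probs[:cutoff])
    u = rng.random() * kept_mass
    acc = 0.0
    for j in range(cutoff):
        acc += probs[j]
        if u < acc:
            return sorted_ids[j]
    return sorted_ids[cutoff - 1]  # guard against floating-point round-off

# One token dominates, so a nucleus of p=0.5 contains only that token.
picked = sample_top_p([0.0, 5.0, 1.0], p=0.5)
```

Scaling `u` by the kept mass is equivalent to renormalising the truncated distribution before sampling.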

We were able to solve it, can close the issue. https://github.com/NVIDIA/FasterTransformer/issues/234#issuecomment-1195532887

We also analysed batching. We find that for batch >8, the accuracy is a bit stochastic between tests, whereas for batch

We built on top of the dev_5.0 branch. Are there any changes in main that differ from the dev_5.0 branch? The important changes we made are 1. We get the logits...

Thanks for the quick reply! I will try merging with main and test it!