Rémi Louf

533 comments by Rémi Louf

I profiled the logits processing code, and the bottleneck is the transfer of the list of allowed token ids (which can have many elements) to the GPU. My suggestion is to use...
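For context, here is a minimal sketch of the masking step in which a list of allowed token ids is consumed. It is plain Python on the CPU, not the library's actual code, and the names `mask_logits`, `logits`, and `allowed_token_ids` are hypothetical. In a GPU implementation the id list would first have to be copied to the device, which is the transfer the comment identifies as the bottleneck. The suggestion itself is cut off in the excerpt, so it is not illustrated here.

```python
import math


def mask_logits(logits, allowed_token_ids):
    """Return a copy of `logits` where every token id that is not in
    `allowed_token_ids` is set to -inf, so it can never be sampled."""
    allowed = set(allowed_token_ids)  # set gives O(1) membership checks
    return [
        value if token_id in allowed else -math.inf
        for token_id, value in enumerate(logits)
    ]


# Hypothetical vocabulary of 4 tokens; only ids 1 and 3 are allowed.
logits = [0.5, 1.2, -0.3, 2.0]
masked = mask_logits(logits, [1, 3])
```

The cost of this step grows with the number of allowed ids, which is why a long list is expensive to build and move on every generation step.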

Could you share the link to the forked repo?

@cpfiffer could this be (partially) solved by #1408? As discussed somewhere else, the true probabilities are obtained by summing over all the token words that correspond to each of the...
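The summation described above can be sketched as follows. The comment is cut off before it says what the tokens are grouped by, so the grouping used here (a hypothetical `token_to_group` mapping from token strings to outcome labels) is an assumption, as are the function name and the example probabilities.

```python
from collections import defaultdict


def aggregate_probabilities(token_probs, token_to_group):
    """Sum the probability of every token that maps to the same group.

    Tokens that have no entry in `token_to_group` are ignored.
    """
    totals = defaultdict(float)
    for token, prob in token_probs.items():
        group = token_to_group.get(token)
        if group is not None:
            totals[group] += prob
    return dict(totals)


# Hypothetical next-token distribution and grouping.
token_probs = {"Yes": 0.25, " yes": 0.25, "No": 0.125, " no": 0.125, "maybe": 0.25}
token_to_group = {"Yes": "yes", " yes": "yes", "No": "no", " no": "no"}
totals = aggregate_probabilities(token_probs, token_to_group)
```

Looking at a single token per group would understate its probability whenever several surface forms (different casing or leading whitespace) correspond to the same group.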