Rémi Louf
Results: 533 comments by Rémi Louf
I profiled the logits processing code, and the bottleneck is the transfer of the list of allowed token ids (which can have many elements) to the GPU. My suggestion is to use...
Could you give us the address to the forked repo?
@cpfiffer could this be (partially) solved by #1408? As discussed elsewhere, the true probabilities are obtained by summing over all the token words that correspond to each of the...
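The summation mentioned in the last comment can be illustrated with a small sketch. It is not taken from the library or from #1408. The function name `choice_probabilities`, the token strings, the probabilities, and the mapping from each answer to its tokens are all hypothetical, and the sketch assumes the per-token log-probabilities are already available as a plain dictionary.

```python
import math

def choice_probabilities(token_logprobs, choice_to_tokens):
    """Sum the probabilities of all tokens mapped to each choice.

    token_logprobs: dict of token string -> log-probability (hypothetical input).
    choice_to_tokens: dict of choice label -> list of token strings that
    should count towards that choice (hypothetical mapping).
    """
    return {
        choice: sum(
            math.exp(token_logprobs[token])
            for token in tokens
            if token in token_logprobs  # skip tokens the model did not score
        )
        for choice, tokens in choice_to_tokens.items()
    }

# Made-up numbers: several surface forms map to the same answer.
token_logprobs = {
    "Yes": math.log(0.4),
    " yes": math.log(0.2),
    "No": math.log(0.3),
    " no": math.log(0.1),
}
choice_to_tokens = {"yes": ["Yes", " yes"], "no": ["No", " no"]}

probs = choice_probabilities(token_logprobs, choice_to_tokens)
```

Looking only at the single most likely token would understate the probability of "yes" in this toy case, since its mass is split across two token spellings.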