Maxime De Bruyn
Is any timing available on this?
Indeed! An easy fix would be to remove the step size `self.batch_size`?
Hi Carlos, > Thanks for integrating SPLADE using CSR matrices! I will be running it on my side and will let you know if it matches the numbers we have...
Really cool stuff! The multi-GPU encoding is a super cool feature :-)
Did you try reducing the model size?
See Annex C > In the cross-attention module, inputs are first processed with layer norm (Ba et al., 2016) before being passed through linear layers to produce each of the...