zabzug-pfpt
I do want the characters to be merged together, as an intermediate step on the way towards a larger subword. I want to retain that larger subword but remove some...
Well, that's certainly disappointing to hear. Would you mind pointing me towards where in the codebase I'd need to dive in if I wanted to implement this myself?
Also, this could easily be implemented at training time. When combining subwords ABC and DEF to add subword ABCDEF, just compare the frequencies of ABCDEF, ABC, and DEF (which should...
Why is it necessary to store ABC and DEF? Those are the subwords that I want dropped from my vocabulary (that is, assuming that the only times I observe ABC...
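To make the idea concrete, here is a minimal sketch of the pruning step described above, assuming a toy BPE-style trainer that tracks each subword's standalone frequency (occurrences not already covered by a longer subword). The `vocab_freqs` bookkeeping and the `prune_absorbed_parts` helper are hypothetical names, not part of any real tokenizer library:
```{python}
from collections import Counter

def prune_absorbed_parts(vocab_freqs: Counter, left: str, right: str,
                         min_standalone: int = 0) -> None:
    """After merging `left` + `right` into a new subword, drop `left`/`right`
    from the vocabulary if they no longer occur on their own.

    `vocab_freqs` maps each subword to its standalone frequency, i.e. how
    often it appears NOT inside the merged subword. (Hypothetical
    bookkeeping; real BPE trainers track pair counts, not this directly.)
    """
    for part in (left, right):
        # If (almost) every occurrence of `part` was absorbed into the
        # merged subword, its standalone count drops to ~0 and we remove it.
        if vocab_freqs.get(part, 0) <= min_standalone:
            vocab_freqs.pop(part, None)

# Toy example: every "ABC" and "DEF" in the corpus was part of "ABCDEF",
# so both intermediates are dropped while unrelated subwords survive.
freqs = Counter({"ABCDEF": 120, "ABC": 0, "DEF": 0, "AB": 35})
prune_absorbed_parts(freqs, "ABC", "DEF")
assert "ABC" not in freqs and "DEF" not in freqs and "AB" in freqs
```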
@KevinMusgrave it looks like the `EmbeddingsAlreadyPackagedAsTriplets` miner will trivially fail for batch sizes that aren't multiples of 3:
```{python}
batch_size = embeddings.size(0)
a = torch.arange(0, batch_size, 3)
p = torch.arange(1, batch_size, 3)
n = torch.arange(2, batch_size, 3)
```
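To make the failure mode concrete, here is a standalone sketch (outside the library) showing the mismatched index lengths for a batch size of 7:
```{python}
import torch

batch_size = 7  # not a multiple of 3
a = torch.arange(0, batch_size, 3)  # tensor([0, 3, 6]) -> length 3
p = torch.arange(1, batch_size, 3)  # tensor([1, 4])    -> length 2
n = torch.arange(2, batch_size, 3)  # tensor([2, 5])    -> length 2

# The anchor index tensor is longer than the positive/negative tensors,
# so anything that zips (a, p, n) together will misalign or error out.
print(len(a), len(p), len(n))  # 3 2 2
```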