LaTeX-OCR
beam search
The code is taken and modified from Hugging Face Transformers. It does not improve the score much, only a tiny (~0.1) positive fluctuation.
usage
Switch the return statement in LaTeX-OCR/pix2tex/models/utils.py:
```python
@torch.no_grad()
def generate(self, x: torch.Tensor, temperature: float = 0.25, **kwargs):
    args = Munch(self.args)
    args.update(kwargs)
    args.temperature = temperature
    # uncomment the next line (and comment out the one below it) to decode with beam search
    # return self.decoder.beam_generate((torch.LongTensor([self.args.bos_token]*len(x))[:, None]).to(x.device), context=self.encoder(x), seq_len=self.args.max_seq_len, **args)
    return self.decoder.generate((torch.LongTensor([self.args.bos_token]*len(x))[:, None]).to(x.device), context=self.encoder(x), seq_len=self.args.max_seq_len, **args)
```
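For context, here is a minimal sketch of what `beam_generate` does conceptually: keep the `num_beams` highest-scoring partial sequences at each step and rank finished ones with a length penalty. This is a toy standalone example, not the pix2tex or Transformers implementation.

```python
import math

def beam_search(step_log_probs, bos, eos, num_beams=3, max_len=10, length_penalty=0.7):
    """Minimal beam search over a toy model.

    step_log_probs(seq) returns {next_token: log_prob} for a partial sequence.
    """
    beams = [([bos], 0.0)]          # (token sequence, summed log-prob)
    finished = []
    for _ in range(max_len):
        # expand every live beam with every possible next token
        candidates = [(seq + [tok], score + lp)
                      for seq, score in beams
                      for tok, lp in step_log_probs(seq).items()]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:num_beams]:
            if seq[-1] == eos:
                # rank finished hypotheses with GNMT-style length normalization
                finished.append((seq, score / len(seq) ** length_penalty))
            else:
                beams.append((seq, score))
        if not beams:
            break
    return max(finished, key=lambda c: c[1])[0] if finished else beams[0][0]

# Toy vocabulary: 0 = BOS, 3 = EOS
def toy_model(seq):
    if seq[-1] == 0:
        return {1: math.log(0.6), 2: math.log(0.4)}
    return {3: 0.0}   # always end after one content token

best = beam_search(toy_model, bos=0, eos=3)   # picks the higher-probability branch
```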
and set two hyper-parameters in config.yaml:

```yaml
num_beams: 3
length_penalty: 0.7
```
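As a rough illustration of what `length_penalty` does (assuming the usual GNMT-style normalization `score / length ** penalty`, as in the Transformers implementation), a penalty below 1 keeps longer finished hypotheses competitive with short ones. The scores below are made-up numbers for illustration only.

```python
def normalized(score, length, length_penalty):
    # GNMT-style length normalization applied when ranking finished beams
    return score / (length ** length_penalty)

# Two hypothetical finished hypotheses: a short one and a longer one
short_score, short_len = -4.0, 4   # summed token log-probs, sequence length
long_score, long_len = -5.5, 8

# On raw scores the short hypothesis wins; with penalty 0.7 the longer
# hypothesis is no longer punished for its extra tokens.
raw_winner = 'short' if short_score > long_score else 'long'
norm_winner = ('short' if normalized(short_score, short_len, 0.7)
               > normalized(long_score, long_len, 0.7) else 'long')
```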
Thanks! I'll check it out. It seems to be quite slow, but that was to be expected. For some reason the method does not work super well with this model, because the model is often already pretty sure about its guess for the next token (I don't know why), so the performance increase is only very marginal.
There definitely needs to be an option to turn this on or off in the CLI and GUI.
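Such a toggle could be wired up with a simple flag. This is a hypothetical sketch; the flag names and defaults are assumptions, not the actual pix2tex CLI.

```python
import argparse

# Hypothetical CLI sketch: beam search off by default, since it is slower
parser = argparse.ArgumentParser(description='LaTeX-OCR inference (sketch)')
parser.add_argument('--beam-search', action='store_true',
                    help='decode with beam search instead of sampling (slower)')
parser.add_argument('--num-beams', type=int, default=3)
parser.add_argument('--length-penalty', type=float, default=0.7)

args = parser.parse_args(['--beam-search', '--num-beams', '5'])
```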