MatchZoo-py
BERT preprocessor
When I use the BERT preprocessor to transform my dataset, the following warning appears:
Token indices sequence length is longer than the specified maximum sequence length for this model (694 > 512). Running this sequence through the model will result in indexing errors.
But my dataset doesn't contain sequences that long! It also causes an error during training. Do you know how to solve this? Thanks! MatchZoo version 1.1.1
BERT has a maximum sequence length of 512 tokens, and I think some of your inputs exceed that. Check the token lengths. A quick solution: if you have a paragraph of 800 tokens, split it into two 400-token paragraphs and then tokenize each.
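The splitting idea above can be sketched as follows. This is a minimal, library-agnostic example; the function name `chunk_tokens` and the variable names are illustrative, not part of the MatchZoo API:

```python
# Sketch: split an over-long token sequence into consecutive chunks of at
# most `max_len` tokens before feeding it to a BERT model.

def chunk_tokens(tokens, max_len=512):
    """Split a token list into consecutive chunks of at most max_len items."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]

tokens = list(range(694))          # stands in for the 694-token sequence from the warning
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])    # [512, 182]
```

In practice you would run the model on each chunk separately (or truncate to 512 tokens if the tail can be dropped), since BERT's position embeddings only cover 512 positions.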
Hello, I'd like to ask: when I train a model with BERT, the final trainer.run() call keeps raising "Expected tensor for argument #1 'indices' to have scalar type Long; but got torch.cuda.IntTensor instead (while checking arguments for embedding)". I haven't been able to figure out what I should change. Could you help resolve this? Many thanks!
Please provide more details, e.g. code snippets.
The error occurs at the final trainer.run() step when running https://github.com/NTMC-Community/MatchZoo-py/blob/master/tutorials/ranking/bert.ipynb. Any guidance would be much appreciated, thank you!
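For reference, this error message generally means an `nn.Embedding` lookup received int32 token indices, while PyTorch expects int64 (Long). A minimal reproduction and fix in plain PyTorch, independent of MatchZoo internals (variable names here are illustrative):

```python
import torch

# An embedding table standing in for BERT's token embeddings.
embedding = torch.nn.Embedding(num_embeddings=100, embedding_dim=8)

# int32 indices are what trigger "Expected tensor ... scalar type Long".
indices = torch.tensor([[1, 2, 3]], dtype=torch.int32)

# Casting the index tensor to int64 with .long() before the forward
# pass resolves the error.
out = embedding(indices.long())
print(out.shape)  # torch.Size([1, 3, 8])
```

In a MatchZoo setup, the analogous fix would be to ensure the token-id tensors in each batch are cast with `.long()` before they reach the model.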