
AttributeError: module 'sacrebleu.tokenizers' has no attribute 'tokenizer_zh'

Open passionate11 opened this issue 1 year ago • 0 comments

🐛 Bug

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

  1. I fine-tuned my m2m100 model. I then tried to post-process the generated output with the command given at https://github.com/facebookresearch/fairseq/tree/main/examples/m2m_100 : cat ${fairseq}/gen_out | grep -P "^H" | sort -V | cut -f 3- | sh tok.sh fr > hyp
  2. It fails with AttributeError: module 'sacrebleu.tokenizers' has no attribute 'tokenizer_zh'. I have updated sacrebleu to 2.3.0, the same version as in https://github.com/mjpost/sacrebleu, and I can find tokenizer_zh in it.
  3. If I rewrite /fairseq/examples/m2m_100/tokenizers/tokenize_zh.py to the following, the problem is fixed:

         import fileinput
         from sacrebleu.tokenizers.tokenizer_zh import TokenizerZh
         tokenizer = TokenizerZh()
         for line in fileinput.input():
             print(tokenizer(line))

     But I still wonder: is this the right fix, or is there something wrong with how I ran it? I appreciate your help!
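
The rewrite in step 3 probably works because a `from sacrebleu.tokenizers.tokenizer_zh import ...` statement loads the submodule explicitly, while attribute access on the `sacrebleu.tokenizers` package only succeeds if something has already imported that submodule. This is an assumption about how sacrebleu 2.x lays out its package, not something confirmed in this thread. A minimal sketch of the same workaround, split into two helpers (`load_zh_tokenizer` and `tokenize_lines` are hypothetical names; `TokenizerZh` is taken from the snippet in step 3):

```python
import importlib
from typing import Callable, Iterable, Iterator


def load_zh_tokenizer() -> Callable[[str], str]:
    """Build the Chinese tokenizer by importing its submodule explicitly.

    Assumes sacrebleu exposes TokenizerZh in sacrebleu.tokenizers.tokenizer_zh,
    as in the snippet from step 3. Raises ImportError if sacrebleu is missing.
    """
    module = importlib.import_module("sacrebleu.tokenizers.tokenizer_zh")
    return module.TokenizerZh()


def tokenize_lines(lines: Iterable[str],
                   tokenize: Callable[[str], str]) -> Iterator[str]:
    """Apply `tokenize` to each input line, dropping the trailing newline."""
    for line in lines:
        yield tokenize(line.rstrip("\n"))
```

To use this as the stdin filter that tok.sh expects, the file could end with a loop such as `for out in tokenize_lines(fileinput.input(), load_zh_tokenizer()): print(out)` (after `import fileinput`). Keeping the tokenizer as a parameter makes the line handling testable without sacrebleu installed.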

passionate11, Apr 07 '23 02:04