ConvLab-2

spacy tokenizer

Open IreneSucameli opened this issue 2 years ago • 1 comment

Hi, is the spaCy tokenizer in the NLU module (specifically, jointBERT) downloaded every time the script is launched? If so, does the module download the latest version of the tokenizer each time, or is the version fixed? Thanks.

IreneSucameli, May 03 '22 14:05

the spacy tokenizer in the NLU module (specifically, jointBERT) is downloaded every time the script is launched?

No, I think it is downloaded only the first time. Once the model has been downloaded, self.nlp = spacy.load("en_core_web_sm") will not raise an error, so the download is not triggered again.

https://github.com/thu-coai/ConvLab-2/blob/ad32b76022fa29cbc2f24cbefbb855b60492985e/convlab2/nlu/jointBERT/multiwoz/nlu.py#L60-L67

zqwerty, May 05 '22 01:05
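
For readers who want to see the "download only the first time" behaviour in code, here is a minimal sketch of the load-then-fall-back pattern the answer describes. It is not the ConvLab-2 code at the linked lines. The names load_or_download and load_spacy_model are hypothetical, and the sketch assumes that a missing model makes the load call raise OSError, as spacy.load does for a model that is not installed.

```python
from typing import Any, Callable


def load_or_download(name: str,
                     load: Callable[[str], Any],
                     download: Callable[[str], None]) -> Any:
    """Try to load a model; if it is missing, download it once and retry."""
    try:
        # Succeeds on every run after the model has been installed.
        return load(name)
    except OSError:
        # Assumption: a missing model surfaces as OSError.
        download(name)
        return load(name)


def load_spacy_model(name: str = "en_core_web_sm") -> Any:
    """Hypothetical wrapper that wires the pattern to spaCy (needs spaCy installed)."""
    import spacy
    from spacy.cli import download

    return load_or_download(name, spacy.load, download)
```

With this pattern the installed model stays fixed until someone reinstalls or upgrades it; nothing is fetched again on later runs. If spaCy is installed, the version actually in use can be inspected with load_spacy_model().meta["version"]. In some environments a freshly downloaded model may only become loadable after the Python process is restarted.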