depccg

Customising tokenization

Open itlchriss opened this issue 1 year ago • 0 comments

Hello author,

I noticed that there is a config_en.jsonnet, which includes several en.jsonnet files specifying many tokens and CCG rules. I have two questions:

  1. If I want to customise the tokenizer, do I need to retrain the model after modifying these files?
  2. Is the number of tokens in tokens.en.jsonnet related to the number of targets in targets.en.json?

Thanks and Best Regards, Chriss IT. Leong

itlchriss, Sep 07 '22 07:09