DialoFlow
About the data tokenizer
Hello, I see that the token sequence described in the paper is: [u1] [C] [u2] [C] [res] [C]
but the sequence built by the dataset code is: [speaker1] [u1] [eos] [speaker2] [u2] [eos] [bos] [res] [eos]
Is there any difference between the two? Which one works better?
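
To make the comparison concrete, here is a minimal sketch of the two layouts as I understand them. The function names (`paper_format`, `code_format`) and the literal token strings are my own placeholders and are not taken from the repo. Each utterance is treated as a single item rather than being split into real subword tokens.

```python
from typing import List, Sequence


def paper_format(context: List[str], response: str, sep: str = "[C]") -> List[str]:
    """Layout as described in the paper: every utterance is followed by [C]."""
    seq: List[str] = []
    for utt in context + [response]:
        seq.extend([utt, sep])
    return seq


def code_format(
    context: List[str],
    response: str,
    speakers: Sequence[str] = ("[speaker1]", "[speaker2]"),
    bos: str = "[bos]",
    eos: str = "[eos]",
) -> List[str]:
    """Layout as seen in the dataset code: alternating speaker tag before each
    context utterance, [eos] after it, and the response wrapped in [bos] ... [eos]."""
    seq: List[str] = []
    for i, utt in enumerate(context):
        # Assumption: speakers simply alternate turn by turn.
        seq.extend([speakers[i % len(speakers)], utt, eos])
    seq.extend([bos, response, eos])
    return seq


context = ["u1", "u2"]
print(paper_format(context, "res"))
print(code_format(context, "res"))
```

In this sketch the only structural differences are the speaker tags, the `[bos]` marker in front of the response, and the use of `[eos]` instead of `[C]` as the end-of-utterance marker.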