
len_doc and encoding

Open TheAzouz opened this issue 3 years ago • 0 comments

Hello,

I would like to point out two issues I ran into when working with the wikIR tool:

  1. The documentation for the len_doc parameter is wrong. It says the default is None (all tokens are collected), but in the code the default is 200. To get all tokens I used --len_doc -1
  2. It would be good to be able to specify the encoding of the input file and the output file.
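
To make both points concrete, here is a minimal sketch of what I have in mind. It is not wikIR's actual code: `--input_encoding`, `--output_encoding`, `build_parser`, `truncate_tokens` and `convert` are hypothetical names used only for illustration, and treating a negative `len_doc` as "no limit" is an assumption based on what I observed with `--len_doc -1`.

```python
import argparse


def build_parser():
    # Hypothetical CLI sketch; only --len_doc is taken from wikIR itself.
    parser = argparse.ArgumentParser(description="Sketch of the requested options")
    parser.add_argument(
        "--len_doc",
        type=int,
        default=200,  # the default reported in the code, not None as documented
        help="Maximum number of tokens kept per document; "
             "a negative value keeps all tokens (assumed behaviour).",
    )
    # Assumed option names; wikIR does not currently expose these.
    parser.add_argument("--input_encoding", default="utf-8")
    parser.add_argument("--output_encoding", default="utf-8")
    return parser


def truncate_tokens(tokens, len_doc):
    # None or a negative length means "keep every token" in this sketch.
    if len_doc is None or len_doc < 0:
        return list(tokens)
    return list(tokens[:len_doc])


def convert(in_path, out_path, len_doc, input_encoding, output_encoding):
    # Read with one encoding, write with another, truncating each line's tokens.
    with open(in_path, encoding=input_encoding) as src, \
         open(out_path, "w", encoding=output_encoding) as dst:
        for line in src:
            kept = truncate_tokens(line.split(), len_doc)
            dst.write(" ".join(kept) + "\n")
```

With something like this, the documentation could simply state the real default (200) and the value that disables truncation, and files in other encodings could be processed without converting them first.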

TheAzouz, Jun 24 '21 12:06