youngornever
After "text=preprocess(text)", some Chinese characters become garbled. For example, "充满" becomes "?满", where "?" stands for the garbled character. Is something wrong? I think this function is meant to normalize all num to...
Where does new_data come from? Can I use it in my paper? Also, I found that new_data/tst2012 and new_data/tst2013 have the same content.
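One way to confirm that two data files are byte-for-byte identical is to compare their hashes. The following is only a minimal sketch: the helper names `file_digest` and `same_content` are made up for illustration, and it assumes the two paths point to plain files rather than directories.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(path_a, path_b):
    """True if both files contain exactly the same bytes."""
    return file_digest(path_a) == file_digest(path_b)
```

For the report above, this could be called as `same_content("new_data/tst2012", "new_data/tst2013")`, assuming those are the actual file paths.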
There is a UnicodeDecodeError when I run segment.py. Actually, I found that this error is caused by the data, and the code is OK. For example, see line 87. And...
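To check whether the data rather than the code is responsible for a UnicodeDecodeError, the file can be scanned in binary mode and each line decoded separately. This is a rough sketch, not part of segment.py: the function names are hypothetical, and UTF-8 is assumed to be the expected encoding.

```python
def undecodable_lines(lines, encoding="utf-8"):
    """Return (line_number, byte_offset) for each line that fails to decode.

    `lines` is any iterable of bytes objects, e.g. a file opened with "rb".
    """
    bad = []
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte in the line.
            bad.append((lineno, exc.start))
    return bad


def scan_file(path, encoding="utf-8"):
    """Open `path` in binary mode and report lines that are not valid text."""
    with open(path, "rb") as f:
        return undecodable_lines(f, encoding)
```

Calling `scan_file` on the data file lists the line numbers that are not valid in the assumed encoding, which helps tell a stray byte sequence apart from a file saved in a different encoding such as GBK.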