TextClassify
A Chinese text classifier: simple to train, with several models to choose from.
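1. While debugging I found that every value in test_predict is 'G_\xc2\xc3\xd3\xce\xbe\xb0\xb5\xe3', so every sample is assigned to class G. Why does this happen? 2. Running demo_tfidf.py works without problems and prints accuracy is 0.912500, which is exactly the result your README reports for demo_bow.py. Has something gone wrong somewhere? 3. How can demo_tfidf.py also save the classifier model, so that each run only predicts new samples and skips the training time? PS: my environment is Python 2.7. Looking forward to your reply.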
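For question 3, one common approach is to serialize the trained objects with the standard pickle module and reload them before predicting. The sketch below is an assumption about how that could look, not the repository's own code: save_model, load_model, and the file name classifier.pkl are hypothetical, and a plain dict stands in for the trained classifier and its vectorizer.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Write any picklable object (classifier, vectorizer, ...) to disk."""
    with open(path, "wb") as f:
        # Protocol 2 is readable by both Python 2.7 and Python 3.
        pickle.dump(model, f, protocol=2)


def load_model(path):
    """Read back an object written by save_model."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a trained model; a real script would pass the fitted
# classifier together with the fitted feature extractor.
trained = {"labels": ["A", "B"], "weights": [0.25, 0.75]}

model_path = os.path.join(tempfile.mkdtemp(), "classifier.pkl")
save_model(trained, model_path)
restored = load_model(model_path)
```

The feature extractor has to be saved alongside the classifier, since new samples must be transformed with the same vocabulary and weights that were used during training.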
ValueError: Expected 2D array, got 1D array instead: array=[ 0. 0. 0. ..., 0. 0. 0.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature...
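This error is raised when a single feature vector is passed where a 2D array of shape (n_samples, n_features) is expected. A minimal sketch of the usual fix, assuming the 1D array is the feature vector of one document; as_sample_matrix is a hypothetical helper, not part of the repository.

```python
import numpy as np


def as_sample_matrix(features):
    """Return features as a 2D array of shape (n_samples, n_features)."""
    arr = np.asarray(features, dtype=float)
    if arr.ndim == 1:
        # One sample with many features: make it a single row.
        arr = arr.reshape(1, -1)
    return arr


vector = np.zeros(5)              # 1D vector for one document
batch = as_sample_matrix(vector)  # 2D array with one row
```

If the 1D array instead holds one feature for many samples, reshape(-1, 1) is the appropriate call, as the error message says.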
for name in self.filenames: with open(join(name), 'r', 1000, 'utf-8') as f: tf_dict = dict() UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb1 in position 0: invalid start...
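A decode failure on the very first byte often means the file is not UTF-8 at all. Chinese corpora are frequently stored as GBK/GB18030, although that is an assumption about this dataset. The sketch below reads the raw bytes and tries a list of encodings in order; read_text and the candidate list are hypothetical.

```python
import os
import tempfile


def read_text(path, encodings=("utf-8", "gb18030")):
    """Decode a file with the first encoding in the list that succeeds."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, replacing undecodable bytes.
    return raw.decode(encodings[0], errors="replace")


# Demo file written in GBK; its first byte is 0xb1, which is not a
# valid UTF-8 start byte.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "wb") as f:
    f.write(u"北京".encode("gbk"))

text = read_text(sample_path)
```

Trying UTF-8 first is deliberate: GB18030 accepts many byte sequences, so putting it first could silently mis-decode files that really are UTF-8.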
Hello, how was the dictionary.pkl used here generated?