THULAC-Python
TypeError: reduce() of empty sequence with no initial value
  File "/Users/xiaotaop/Documents/gitroom/recsys/similar_album/similar/similar/utils.py", line 249, in album2json_handle
    data = json.dumps(album.to_ml_json())
  File "/Users/xiaotaop/Documents/gitroom/recsys/similar_album/similar/similar/models.py", line 114, in to_ml_json
    "content": self.segment(self.album_desc, engine="thu"),
  File "/Users/xiaotaop/Documents/gitroom/recsys/similar_album/similar/similar/models.py", line 165, in segment
    words = splitor.run(text)
  File "/Users/xiaotaop/Documents/gitroom/recsys/similar_album/similar/similar/utils.py", line 167, in run
    sentence_splited = unicode(self.splitor.cut(sentence.encode("utf-8"), True), "utf-8")
  File "/Users/xiaotaop/Documents/pyenvs/dj18/lib/python2.7/site-packages/thulac/__init__.py", line 97, in cut
    return self.__cutWithOutMethod(oiraw, self.__cutline, text = text)
  File "/Users/xiaotaop/Documents/pyenvs/dj18/lib/python2.7/site-packages/thulac/__init__.py", line 80, in __cutWithOutMethod
    txt += reduce(lambda x, y: x + ' ' + y, cut_method(line)) + '\n'
TypeError: reduce() of empty sequence with no initial value
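For anyone reading the traceback: the failing line calls reduce() without an initial value, so it raises as soon as cut_method(line) returns an empty sequence. Why the result is empty for this input is not shown in the traceback. The sketch below reproduces the failure in isolation and shows one way it could be avoided; join_words_reduce and join_words_safe are hypothetical names used only for illustration, not part of thulac.

```python
# reduce is a builtin on Python 2; the import lets the sketch run on Python 3 too.
from functools import reduce


def join_words_reduce(words):
    # Same shape as the failing expression: no initial value is passed,
    # so an empty sequence raises TypeError.
    return reduce(lambda x, y: x + ' ' + y, words)


def join_words_safe(words):
    # str.join returns an empty string for an empty sequence instead of raising.
    return ' '.join(words)


try:
    join_words_reduce([])
except TypeError as exc:
    error_message = str(exc)

joined = join_words_safe([])
```

Both helpers give the same result for non-empty input; only the empty case differs.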
I checked, and the input text is as follows:
游戏名:恶灵附身 千万不要跳着看噢~精彩就在一瞬间~ 如果你笑了,记得帮我点个赞并且分享给你的小伙伴们,让我们一起把欢乐传递下去!不要忘了订阅更多欢乐等着你!
The code is as follows:
import thulac


class THUSplit(object):
    def __init__(self):
        self.splitor = thulac.thulac()

    def run(self, sentence):
        """
        :param sentence: unicode string
        :return: array of {"word": string, "flag": string}
        """
        data = []
        # Python 2: encode to UTF-8 for thulac, then decode the tagged output back to unicode
        sentence_splited = unicode(self.splitor.cut(sentence.encode("utf-8"), True), "utf-8")
        entries = sentence_splited.split(" ")
        for entry in entries:
            # each entry has the form word_flag
            tmp = entry.split("_")
            word = tmp[0]
            flag = tmp[1]
            data.append({"word": word, "flag": flag})
        return data
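Separately from the reduce() error, the parsing loop in run() would raise IndexError on any entry that has no "_" (for example an empty string left over from a trailing space or newline). A minimal sketch of a more tolerant parser follows, assuming the tagged output is whitespace-separated word_flag tokens as the code above expects. parse_tagged is a hypothetical helper, and the sample tags in the usage line are made up for illustration rather than taken from real thulac output.

```python
def parse_tagged(tagged_text):
    """Turn 'word_flag word_flag ...' into a list of {"word", "flag"} dicts."""
    entries = []
    # split() with no argument drops empty tokens and handles newlines.
    for token in tagged_text.split():
        # Split on the last "_" so words that contain "_" keep it.
        word, sep, flag = token.rpartition("_")
        if not sep:
            # No "_" in the token: keep the token as the word, with an empty flag.
            word, flag = token, ""
        entries.append({"word": word, "flag": flag})
    return entries


# Usage with an illustrative (not real) tagged string.
sample = parse_tagged("hello_x world_n\n")
```

Inside run(), the loop could be replaced by a single call to this helper on the decoded string.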
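Thank you for supporting thulac. This is a bug in the pip release, and we will publish an update as soon as possible. In the meantime, you can download and use the version from our GitHub page.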