cocoNLP

Bug in extract_time

Open Issacwww opened this issue 6 years ago • 0 comments

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-182311dacec4> in <module>()
----> 1 model.predict(strs[1])

~/jupyter_projects/NERPredicter.py in predict(self, sentence, attrs)
     70         self.dic['ID'] = self.cocoExtractor.extract_ids(sentence)
     71         self.dic['PHONE_INFO'] = [self.cocoExtractor.extract_cellphone_location(cell,'CHN') for cell in self.dic['PHONE']]
---> 72         time = json.loads(self.cocoExtractor.extract_time(sentence))
     73         if time["type"] == "timestamp":
     74             self.dic['DATETIME'].append(time["timestamp"])

~/anaconda3/lib/python3.7/site-packages/cocoNLP/extractor.py in extract_time(self, text)
    238         tmp_text = self.replace_ids(tmp_text)
    239         tn = TimeNormalizer()
--> 240         res = tn.parse(target=tmp_text)  # target is the sentence to analyze; timeBase is the reference time, defaulting to the current time
    241         return res
    242 

~/anaconda3/lib/python3.7/site-packages/cocoNLP/config/basic/time_nlp/TimeNormalizer.py in parse(self, target, timeBase)
     87         self.invalidSpan = False
     88         self.timeSpan = ''
---> 89         self.target = self._filter(target)
     90         self.timeBase = arrow.get(timeBase).format('YYYY-M-D-H-m-s')
     91         self.nowTime = timeBase

~/anaconda3/lib/python3.7/site-packages/cocoNLP/config/basic/time_nlp/TimeNormalizer.py in _filter(self, input_query)
     25     def _filter(self, input_query):
     26         # Convert phrases like 下个周末 ("next weekend") here by removing the 个
---> 27         input_query = StringPreHandler.numberTranslator(input_query)
     28 
     29         rule = u"[0-9]月[0-9]"

~/anaconda3/lib/python3.7/site-packages/cocoNLP/config/basic/time_nlp/StringPreHandler.py in numberTranslator(cls, target)
    134             s = filter(None, s)
    135             num = 0
--> 136             if len(s) == 1:
    137                 tenthousand = int(s[0])
    138                 num += tenthousand * 10000

TypeError: object of type 'filter' has no len()

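Test input sentence: 1月24日,新华社对外发布了中央对雄安新区的指导意见,洋洋洒洒2万字,17次提到北京,4次提到天津,信息量很大,其实也回答了人们关心的很多问题。 (Roughly: "On January 24, Xinhua published the central government's guidelines for the Xiong'an New Area, a sprawling 20,000 characters that mention Beijing 17 times and Tianjin 4 times; it carries a lot of information and in fact answers many of the questions people care about.")

After changing 2万 ("20,000") to 数万 ("tens of thousands"), the program runs normally. I looked at the source code and think the filter call can be removed.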
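For reference, the error in the traceback is the usual Python 3 behavior: `filter()` returns a lazy iterator, not a list, so `len(s)` and `s[0]` no longer work on its result. Below is a minimal sketch of one possible fix, wrapping the call in `list()`. The helper name `non_empty_parts` and the split on 万 are assumptions made for illustration only; they are not taken from the cocoNLP source, of which only the three lines shown in the traceback are known.

```python
def non_empty_parts(parts):
    """Hypothetical helper: drop empty strings and return a real list,
    so that len() and indexing work on Python 3."""
    return list(filter(None, parts))


# On Python 3, filter() returns an iterator, so len() raises TypeError.
lazy = filter(None, ["2", ""])
try:
    len(lazy)
    raised = False
except TypeError:
    raised = True

# Assumption: the number string is split on 万 before this point.
s = non_empty_parts("2万".split("万"))  # ["2", ""] becomes ["2"]

# Same shape as lines 135-138 in the traceback.
num = 0
if len(s) == 1:
    tenthousand = int(s[0])
    num += tenthousand * 10000
```

Under these assumptions, wrapping in `list()` keeps the empty-string filtering the original line seems to intend. Dropping `filter` entirely would also avoid the `TypeError`, but if the split leaves an empty trailing element, `len(s) == 1` would then no longer match, so it may be worth checking the surrounding code before choosing between the two.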

Issacwww · Jun 12 '19 08:06