AliBug
**Describe the bug** [ZH Number] The values of "一千瓦"、"两千赫"、"三千焦", etc. are null, while "十千瓦"、"两百千赫"、"三万千焦" are recognized. **To Reproduce** `NumberRecognizer.RecognizeNumber(query, Culture.Chinese)` **Expected behavior** _一千瓦_ ``` { "Text": "一", "Start": 0, "End": 0,...
**Describe the bug** "三十几万元"、"二百余万元"、"一千多万元" will be misinterpreted while "三十几元"、"二百余元"、"一千多元" can't be recognized. **To Reproduce** 三十几万元 ≠ 三十万元 ``` { "Text": "三十几万元", "Start": 0, "End": 4, "TypeName": "currency", "Resolution": { "isoCurrency":...
**Describe the bug** "十来"、"二十来"、"十来万"、"二百三十来万"、"肆拾來億"、"500來億", etc. mean "more than X" in Chinese. **Expected input/output** ``` { "Text": "十来", "Start": 0, "End": 1, "TypeName": "numberrange", "Resolution": { "value": "(10,)" } }...
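To make the expected "more than X" reading concrete, here is a minimal Python sketch. It is not the Recognizers-Text implementation: the helper names `parse_small`, `lai_lower_bound` and `lai_resolution` are hypothetical, and treating the unit after "来" (for example "万") as a multiplier on X is an assumption. Only the `"(10,)"` output for "十来" comes from the issue.

```python
# Hypothetical sketch: none of these helpers exist in Recognizers-Text.
DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9,
          "壹": 1, "贰": 2, "叁": 3, "肆": 4, "伍": 5,
          "陆": 6, "柒": 7, "捌": 8, "玖": 9}
SMALL_UNITS = {"十": 10, "拾": 10, "百": 100, "佰": 100, "千": 1000, "仟": 1000}
BIG_UNITS = {"万": 10**4, "萬": 10**4, "亿": 10**8, "億": 10**8}


def parse_small(text):
    """Parse ASCII digits or a simple Chinese numeral such as 二百三十."""
    if text.isascii() and text.isdigit():
        return int(text)
    total, current = 0, 0
    for ch in text:
        if ch in DIGITS:
            current = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit such as 十 counts as 1 * 10.
            total += (current or 1) * SMALL_UNITS[ch]
            current = 0
        else:
            raise ValueError(f"unsupported character: {ch}")
    return total + current


def lai_lower_bound(text):
    """Return the lower bound X of an 'X来[unit]' expression, or None."""
    head, marker, tail = "", "", ""
    for candidate in ("来", "來"):
        if candidate in text:
            head, marker, tail = text.partition(candidate)
            break
    if not marker or not head:
        return None
    multiplier = 1
    for ch in tail:
        if ch not in BIG_UNITS:
            return None
        multiplier *= BIG_UNITS[ch]  # assumption: trailing unit scales X
    try:
        return parse_small(head) * multiplier
    except ValueError:
        return None


def lai_resolution(text):
    """Format the bound in the open-ended style shown in the issue."""
    low = lai_lower_bound(text)
    return None if low is None else f"({low},)"
```

Under these assumptions, `lai_resolution("十来")` yields `"(10,)"`, matching the expected output quoted in the issue. Inputs without "来" or "來" return `None`.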
**Describe the bug** Expressions starting with "好几" ("好几十"、"好几百"、"好几千"、"好几万"、"好几百万"、"好几千万"、"好幾億", etc.) should be recognized as numberrange. Expressions starting with "数" ("数十"、"数百"、"数千"、"数万"、"数十万"、"数百万"、"数千万"、"数万亿", etc.) should also be recognized as numberrange. In Chinese they mean "2...
**Describe the bug** Expressions starting with "上" ("上百"、"上千"、"上万"、"上百万"、"上億"、"上萬億", etc.) should be recognized as numberrange. In Chinese they mean "more than the base number and less than 10 × the base number". **Expected behavior**...
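The rule "more than the base number and less than 10 × the base number" can be sketched in a few lines of Python. This is an illustration only, not the recognizer's code: `shang_range` and `UNIT_VALUES` are hypothetical names, and multiplying stacked units (for example 百 × 万) to get the base is an assumption that holds for the examples listed in the issue.

```python
# Hypothetical sketch: not part of Recognizers-Text.
UNIT_VALUES = {
    "十": 10, "百": 100, "千": 1000,
    "万": 10**4, "萬": 10**4,
    "亿": 10**8, "億": 10**8,
}


def shang_range(text):
    """Return exclusive (low, high) bounds for '上' + units, or None."""
    if len(text) < 2 or not text.startswith("上"):
        return None
    base = 1
    for ch in text[1:]:
        if ch not in UNIT_VALUES:
            return None
        base *= UNIT_VALUES[ch]  # e.g. 百万 -> 100 * 10,000
    return (base, base * 10)
```

With this sketch, `shang_range("上百")` returns `(100, 1000)` and `shang_range("上百万")` returns `(1000000, 10000000)`. How the bounds are serialized in the recognizer's `Resolution` field is left open here, because the expected output in the issue is truncated.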
**Describe the bug** AMR model: MRP2020_AMR_ZHO_MENGZI_BASE. Example 1: for the input ["我", "不", "吃饭"], the result has "anchors": [] for 吃饭 ``` { "id": "0", "input": "我 不 吃饭", "nodes": [ { "id": 0, "label": "我",...
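**Describe the bug** Example 1: 我给了他15万元。 The AMR parse result is shown in the figure below:  “**15万**” is not parsed correctly --- Example 2: 我给了他十五点八万元。  “**十五点八万**” is not parsed correctly --- Example 3: 我给了他十元三角八分钱。  “**十元三角八分**” is not parsed correctly **Code to reproduce the issue**...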
<https://www.paddlepaddle.org.cn/hubdetail?name=lac&en_category=LexicalAnalysis>
--- **The following are segmented incorrectly** 他在比赛中共输了两次。 ——> 中共 中国在奥运会中共获得多少金牌? ——> 中共 你看过多少本书? ——> 本书 每个词限用一次。 ——> 词限 ~~战败的一方向对方请求停战。 ——> 方向~~ 我怎么吃得下饭? ——> 下饭 战士们骑着马来了。 ——> 马来 从一数到十 ——> 一数 ~~这有两大车货 ——> 两大~~ ~~性交通常指男女之间发生性行为...
丐帮帮主张三丰9月9日主持君山大会。  This segmentation result is exactly the same as hanlp1.7.x, sigh 😖  Compare with the one from hanlp2.x 