zhaoxf4
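> Why change the delimiter at all? Can't you split from the right instead? Two splits from the right would avoid the spaces inside the word, like this:
>
> ```
> word, _, tag = line.rsplit(' ', 2)
> ```

Splitting from the right won't work. It only works if every dictionary entry follows the standard format "word frequency POS-tag". Suppose my custom dictionary contains only words, with no POS tag and no frequency, and those words contain spaces: splitting from the right would cut my words apart.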
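The failure mode described above can be reproduced with plain string splitting. This is a minimal sketch, not jieba's code: `parse_entry_rsplit` is a hypothetical helper and the sample entries are made up for illustration.

```python
def parse_entry_rsplit(line):
    """Split a dictionary entry from the right, assuming 'word freq tag'."""
    word, _, tag = line.rsplit(' ', 2)
    return word, tag


# A complete entry: the two right-most fields are peeled off correctly.
full = parse_entry_rsplit("MI 平板 5 1 none")

# A word-only entry that contains spaces: the word itself is cut apart.
broken = parse_entry_rsplit("MI 平板 5")

# A word-only entry with a single space yields only two parts,
# so the three-way unpacking raises ValueError.
try:
    parse_entry_rsplit("New York")
    two_part_error = None
except ValueError as exc:
    two_part_error = exc

print(full)    # ('MI 平板 5', 'none')
print(broken)  # ('MI', '5')
print(type(two_part_error).__name__)  # ValueError
```

So the right-hand split is only safe when every line is guaranteed to carry both trailing fields.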
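> I just realized that if the custom dictionary contains only words (with spaces and digits in them), it is enough to pad each entry with a frequency and a POS tag. For example, "MI 平板 5" is padded to "MI 平板 5 1 none". jieba's current rule then matches the trailing fields directly, and the word in front is extracted intact (provided the fields are separated by exactly one space). Only re_userdict needs to be changed. If no other problems turn up, this is how I'll do it from now on.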
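The padding trick can be checked against a regular expression shaped like jieba's `re_userdict`. The pattern below is written from memory and should be treated as an assumption rather than a copy of the library's source; `USERDICT_RE` and `parse_entry` are hypothetical names.

```python
import re

# Assumed pattern: a lazy word, then an optional " <digits>" frequency
# and an optional " <lowercase letters>" tag, each led by one space.
USERDICT_RE = re.compile(r'^(.+?)( [0-9]+)?( [a-z]+)?$', re.U)


def parse_entry(line):
    """Return (word, freq, tag); missing fields come back as None."""
    word, freq, tag = USERDICT_RE.match(line.strip()).groups()
    return word, freq.strip() if freq else None, tag.strip() if tag else None


# Without padding, the trailing "5" of the word is taken as the frequency.
unpadded = parse_entry("MI 平板 5")

# With padding, the trailing fields absorb the match and the word stays whole.
padded = parse_entry("MI 平板 5 1 none")

print(unpadded)  # ('MI 平板', '5', None)
print(padded)    # ('MI 平板 5', '1', 'none')
```

Because the word group is lazy and the two trailing groups are anchored to the end of the line, a padded entry leaves nothing for the pattern to steal from the word itself.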
@juntaoy OK, I think I have approximately replicated it on eng_conll2003. The evaluation results for six retrained models (windows_size=511, train+dev used for training, 80000 steps) are as follows: | index\time...
@wangxinyu0922 if you only need eng_conll2003, you can use the script I mentioned in [this issue](https://github.com/juntaoy/biaffine-ner/issues/16), though it's not elegant.
@juntaoy sorry to bother you. I find that only a few parts of ace2004 have end flags like "( End )" and "---". Did you use any other flags to split it?
@juntaoy thank you very much! amazing response speed!
#173 #207 #99