ID-CNN-CWS
Source code and corpora for the paper "Iterated Dilated Convolutions for Chinese Word Segmentation"
Hi, I have successfully run your wonderful code, but I would like to know more about the character embeddings (5103 vectors). What is the source of these characters, and have you...
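1. Why is there an ordinary 2D convolution before the dilated convolutions? Is it required, and what purpose does it serve? 2. For ordinary convolutions over text, the input is usually [batch, seq_len, char_dim, 1]. Why is the input to this 2D convolution [batch, 1, seq_len, char_dim]?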
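For readers following the shape question, here is a minimal NumPy sketch of the two layouts and of a 1D dilated convolution. It does not reproduce the repository's code: `dilated_conv1d` and all variable names are hypothetical, and the channels-last reading of the shapes is an assumption. Under that assumption, [batch, 1, seq_len, char_dim] treats the embedding dimensions as input channels of a height-1 "image", so a 2D convolution with a height-1 kernel behaves like a 1D convolution along the sequence.

```python
import numpy as np

def dilated_conv1d(x, w, dilation=1):
    """Hypothetical helper: 1D dilated convolution with zero 'same' padding.

    x: [seq_len, in_dim] input sequence
    w: [width, in_dim, out_dim] filter bank, width must be odd
    """
    seq_len, in_dim = x.shape
    width, w_in, out_dim = w.shape
    assert w_in == in_dim and width % 2 == 1
    pad = dilation * (width // 2)
    padded = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((seq_len, out_dim))
    for k in range(width):
        # Tap k reads the input shifted by k * dilation positions.
        start = k * dilation
        out += padded[start:start + seq_len] @ w[k]
    return out

batch, seq_len, char_dim = 2, 7, 4
chars = np.zeros((batch, seq_len, char_dim))

# Layout from the question: embedding dims act as channels of a height-1 image.
as_channels = chars[:, np.newaxis, :, :]   # [batch, 1, seq_len, char_dim]
# Layout the question calls usual: a single channel, embedding dims as width.
as_image = chars[..., np.newaxis]          # [batch, seq_len, char_dim, 1]
```

With a width-3 filter, dilation 1 mixes each position with its immediate neighbours, while dilation 2 mixes it with the positions two steps away, which is how stacked dilated layers widen the context without adding parameters.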
```
  File "train.py", line 41, in main
    logger.info('\n'.join(sorted(["%s : %s" % (str(k), str(v)) for k, v in FLAGS.__dict__['__flags'].items()])))
KeyError: '__flags'
```
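A possible workaround, sketched under the assumption that this error comes from running on a newer TensorFlow 1.x release in which `tf.app.flags` is backed by absl and no longer keeps a private `'__flags'` entry. The helper names `flags_as_dict` and `format_flags` are hypothetical; `flag_values_dict()` is the absl `FlagValues` method for reading all parsed flag values.

```python
def flags_as_dict(flags_obj):
    """Return {flag name: value} for either style of FLAGS object."""
    if hasattr(flags_obj, "flag_values_dict"):
        # absl-backed FLAGS (assumed: newer TensorFlow 1.x releases).
        return flags_obj.flag_values_dict()
    # Older releases stored parsed values in a private '__flags' dict.
    return dict(flags_obj.__dict__["__flags"])

def format_flags(flags_obj):
    """Build the same 'name : value' listing that train.py logs."""
    items = flags_as_dict(flags_obj).items()
    return "\n".join(sorted("%s : %s" % (str(k), str(v)) for k, v in items))
```

In `train.py`, the failing call could then become `logger.info(format_flags(FLAGS))`.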
ub16c9@ub16c9-gpu:/media/ub16c9/fcd84300-9270-4bbd-896a-5e04e79203b7/ub16_prj/ID-CNN-CWS$ ll result/all/001/
total 142260
drwxrwxr-x 2 ub16c9 ub16c9 4096 Jun 9 00:09 ./
drwxrwxr-x 3 ub16c9 ub16c9 4096 Dec 7 2018 ../
-rw-rw-r-- 1 ub16c9 ub16c9 73 Jun 8...
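Our experimental environment: Python 3.5, TensorFlow 1.4.0, 4x Nvidia GeForce 1080Ti. We made minor adjustments to the code so that it would run in our environment. The first experiment tested SIGHAN 2005 PKU. However, with both BI-LSTM-CRF and ID-CNN-CRF, the validation and test scores after the first training epoch are only 0.58-0.62, and the run then exits with the message 'Score too low, break to save time'. By contrast, when we train FoolNLTK (https://github.com/rockyzhengwu/FoolNLTK) on a corpus segmented with the same script, a nearly identical BI-LSTM-CRF reaches a validation F1 of 0.88 after the first epoch. Our initial suspicion is that the evaluation script is at fault; the validation and test scores are the same.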
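To cross-check reported scores independently of the repository's evaluation script, word-level F1 for segmentation can be computed directly from gold and predicted word lists. The following is a minimal sketch of that standard metric, not the project's own scorer; `to_spans` and `segmentation_f1` are hypothetical names, and it assumes both segmentations cover the same character sequence.

```python
def to_spans(words):
    """Map a list of words to the set of (start, end) character spans."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans

def segmentation_f1(gold_words, pred_words):
    """Word-level F1: a predicted word counts only if both boundaries match."""
    gold, pred = to_spans(gold_words), to_spans(pred_words)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)

# Example: one of two predicted words matches one of three gold words.
score = segmentation_f1(["我们", "的", "实验"], ["我们", "的实验"])
```

For a whole corpus, the correct, predicted, and gold counts would normally be summed over all sentences before computing precision and recall, rather than averaging per-sentence F1.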
What is the purpose of the naacl-data.tsv file in tsv_to_tfrecords.py, what data does it store, and could you give an example?
After "text=preprocess(text)", some Chinese characters become garbled, for example "充满" becomes "?满" (where "?" stands for the garbled character). Is something wrong? I think this function is to normalize all num to...
paper
Hello, and thank you for sharing this. I could not find the paper mentioned in this project, "Iterated Dilated Convolutions for Chinese Word Segmentation", anywhere online. Could you share a copy? Thank you!