Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj/main.py', wdir='E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj')

  File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj/main.py", line 246, in <module>
    train()

  File "E:/【重点代码】ChineseNER-master-bishe/Gradu_Prj/main.py", line 192, in train
    step, batch_loss = model.run_step(sess, True, batch)

  File "E:\【重点代码】ChineseNER-master-bishe\Gradu_Prj\model.py", line 221, in run_step
    feed_dict)

  File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 895, in run
    run_metadata_ptr)

  File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1097, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)

  File "E:\anaconda INSTALL\envs\tensorflow\lib\site-packages\numpy\core\numeric.py", line 492, in asarray
    return array(a, dtype, copy=False, order=order)

ValueError: setting an array element with a sequence.

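I deleted some of the sentences from the three files example.train, example.test, and example.dev and saved the results as txt documents, but training now fails with the error above.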

Mar 22 '18 14:03 SanSLee

Hello, I get the same error when running, although I did not modify the data files. Have you solved this problem?

Jun 22 '18 06:06 bearchj

Have you solved this problem?

Nov 09 '18 09:11 amoursmile

I am also using a fairly small dataset. Have you solved the problem?

Dec 16 '18 08:12 Jenny181212

I would like to ask about this error. Can it be solved?

Apr 03 '19 06:04 agilelab

To the three of you who reported this above: how did you solve the problem?

Apr 03 '19 06:04 agilelab

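The main cause is that the annotation format of the dataset is wrong. Line endings on Windows are \r\n; replace them with \n. Also, write the separator between the columns as \t rather than a space.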
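
A minimal sketch of that conversion, for illustration only. `normalize_annotations` and `normalize_file` are hypothetical helpers, not part of ChineseNER. The sketch assumes every non-blank line holds whitespace-separated columns (token first, tag last), that tokens contain no whitespace themselves, and that the files are UTF-8 encoded.

```python
def normalize_annotations(text, sep="\t"):
    """Convert CRLF/CR line endings to LF and join each line's columns with `sep`."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    out = []
    for line in lines:
        cols = line.split()
        # Blank lines separate sentences, so they are kept as empty lines.
        out.append(sep.join(cols) if cols else "")
    return "\n".join(out)


def normalize_file(src_path, dst_path, encoding="utf-8"):
    """Rewrite one annotation file (the UTF-8 default is an assumption)."""
    # newline="" stops Python from translating line endings on read or write.
    with open(src_path, "r", encoding=encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=encoding, newline="") as dst:
        dst.write(normalize_annotations(text))
```

Running `normalize_file` once on each of example.train, example.dev, and example.test (writing to new files first) keeps the originals available for comparison.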

Apr 15 '19 00:04 wakanow

Could you explain in more detail? I am a complete beginner. @wakanow

Apr 25 '19 07:04 mz2sj

@agilelab Have you solved it?

Apr 25 '19 07:04 mz2sj

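After tracing this problem carefully, I found that it arises in the prepare_dataset function in loader.py. I am not sure what causes the four output lists to differ in length. It looks as though jieba word segmentation occasionally takes, say, 10 characters and returns segments whose total length exceeds 10. My guess is that one of those 10 characters is a special character, but I could not find it, so I added a check. The modified code is as follows: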

def prepare_dataset(sentences, char_to_id, tag_to_id, lower=False, train=True):
    """
    Prepare the dataset. Return a list of lists of dictionaries containing:
        - word indexes
        - word char indexes
        - tag indexes
    The returned data = [[characters of the sentence,
                          ids of the characters in the training character mapping,
                          list of word segmentation positions,
                          ids of the tags in the training tag mapping], ...]
    """
    # get_seg_features is used exactly as in the original loader.py.
    none_index = tag_to_id["O"]

    def f(x):
        return x.lower() if lower else x

    data = []
    for s in sentences:
        string = [w[0] for w in s]
        chars = [char_to_id[f(w) if f(w) in char_to_id else '<UNK>']
                 for w in string]
        segs = get_seg_features("".join(string))
        if train:
            tags = [tag_to_id[w[-1]] for w in s]
        else:
            tags = [none_index for _ in chars]

        # If the four lists are not aligned, i.e. their lengths differ,
        # discard the sentence. JAMES 2019-04-03
        if len(string) == len(chars) == len(segs) == len(tags):
            data.append([string, chars, segs, tags])
        else:
            st = "".join(string)
            print("Sentence [{0}]: annotation data error".format(st))

    return data
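
To locate the offending sentences rather than only dropping them, a small diagnostic along the following lines may help. It is a sketch, not part of the repository: `find_suspect_tokens` is a hypothetical helper, and it assumes `sentences` has the shape that `prepare_dataset` receives (each sentence is a list of rows with the token first and the tag last). It flags first-column tokens that are not exactly one character long, or that contain whitespace or control/format characters. If `get_seg_features` returns one entry per character of the joined string (an assumption about that function), any multi-character token would make `len(segs)` larger than `len(string)`.

```python
import unicodedata


def find_suspect_tokens(sentences):
    """Return (sentence_index, token_index, token) for tokens that could break alignment."""
    suspects = []
    for i, sentence in enumerate(sentences):
        for j, row in enumerate(sentence):
            token = row[0]
            # Cc = control characters (e.g. "\r"); Cf = format characters
            # (e.g. a byte order mark or a zero-width space).
            has_hidden = any(
                ch.isspace() or unicodedata.category(ch) in ("Cc", "Cf")
                for ch in token
            )
            if len(token) != 1 or has_hidden:
                suspects.append((i, j, token))
    return suspects
```

Calling it on the `sentences` list before it is passed to `prepare_dataset` and printing each hit with `repr` makes invisible characters visible.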

Apr 30 '19 02:04 agilelab


Thanks @agilelab

May 02 '19 07:05 mz2sj


After reducing the dataset, an error is raised: ValueError: setting an array element with a sequence.
