text_classifier_tf2

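Training the TextCNN classifier with word2vec word vectors for feature enhancement raises the error below. The error appeared after I retrained the word2vec vectors once.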

Open uniqueLL opened this issue 2 years ago • 4 comments

Traceback (most recent call last):
  File "F:/Citation/code/ww/text_classifier_tf2-master/main.py", line 55, in <module>
    train.train()
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\train.py", line 127, in train
    train_dataset = self.data_manager.get_dataset(train_df)
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\data.py", line 252, in get_dataset
    X, y = self.prepare_w2v_data(df['sentence'], df['label'])
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\data.py", line 197, in prepare_w2v_data
    tokens = self.tokenizer_for_sentences(sentence)
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\data.py", line 183, in tokenizer_for_sentences
    if token in self.token2id:
AttributeError: 'DataManager' object has no attribute 'token2id'
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.embedding.embeddings
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.conv1.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.conv1.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.conv2.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.conv2.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.conv3.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.conv3.bias
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.dense.kernel
WARNING:tensorflow:Unresolved object in checkpoint: (root).model.dense.bias
WARNING:tensorflow:A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.

uniqueLL avatar Jun 23 '22 01:06 uniqueLL

Delete the locally saved files from the previous training run before retraining.

stanleylsx avatar Jun 23 '22 02:06 stanleylsx
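
The advice to delete previously trained local files before retraining can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `clear_saved_artifacts` and the example path are hypothetical, and the thread does not say which directory the project actually saves its checkpoints to.

```python
import shutil
from pathlib import Path


def clear_saved_artifacts(directory):
    """Delete a directory of saved training artifacts and recreate it empty.

    Hypothetical helper: the real checkpoint location is not stated in the
    thread, so the caller must supply the directory explicitly.
    """
    path = Path(directory)
    if path.exists():
        # Remove stale checkpoints so they are not restored into a new model.
        shutil.rmtree(path)
    path.mkdir(parents=True)
    return path
```

A call such as `clear_saved_artifacts("checkpoints/textcnn")` (an assumed path) before training would leave an empty directory, so no stale checkpoint is left to restore and trigger the "Unresolved object in checkpoint" warnings.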

Traceback (most recent call last):
  File "F:/Citation/code/ww/text_classifier_tf2-master/main.py", line 55, in <module>
    train.train()
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\train.py", line 127, in train
    train_dataset = self.data_manager.get_dataset(train_df)
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\data.py", line 252, in get_dataset
    X, y = self.prepare_w2v_data(df['sentence'], df['label'])
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\data.py", line 197, in prepare_w2v_data
    tokens = self.tokenizer_for_sentences(sentence)
  File "F:\Citation\code\ww\text_classifier_tf2-master\engines\data.py", line 183, in tokenizer_for_sentences
    if token in self.token2id:
AttributeError: 'DataManager' object has no attribute 'token2id'

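I downloaded a fresh, untrained copy of the code and trained word2vec successfully first. Training the classifier (TextCNN) afterwards raised the same error, though without the earlier warnings. I also found that the error occurs even without retraining word2vec, whenever the word2vec word vectors are used directly for feature enhancement.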

uniqueLL avatar Jun 23 '22 04:06 uniqueLL
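
One common way to get this kind of `AttributeError` is an attribute that is assigned only on some code paths of `__init__`. The sketch below reproduces that pattern with a hypothetical stand-in class; it is not the repository's `engines/data.py`, and the `embedding_method` parameter and its values are assumptions made purely for illustration. Whether this is the actual cause in the project is not confirmed anywhere in the thread.

```python
class DataManager:
    """Hypothetical stand-in, NOT the repository's engines/data.py."""

    def __init__(self, embedding_method=""):
        self.embedding_method = embedding_method
        if embedding_method == "":
            # Assumption: the vocabulary mapping is built only on this path.
            self.token2id = {"[PAD]": 0, "[UNK]": 1}

    def tokenizer_for_sentences(self, sentence):
        # Same membership check as the failing line in the traceback.
        return [token for token in sentence.split() if token in self.token2id]


def try_tokenize(manager, sentence):
    """Return the kept tokens, or the error message if token2id is missing."""
    try:
        return manager.tokenizer_for_sentences(sentence)
    except AttributeError as exc:
        return str(exc)


plain = DataManager(embedding_method="")
w2v = DataManager(embedding_method="word2vec")

# Check which instances actually carry the attribute before tokenizing.
print(hasattr(plain, "token2id"), hasattr(w2v, "token2id"))
print(try_tokenize(w2v, "[UNK] hello"))
```

A `hasattr` check like the one above, placed just before the call that fails, is a quick way to confirm whether the attribute was ever set on the word2vec path.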

OK, I'll take a look when I have time.

stanleylsx avatar Jun 23 '22 15:06 stanleylsx

Has this problem been resolved?

Crispinli avatar Jul 17 '22 10:07 Crispinli