DeepCTR
Sharing one sparse_feat across multiple varlen_sparse_feat columns causes repeated initialization and confusing names
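In inputs/create_embedding_dict, when several varlen_sparse_feat columns share the same sparse_feat (that is, they have the same embedding_name), the embedding stored under that name in the embedding dict is initialized more than once. I suggest naming the embedding layer with feat.embedding_name instead of feat.name, which would be less confusing. In addition, before creating an embedding, the code could check whether one with that name has already been created, so that nothing is initialized twice. Wouldn't that be better? For your reference.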
```python
from tensorflow.keras.layers import Embedding
from tensorflow.keras.regularizers import l2


def create_embedding_dict(sparse_feature_columns, varlen_sparse_feature_columns, seed, l2_reg,
                          prefix='sparse_', seq_mask_zero=True):
    sparse_embedding = {}
    for feat in sparse_feature_columns:
        emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                        embeddings_initializer=feat.embeddings_initializer,
                        embeddings_regularizer=l2(l2_reg),
                        name=prefix + '_emb_' + feat.embedding_name)
        emb.trainable = feat.trainable
        sparse_embedding[feat.embedding_name] = emb

    if varlen_sparse_feature_columns and len(varlen_sparse_feature_columns) > 0:
        for feat in varlen_sparse_feature_columns:
            # if feat.name not in sparse_embedding:
            emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                            embeddings_initializer=feat.embeddings_initializer,
                            embeddings_regularizer=l2(l2_reg),
                            # Highlighted by the reporter: the layer is named after
                            # feat.name, while the dict key below is feat.embedding_name.
                            name=prefix + '_seq_emb_' + feat.name,
                            mask_zero=seq_mask_zero)
            emb.trainable = feat.trainable
            sparse_embedding[feat.embedding_name] = emb
    return sparse_embedding
```
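The two suggestions in the report (name the layer after feat.embedding_name, and skip names that already exist) could look roughly like the sketch below. It is an illustration, not DeepCTR's code: `Feat`, `FakeEmbedding`, and `create_embedding_dict_dedup` are hypothetical names, and `FakeEmbedding` stands in for the Keras `Embedding` layer so the example runs without TensorFlow. The initializer, regularizer, and trainable handling are omitted for brevity.

```python
from collections import namedtuple

# Hypothetical stand-in for a DeepCTR feature column (only the fields used here).
Feat = namedtuple("Feat", ["name", "embedding_name", "vocabulary_size", "embedding_dim"])


class FakeEmbedding:
    """Hypothetical stand-in for the Keras Embedding layer; it only records its arguments."""

    def __init__(self, input_dim, output_dim, name, mask_zero=False):
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.name = name
        self.mask_zero = mask_zero


def create_embedding_dict_dedup(sparse_feature_columns, varlen_sparse_feature_columns,
                                prefix="sparse_", seq_mask_zero=True,
                                embedding_cls=FakeEmbedding):
    sparse_embedding = {}
    for feat in sparse_feature_columns:
        if feat.embedding_name in sparse_embedding:
            continue  # an embedding with this name already exists; do not rebuild it
        sparse_embedding[feat.embedding_name] = embedding_cls(
            feat.vocabulary_size, feat.embedding_dim,
            name=prefix + "_emb_" + feat.embedding_name)
    for feat in varlen_sparse_feature_columns or []:
        if feat.embedding_name in sparse_embedding:
            continue  # shared with an earlier column; reuse the existing entry
        sparse_embedding[feat.embedding_name] = embedding_cls(
            feat.vocabulary_size, feat.embedding_dim,
            name=prefix + "_seq_emb_" + feat.embedding_name,  # keyed and named consistently
            mask_zero=seq_mask_zero)
    return sparse_embedding


sparse_cols = [Feat("item_id", "item", 1000, 8)]
varlen_cols = [
    Feat("hist_click", "item", 1000, 8),    # shares "item" with item_id
    Feat("hist_cate", "cate", 50, 4),
    Feat("hist_cate_buy", "cate", 50, 4),   # shares "cate" with hist_cate
]
embeddings = create_embedding_dict_dedup(sparse_cols, varlen_cols)
print(sorted(embeddings))             # ['cate', 'item']
print(embeddings["cate"].name)        # sparse__seq_emb_cate
print(embeddings["item"].mask_zero)   # False
```

One consequence worth checking before adopting a first-wins rule: when a sequence column shares its embedding with a plain sparse column, the surviving layer is the one built in the first loop, so it is created without mask_zero. The quoted code behaves the other way round, because the sequence column's layer overwrites the earlier entry.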
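Indeed, when multiple varlen_sparse_feat / sparse_feat columns share the same embedding_name, the embedding is initialized multiple times. However, only the last one is kept, so only its vocabulary_size and embedding_dim take effect.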
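The "last one wins" behaviour follows from plain dict assignment: each column constructs a new layer, but all of them are stored under the same embedding_name key. The sketch below reproduces only that bookkeeping, using tuples in place of real layers. `Feat` and `build_like_original` are hypothetical names introduced for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a DeepCTR feature column (only the fields used here).
Feat = namedtuple("Feat", ["name", "embedding_name", "vocabulary_size", "embedding_dim"])


def build_like_original(varlen_feats, prefix="sparse_"):
    created = []            # every "layer" that gets constructed
    sparse_embedding = {}   # what the caller finally receives
    for feat in varlen_feats:
        emb = (prefix + "_seq_emb_" + feat.name, feat.vocabulary_size, feat.embedding_dim)
        created.append(emb)
        sparse_embedding[feat.embedding_name] = emb  # same key, so earlier entries are overwritten
    return created, sparse_embedding


feats = [
    Feat("hist_click", "item", 1000, 8),
    Feat("hist_buy", "item", 500, 4),   # same embedding_name, different sizes
]
created, kept = build_like_original(feats)
print(len(created))   # 2
print(kept["item"])   # ('sparse__seq_emb_hist_buy', 500, 4)
```

Two layers are built, but only the one from hist_buy remains reachable, and its name reflects feat.name rather than the shared embedding_name.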