
Multiple varlen_sparse_feats sharing one sparse_feat causes repeated initialization and confusing layer names

Open · haiming2019 opened this issue 3 years ago • 1 comment

In the inputs/create_embedding_dict method, when several varlen_sparse_feats share one sparse_feat (i.e. they have the same embedding_name), the embedding under that name in the embedding dict gets initialized multiple times. I suggest naming the layer with feat.embedding_name instead of feat.name, which avoids the confusion, and also checking whether the embedding has already been created before initializing it, so the repeated initialization is avoided. Wouldn't that be better? Just for reference.

```python
def create_embedding_dict(sparse_feature_columns, varlen_sparse_feature_columns, seed, l2_reg,
                          prefix='sparse_', seq_mask_zero=True):
    sparse_embedding = {}
    for feat in sparse_feature_columns:
        emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                        embeddings_initializer=feat.embeddings_initializer,
                        embeddings_regularizer=l2(l2_reg),
                        name=prefix + '_emb_' + feat.embedding_name)
        emb.trainable = feat.trainable
        sparse_embedding[feat.embedding_name] = emb

    if varlen_sparse_feature_columns and len(varlen_sparse_feature_columns) > 0:
        for feat in varlen_sparse_feature_columns:
            # if feat.name not in sparse_embedding:
            emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                            embeddings_initializer=feat.embeddings_initializer,
                            embeddings_regularizer=l2(l2_reg),
                            name=prefix + '_seq_emb_' + feat.name,
                            mask_zero=seq_mask_zero)
            emb.trainable = feat.trainable
            sparse_embedding[feat.embedding_name] = emb
    return sparse_embedding
```
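
For reference, a minimal sketch of the suggested change (not the library's actual code; it assumes the standard tf.keras Embedding / l2 imports and the same feature-column attributes): key and name every layer by embedding_name, and skip features whose embedding has already been created.

```python
from tensorflow.keras.layers import Embedding
from tensorflow.keras.regularizers import l2


def create_embedding_dict(sparse_feature_columns, varlen_sparse_feature_columns, seed, l2_reg,
                          prefix='sparse_', seq_mask_zero=True):
    sparse_embedding = {}
    for feat in sparse_feature_columns:
        if feat.embedding_name in sparse_embedding:
            continue  # already initialized, reuse the existing layer
        emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                        embeddings_initializer=feat.embeddings_initializer,
                        embeddings_regularizer=l2(l2_reg),
                        name=prefix + '_emb_' + feat.embedding_name)
        emb.trainable = feat.trainable
        sparse_embedding[feat.embedding_name] = emb

    for feat in varlen_sparse_feature_columns or []:
        if feat.embedding_name in sparse_embedding:
            continue  # shared with a sparse_feat or an earlier varlen feat
        emb = Embedding(feat.vocabulary_size, feat.embedding_dim,
                        embeddings_initializer=feat.embeddings_initializer,
                        embeddings_regularizer=l2(l2_reg),
                        # name by embedding_name (not feat.name) so the layer name
                        # matches the dict key and is not tied to a single feature
                        name=prefix + '_seq_emb_' + feat.embedding_name,
                        mask_zero=seq_mask_zero)
        emb.trainable = feat.trainable
        sparse_embedding[feat.embedding_name] = emb
    return sparse_embedding
```

One caveat of skipping: if a varlen feature shares an embedding_name with a plain sparse feature, the shared layer keeps the first definition and therefore does not get mask_zero set, so features that share a table need consistent settings.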

haiming2019 · Oct 11 '21 06:10

Indeed, when multiple varlen_sparse_feats / sparse_feats share the same embedding_name, the embedding is initialized multiple times, but only the last one's vocabulary_size and embedding_dim take effect.
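
For example (hypothetical feature definitions; assumes DeepCTR's SparseFeat / VarLenSparseFeat from deepctr.feature_column):

```python
from deepctr.feature_column import SparseFeat, VarLenSparseFeat

# Two varlen features deliberately sharing one table via embedding_name='item_id',
# but declared with different (hypothetical) vocabulary sizes and dimensions.
hist_click = VarLenSparseFeat(
    SparseFeat('hist_click_item', vocabulary_size=1000, embedding_dim=8,
               embedding_name='item_id'),
    maxlen=20)
hist_buy = VarLenSparseFeat(
    SparseFeat('hist_buy_item', vocabulary_size=2000, embedding_dim=16,
               embedding_name='item_id'),
    maxlen=20)

# create_embedding_dict builds an Embedding layer for each feature and stores both
# under sparse_embedding['item_id'], so the second (2000 x 16) layer overwrites the
# first: only the last definition's vocabulary_size / embedding_dim take effect.
```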

jackyhawk · Sep 07 '22 02:09