lambdaji comments

Results: 26 comments by lambdaji

Do the embeddings of continuous and discrete features overlap?

Take 100 samples and try running it.

https://github.com/lambdaji/tf_repos/tree/master/DeepMTL/Feature_pipeline

#step1 log to libsvm sample
sh get_join_sample.sh
#step2 stat sample & feature (can be skipped)
sh get_stat_feat.sh
#step3 remap feat_id (drops low-frequency features; can be skipped)
sh get_remap_fid.sh
#step4 libsvm to tfrecords
python get_tfrecord.py --threads=10 --input_dir=./ --output_dir=./
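The idea behind step 3 (remapping feature ids and dropping low-frequency features) can be sketched in plain Python. This is only an illustration under assumptions, not the contents of `get_remap_fid.sh`: the function names `parse_libsvm_line`, `build_remap`, and `remap_line`, the `min_count` threshold, and the choice to start new ids at 0 are all hypothetical.

```python
from collections import Counter


def parse_libsvm_line(line):
    """Parse 'label id:val id:val ...' into (label, [(id, val), ...])."""
    parts = line.strip().split()
    label = float(parts[0])
    feats = []
    for tok in parts[1:]:
        fid, val = tok.split(":", 1)
        feats.append((int(fid), float(val)))
    return label, feats


def build_remap(lines, min_count=2):
    """Assign a dense new id to every feature id seen at least min_count times."""
    counts = Counter()
    for line in lines:
        _, feats = parse_libsvm_line(line)
        counts.update(fid for fid, _ in feats)
    kept = sorted(fid for fid, c in counts.items() if c >= min_count)
    return {fid: new_id for new_id, fid in enumerate(kept)}


def remap_line(line, remap):
    """Rewrite one sample with the new ids; features missing from remap are dropped."""
    label, feats = parse_libsvm_line(line)
    return label, [(remap[fid], val) for fid, val in feats if fid in remap]


# Hypothetical toy samples: feature 9999 appears only once and is dropped.
sample = [
    "1 10:1 205:0.5 9999:1",
    "0 10:1 300:1",
    "1 205:0.2 300:1",
]
remap = build_remap(sample, min_count=2)
remapped = [remap_line(line, remap) for line in sample]
```

After remapping, the number of distinct ids (`len(remap)`) is the value a model would need as its feature vocabulary size, which is why skipping or changing this step affects the model's size parameters.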

tf distribute

run_dist.sh?

tf distribute

Please post your launch script so I can take a look.

Could DCN distinguish between continuous features and categorical features?

See the feature framework part of the README.

DCN model: reshape, embedding_lookup, and similar problems caused by parameters such as field_size and feature_size

See line 19 of run.sh.

DeepFM performs worse than FNN

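It is hard to pin down the specific cause of this. With identical training samples, at least two other factors are worth discussing:
#1 Model capacity: the model's learning/expressive power, i.e. how much it can learn.
#2 Trainability: this is an optimization problem, i.e. whether a "good" solution can be found. Many model architectures have equivalent expressive power, and the differences in performance come from some structures being easier to optimize than others.
When you say it should be better in theory, you presumably mean that DeepFM has stronger expressive power; but trainability is also a real issue, for example hyperparameter tuning. Of course, a bug in the code cannot be ruled out either.
PS: Some users have reported that letting the FM part output a K-dimensional vector and concatenating it with the Deep part before the output layer gives a further improvement.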
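The variant mentioned in the PS can be sketched as follows. This is a minimal pure-Python illustration under assumptions, not code from the repository: the function names and the `reduce_fm` flag are hypothetical, and each field embedding is assumed to be already scaled by its feature value. The usual FM second-order term sums the interaction over the K embedding dimensions to get a scalar; the variant keeps the K-dimensional vector and concatenates it with the Deep part's output.

```python
def fm_second_order_vector(embeddings):
    """Element-wise 0.5 * ((sum_i v_i)^2 - sum_i v_i^2) over the field embeddings.

    embeddings: list of field vectors, each of length K.
    Returns a K-dim vector; its sum is the scalar FM second-order term.
    """
    k = len(embeddings[0])
    sum_vec = [sum(v[j] for v in embeddings) for j in range(k)]
    sum_sq = [sum(v[j] ** 2 for v in embeddings) for j in range(k)]
    return [0.5 * (s * s - q) for s, q in zip(sum_vec, sum_sq)]


def output_layer_input(fm_vec, deep_out, reduce_fm=False):
    """Build the input to the output layer.

    reduce_fm=True  -> FM part collapsed to a scalar (the usual formulation).
    reduce_fm=False -> FM part kept as a K-dim vector (the variant in the PS).
    """
    fm_part = [sum(fm_vec)] if reduce_fm else list(fm_vec)
    return fm_part + list(deep_out)


# Hypothetical toy values: 3 fields, K = 2, and a 3-unit Deep output.
emb = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
deep_out = [0.1, 0.2, 0.3]

fm_vec = fm_second_order_vector(emb)                         # [23.0, 44.0]
concat = output_layer_input(fm_vec, deep_out)                # length K + 3
scalar = output_layer_input(fm_vec, deep_out, reduce_fm=True)  # length 1 + 3
```

Keeping the vector lets the output layer learn a separate weight for each embedding dimension instead of weighting them all equally, which may be one reason for the reported gain.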

RuntimeError: There was no new checkpoint after the training. Eval status: no new checkpoint

DeepFM GPU utilization problem

How to obtain DIN data from the raw data