shuDaoNan9

Results: 11 issues from shuDaoNan9

If I do not set numBatches, I get a `NegativeArraySizeException` or OOM during training on a big dataset (about 26,320,507 rows), and CPU utilization stays below 90%. **But if...

bug
area/lightgbm
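The numBatches option referred to above splits the training data into chunks that are trained on one at a time, which bounds peak memory. As a minimal plain-Python sketch of that chunking idea (the helper below is my own illustration, not SynapseML's implementation):

```python
def split_into_batches(rows, num_batches):
    """Split rows into num_batches roughly equal, contiguous chunks."""
    batch_size = -(-len(rows) // num_batches)  # ceiling division
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]

rows = list(range(10))
batches = split_into_batches(rows, 3)
print([len(b) for b in batches])  # [4, 4, 2]
```

Each chunk is small enough to fit in memory, at the cost of more passes over the data; that trade-off is why training with batching can leave the CPU partly idle.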

First, I tried **2 Spark slaves**; it took about 11 minutes to train my model. Submit info: spark-submit --master yarn **--num-executors 2** --executor-memory 19G --executor-cores 16 --conf spark.dynamicAllocation.enabled=false --jars s3://EMR/jars/synapseml-vw_2.12-0.9.4.jar,s3://EMR/jars/synapseml_2.12-0.9.4.jar,s3://EMR/jars/client-sdk-1.14.0.jar...

area/lightgbm

Loading the entire dataset onto the GPU for training is clearly unrealistic for big data, so in TF 2.4.0 I tried preprocessing the data and feeding it to the DeepMatch model in batches, and hit the following problems: **1. After batching the data with tf.data.Dataset.from_tensor_slices, training no longer works once tf.compat.v1.disable_eager_execution() is called; the error is:** D:\Anaconda3\envs\TF2GPU\lib\site-packages\tensorflow\python\keras\backend.py:434: UserWarning: `tf.keras.backend.set_learning_phase` is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `__call__` method...

question
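As a sketch of the manual alternative to tf.data.Dataset.from_tensor_slices being attempted above — feeding preprocessed data to the model in fixed-size batches — a plain-Python generator can produce the slices. The model and training calls are omitted because the reported error is specific to graph mode; this only illustrates the batching itself:

```python
def batch_generator(features, labels, batch_size):
    """Yield (features, labels) slices of at most batch_size rows each."""
    for start in range(0, len(features), batch_size):
        end = start + batch_size
        yield features[start:end], labels[start:end]

X = [[i, i + 1] for i in range(7)]  # 7 toy rows of 2 features
y = [i % 2 for i in range(7)]
shapes = [(len(xb), len(yb)) for xb, yb in batch_generator(X, y, 3)]
print(shapes)  # [(3, 3), (3, 3), (1, 1)]
```

A generator like this can be passed to Keras `fit` in place of a full in-memory array, so only one batch lives in memory at a time.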

Only an h5-format model can be exported; exporting to pb format fails. The export code is: tf.saved_model.save(model, outputDir + 'YouTubeNet_model2') or: from tensorflow.python.keras.models import Model, load_model, save_model; save_model(model, 'YouTubeNet_model.pb', save_format='tf'). The error is: Traceback (most recent call last): File "F:/python/DeepMatch-master/examples/**run_youtubednn**.py", line 70, in tf.saved_model.save(model, outputDir + 'YouTubeNet_model2')...

I just want to add some photos to the training set

I have a few questions about running the DCN model on this dataset: http://labs.criteo.com/2014/02/download-kaggle-display-advertising-challenge-dataset/ (Kaggle Display Advertising Challenge Dataset). The data format description says: "The columns are tab separated with the following schema: ... ..." It does not distinguish user ids from item ids, so how can recommendations be made per user? Also, when get_criteo_feature.py processes the data, many categorical values are simply truncated away, so how can users be told apart? parser.add_argument( "--cutoff", type=int, default=200, help="cutoff long-tailed categorical values" )...
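A plain-Python sketch of what a `--cutoff` option like the one in get_criteo_feature.py typically does: categorical values seen fewer than `cutoff` times are dropped from the feature map, which is why rare ids (including anything resembling a per-user id) disappear. The helper name below is my own illustration, not the script's actual code:

```python
from collections import Counter

def build_feature_map(values, cutoff):
    """Keep only categorical values occurring at least `cutoff` times,
    assigning each surviving value a dense integer id."""
    counts = Counter(values)
    kept = sorted(v for v, c in counts.items() if c >= cutoff)
    return {v: i for i, v in enumerate(kept)}

values = ["a"] * 5 + ["b"] * 2 + ["c"]  # "c" is a long-tailed value
fmap = build_feature_map(values, cutoff=2)
print(fmap)  # {'a': 0, 'b': 1} -- 'c' was cut off
```

With the default cutoff of 200, any value rarer than 200 occurrences is discarded, so near-unique ids cannot survive this preprocessing by design.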

1. Can the DCN code be run on the Criteo dataset as-is, or do certain arguments have to be passed? 2. The code defaults field_size to 0, so it must be passed at runtime, e.g. field_size=2496? Without it, "feat_vals = tf.reshape(feat_vals, shape=[-1, field_size, 1])" reshapes to (-1, 0) and fails with: Reshape cannot infer unless all specified input sizes are non-zero. But even when field_size is passed, feature_size also defaults to 0 in the code and is never recomputed, so Feat_Emb ends up with shape (0, 32), which in turn breaks embeddings = tf.nn.embedding_lookup(Feat_Emb, feat_ids) # None *...
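The reshape failure described in point 2 can be reproduced outside TF: any reshape that combines an inferred dimension (-1) with a zero-sized dimension is ambiguous, because the inferred size would require dividing by zero. A NumPy illustration of the same rule:

```python
import numpy as np

data = np.arange(6, dtype=np.float32)

# field_size > 0: the -1 dimension can be inferred (6 / (3*1) = 2 rows).
ok = data.reshape(-1, 3, 1)
print(ok.shape)  # (2, 3, 1)

# field_size == 0: -1 cannot be inferred, mirroring the TF error
# "Reshape cannot infer unless all specified input sizes are non-zero".
try:
    data.reshape(-1, 0, 1)
    reshape_failed = False
except ValueError as e:
    reshape_failed = True
    print("reshape failed:", e)
```

So both field_size and feature_size must be set to the real values from the preprocessed data before the graph is built.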

I suggest adding a README covering points like these: get_criteo_feature.py assumes every feature in the test set has already appeared in the training set, otherwise feature_map is incomplete; the test data cannot be too small, or the cutoff removes everything; the test-set column index is off by one from the training set: val = dists.gen(i, features[continous_features[i] - 1]) — I changed it to match the training-set indexing, since my test and training files share the same format; perhaps the author's two files really are offset by one column? I also added label = features[0] for the test set, which makes comparing test results much easier; reusing the last label from the training set felt odd. Also, a continuous numeric feature must not have only a single unique value, or normalization breaks; ...........

I see that YouTubeNet's original recall stage uses softmax classification. Is the sigmoid loss below used because the task should be understood as a multi-class problem in which several "next" videos to play can be chosen "at the same time"? https://github.com/yangxudong/deeplearning/blob/master/youtube_match_model/youtube_match_model.py loss = tf.nn.sigmoid_cross_entropy_with_logits( labels=labels_one_hot, logits=logits) Thanks!
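The distinction behind the question can be shown numerically: softmax cross entropy normalizes across all classes, so it models exactly one "next video", while per-class sigmoid cross entropy treats every class independently, so several can be positive at once. A small pure-Python comparison, purely illustrative:

```python
import math

def softmax_cross_entropy(logits, one_hot):
    """Multi-class loss: probabilities compete and sum to 1."""
    z = max(logits)
    exps = [math.exp(l - z) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sigmoid_cross_entropy(logits, labels):
    """Multi-label loss: sum of independent binary cross entropies."""
    loss = 0.0
    for l, t in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-l))
        loss += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return loss

logits = [2.0, 0.5, -1.0]
print(softmax_cross_entropy(logits, [1, 0, 0]))   # single positive only
print(sigmoid_cross_entropy(logits, [1, 1, 0]))   # multiple positives allowed
```

With a one-hot target both losses are usable, but only the sigmoid form is well defined when more than one label is 1 — which is presumably why the linked code uses it.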

I found FMClassifier in Spark 3, but I don't know what format my featuresCol should be in. I used my GBDT features, but the resulting AUC is bad. Some of my code:...
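For reference, Spark ML classifiers, FMClassifier included, take featuresCol as a standard Vector column (dense or sparse), like any other estimator in the pipeline API. The factorization-machine score itself is the usual second-order formula; a pure-Python sketch of it with toy weights (my own illustration, not Spark's implementation):

```python
def fm_score(x, w0, w, V):
    """Second-order factorization machine:
    w0 + sum_i w_i x_i + sum_{i<j} <V_i, V_j> x_i x_j,
    using the O(n*k) reformulation 0.5 * sum_f [(sum_i v_if x_i)^2
    - sum_i (v_if x_i)^2]."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s2 = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s2)
    return linear + pairwise

x = [1.0, 2.0]            # one row's feature vector (what featuresCol holds)
w0, w = 0.1, [0.2, -0.3]  # toy global bias and linear weights
V = [[0.5], [0.4]]        # toy factor vectors, factor size k = 1
print(round(fm_score(x, w0, w, V), 6))  # 0.1
```

Because FM learns pairwise interactions from factor vectors, raw GBDT leaf indices fed in as plain numeric values may score poorly; one-hot-encoding the leaf assignments before assembling the Vector is the common treatment.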