zhujiem issues

Results: 15 issues of zhujiem

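Recall metric alignment

Hello, I see that ENMF is compared with LightGCN, NBPO, and other methods, but I found that the recall computation in the ENMF code is not aligned with those methods. I would like to confirm whether the ENMF results you report use the first or the second definition below. First: ENMF uses len(hit_items) / min(topk, len(ground_truth)). Second: the other methods use len(hit_items) / len(ground_truth), as in NBPO: https://github.com/Wenhui-Yu/NBPO/blob/master/Library.py#L14 and LightGCN: https://github.com/kuandeng/LightGCN/blob/master/evaluator/python/evaluate_foldout.py#L20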
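To make the difference between the two definitions concrete, here is a minimal sketch. The function names `recall_capped` and `recall_standard` and the toy data are hypothetical and are not taken from the ENMF, NBPO, or LightGCN code; only the two formulas come from the issue text.

```python
def _hits(ranked_items, ground_truth, topk):
    """Return the items in the top-k ranking that are in the ground truth."""
    truth = set(ground_truth)
    return [item for item in ranked_items[:topk] if item in truth]


def recall_capped(ranked_items, ground_truth, topk):
    """First definition: len(hit_items) / min(topk, len(ground_truth))."""
    truth = set(ground_truth)
    if not truth:
        return 0.0
    hit_items = _hits(ranked_items, truth, topk)
    return len(hit_items) / min(topk, len(truth))


def recall_standard(ranked_items, ground_truth, topk):
    """Second definition: len(hit_items) / len(ground_truth)."""
    truth = set(ground_truth)
    if not truth:
        return 0.0
    hit_items = _hits(ranked_items, truth, topk)
    return len(hit_items) / len(truth)


# Toy example (made-up item IDs): 2 hits in the top 3, 6 relevant items.
ranked = [10, 11, 12, 13, 14]
truth = [10, 12, 20, 21, 22, 23]
capped = recall_capped(ranked, truth, topk=3)      # 2 / min(3, 6)
standard = recall_standard(ranked, truth, topk=3)  # 2 / 6
```

Whenever a user has more ground-truth items than `topk`, the first definition divides by a smaller number and so yields a higher score, which is why results computed with the two definitions are not directly comparable.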

Is it possible to install graphvite on a CPU machine?

The installation steps show how to install graphvite with CUDA. Can I install it on my laptop for a demo? Thanks!

enhancement

Is the dataset partition the same as the one for the NGCF repo?

I want to play with the code to reproduce the results and make some comparisons. I found that the NGCF repo lacks Yelp2018, which is available here. I wonder whether the train/test...

FuxiCTR v2.0 updates

To update: DESTINE, InterHAt, EDCN, MaskNet, DLRM, DSSM

Bug found in `Ali_Display_Ad_Click` dataset preprocessing, which has been used for DMR model reproduction.

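The dataset directory "Ali_Display_Ad_Click" shows that the preprocessed data is fetched directly from the following path: https://github.com/PaddlePaddle/PaddleRec/blob/master/datasets/Ali_Display_Ad_Click/run.sh#L3

```
wget https://paddlerec.bj.bcebos.com/datasets/dmr/dataset_full.zip
```

However, the ID encoding of this preprocessed data has a problem: after encoding, the test set still contains IDs that never appear in the train set. The likely cause is that the encoding dictionary was not built from the train set alone, so new IDs that occur only in the test set are also in the dictionary. As a result, the number of feature embeddings during training is larger than it should be, and at test time the embeddings of IDs that were never trained appear as random values, which lowers model performance. Taking brand as an example, count the two fields brand_his and brand (these two fields share one encoding). Reproduction code:

```
# For field descriptions, see the "Generate the final training and test datasets" tab at
# https://aistudio.baidu.com/aistudio/projectdetail/1805731
import pandas as pd

train = pd.read_csv("work/train_sorted.csv", dtype=object)
train.fillna("0", inplace=True)
brand = train.iloc[:, 263].astype(int).values
brand_set = set(list(brand))
...
```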
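One way to avoid the problem described above is sketched below, under stated assumptions: the dictionary is built from train IDs only, and any ID first seen in the test set is mapped to a reserved out-of-vocabulary index. The helper names (`build_vocab`, `encode`, `unseen_ids`), the reserved index 0, and the toy brand values are all hypothetical; this is not the PaddleRec preprocessing code.

```python
OOV_INDEX = 0  # assumption: index 0 is reserved for IDs not seen in training


def build_vocab(train_ids):
    """Assign indices 1..N to raw IDs in order of first appearance in train."""
    vocab = {}
    for raw in train_ids:
        if raw not in vocab:
            vocab[raw] = len(vocab) + 1
    return vocab


def encode(ids, vocab):
    """Map raw IDs to indices; IDs absent from the train vocabulary become OOV."""
    return [vocab.get(raw, OOV_INDEX) for raw in ids]


def unseen_ids(train_ids, test_ids):
    """Return raw IDs that occur in test but never in train."""
    return set(test_ids) - set(train_ids)


# Toy data (made up) standing in for a brand column.
train_brand = ["b1", "b2", "b1", "b3"]
test_brand = ["b2", "b4", "b3", "b5"]

vocab = build_vocab(train_brand)
leaked = unseen_ids(train_brand, test_brand)
encoded_test = encode(test_brand, vocab)
```

With this scheme the embedding table needs `len(vocab) + 1` rows, and every test-only ID shares the single OOV row instead of receiving its own untrained embedding. A non-empty `leaked` set on already-encoded data is the symptom the issue reports.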

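For third-party algorithms that are protected by patents, such as Wide&Deep, does implementing and open-sourcing them infringe intellectual property rights?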

Can you please share your scripts for data partitions?

Which version of Movielens was used?

README update

Data splits available?