K-NRM
K-NRM: End-to-End Neural Ad-hoc Ranking with Kernel Pooling
Mean pooling is the average over all words, but when σ goes to infinity, the kernel's output is zero because of the nature of exp. What is the relationship between the two?
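For anyone checking that limit numerically, here is a minimal sketch, assuming the paper's RBF kernel K_k(M_i) = Σ_j exp(−(M_ij − μ_k)² / (2σ_k²)); the similarity row below is made up:

```python
import numpy as np

# Hypothetical similarity row: cosine similarities between one query
# word and five document words.
sims = np.array([0.9, 0.1, -0.3, 0.5, 0.0])

def kernel_output(sims, mu, sigma):
    # One K-NRM kernel: K_k(M_i) = sum_j exp(-(M_ij - mu)^2 / (2 sigma^2))
    return np.exp(-(sims - mu) ** 2 / (2 * sigma ** 2)).sum()

for sigma in (0.1, 1.0, 10.0, 1e6):
    print(f"sigma={sigma:g}  K={kernel_output(sims, mu=0.0, sigma=sigma):.4f}")

# As sigma grows, each exponent -(M_ij - mu)^2 / (2 sigma^2) goes to 0,
# so each term goes to exp(0) = 1 and the kernel's output approaches the
# number of document words (5 here): a plain sum over all words, which is
# mean pooling up to a 1/n factor, rather than zero.
```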
A small question about the model
Hi, I recently experimented with user sequence prediction following your approach: the user's previously clicked items serve as q, and whether a subsequent item gets clicked serves as d. The test AUC reaches 0.6, while the training AUC reaches 0.7, so there may be some overfitting. The item vocabulary is about 100 million dimensions and the model has roughly 200 million parameters; it does beat a plain item-CTR baseline. A few questions: 1) How does the overfitting arise? It looks mild. 2) After the embedding layer, kernel pooling acts as the feature extractor, so is the gain in this experiment mainly due to that layer? It seems the model can roughly learn from the user's earlier click sequence what they will want next. 3) On some samples the model performs poorly: the scores are barely distinguishable, and under some sequences different items all get the same relevance score. What could cause this? Looking forward to your reply, and thanks in advance!
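For reference, a minimal numpy sketch of what the kernel-pooling layer extracts in this click-sequence setup (all shapes and embeddings here are made up; the clicked items play the role of query words and the candidate item the role of a one-word document):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 10 previously clicked items as the "query" side,
# one candidate item as the "document" side, 64-dim embeddings.
clicked = rng.normal(size=(10, 64))
candidate = rng.normal(size=(1, 64))

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Translation matrix M of cosine similarities, shape (10, 1).
M = l2_normalize(clicked) @ l2_normalize(candidate).T

# The paper's 10 soft-match kernels (the exact-match kernel with
# mu=1, sigma=1e-3 is omitted for brevity).
mus = np.arange(-0.9, 1.0, 0.2)
sigma = 0.1

phi = []
for mu in mus:
    soft_tf = np.exp(-(M - mu) ** 2 / (2 * sigma ** 2)).sum(axis=1)  # per clicked item
    phi.append(np.log(np.clip(soft_tf, 1e-10, None)).sum())          # sum over the sequence
phi = np.array(phi)  # the match features the final ranking layer sees
print(phi.round(3))
```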
If you have sample files on your machine, could you please upload them here? Thank you.
Hi, I have a dataset of 16,000 docs and some queries. For each query there can be more than one relevant document. Can you tell me how can...
About how to process the documents
Hello, about the training data: if 1,2,3 \t 4,5,6 \t 7,8,9 represents one sample, then 1,2,3 are the ids corresponding to the seed words; are 4,5,6 the ids of three documents? Where do those ids come from, are they assigned randomly? I don't quite understand how the documents are processed here: do they need to be tokenized first? I also don't quite follow the step of mapping words to ids. If you have time, please explain. Thank you!
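For illustration, a minimal sketch of one plausible way such id lines could be produced (whitespace tokenization and a home-grown vocabulary are assumptions here; the actual repo expects its own pre-built word-to-id vocabulary):

```python
# Hypothetical preprocessing: tokenize, build a word -> id vocabulary,
# and emit one "query_ids \t doc1_ids \t doc2_ids" line.
def build_vocab(texts):
    vocab = {}
    for text in texts:
        for word in text.split():  # assumes pre-segmented text;
            vocab.setdefault(word, len(vocab) + 1)  # Chinese needs a word segmenter first
    return vocab

def to_ids(text, vocab):
    return ",".join(str(vocab[w]) for w in text.split() if w in vocab)

docs = ["deep learning for ranking", "kernel pooling for ad hoc search"]
query = "kernel ranking"
vocab = build_vocab(docs + [query])

line = to_ids(query, vocab) + "\t" + "\t".join(to_ids(d, vocab) for d in docs)
print(line)  # "5,4\t1,2,3,4\t5,6,3,7,8,9"
```

So the numbers in each field are word ids from a shared vocabulary, not document numbers.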
Hi zhuyun, please help me check this problem: https://github.com/AdeDZY/K-NRM/blob/fa5d60c38d894c3ef6cc7e580f60938773c3a8b3/knrm/data/generator.py#L93 I think when 'with_idf=True', this line should be:

    if len(cols) < 3:
        idf = np.ones(len(q))
    else:
        idf = np.array([int(t) for t...
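For context, a self-contained sketch of the fallback being proposed (the names cols and q mirror generator.py; the comma delimiter and the completion of the truncated line are assumptions, not the repo's actual code):

```python
import numpy as np

def parse_idf(cols, q):
    # Proposed fallback: when the idf column is absent, default to
    # all-ones weights instead of failing.
    if len(cols) < 3:
        return np.ones(len(q))
    # Assumed layout: the third tab-separated field holds
    # comma-separated idf values, one per query term.
    return np.array([int(t) for t in cols[2].split(",")])
```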
The KNRM embedding in http://boston.lti.cs.cmu.edu/appendices/WSDM2018-ConvKNRM/ is missing, while the Conv-KNRM one is available. How can I download the KNRM pre-trained embeddings?
Hello, sorry to bother you: how is the training data query \t positive_document \t negative_document \t score_difference generated? Could you explain score_difference once more: it is the difference between which scores, and how do I construct such samples? Thank you!
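One plausible way to build such lines from a click log, sketched below (treating score_difference as a click-count gap is an assumption, since that is exactly what the question asks about; real fields would be term-id sequences rather than names):

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical click log: (query, doc, click count).
click_log = [
    ("q1", "docA", 12),
    ("q1", "docB", 3),
    ("q1", "docC", 0),
]

by_query = defaultdict(list)
for query, doc, clicks in click_log:
    by_query[query].append((doc, clicks))

for query, docs in by_query.items():
    for (d1, c1), (d2, c2) in combinations(docs, 2):
        if c1 == c2:
            continue  # equal clicks give no preference signal
        # The more-clicked document goes in the positive slot.
        pos, neg = (d1, d2) if c1 > c2 else (d2, d1)
        print(f"{query}\t{pos}\t{neg}\t{abs(c1 - c2)}")
```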