recommendation_papers
A paper collection for recommendation systems.
personal blog: http://litowang.top/
- [x] means the paper has been validated; * marks a promising paper
[1] Model Architecture
General CTR/CVR model architectures
- [x] Factorization Machines
- [x] Field-aware Factorization Machines for CTR Prediction
- [x] Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising
- [ ] AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
- [ ] FiBiNET: Combining Feature Importance and Bilinear feature Interaction for Click-Through Rate Prediction
- [ ] CFM: Convolutional Factorization Machines for Context-Aware Recommendation
- [x] Field-aware Neural Factorization Machine for Click-Through Rate Prediction
- [x] Holographic Factorization Machines for Recommendation -> note
- [ ] Deep & Cross Network for Ad Click Predictions
- [ ] Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features
- [ ] xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems
- [ ] Product-based Neural Networks for User Response Prediction over Multi-field Categorical Data
- [ ] Product-based Neural Networks for User Response Prediction
- [ ] DeepGBM: A Deep Learning Framework Distilled by GBDT for Online Prediction Tasks
- [ ] Online Deep Learning: Learning Deep Neural Networks on the Fly
- [ ] InteractionNN: A Neural Network for Learning Hidden Features in Sparse Prediction
- [ ] High-order Factorization Machine Based on Cross Weights Network for Recommendation
- [ ] Adaptive Factorization Network: Learning Adaptive-Order Feature Interactions
- [ ] Quaternion Collaborative Filtering for Recommendation
- [ ] * Fi-GNN: Modeling Feature Interactions via Graph Neural Networks for CTR Prediction
- [ ] Exploring Content-based Video Relevance for Video Click-Through Rate Prediction
- [ ] DGFFM: Generalized Field-aware Factorization Machine based on DenseNet
- [ ] LMLFM: Longitudinal Multi-Level Factorization Machines
- [ ] *Sequence-Aware Factorization Machines for Temporal Predictive Analytics
- [x] FLEN: Leveraging Field for Scalable CTR Prediction
- [ ] Beyond Similarity: Relation Embedding with Dual Attentions for Item-based Recommendation
- [ ] *Learning Feature Interactions with Lorentzian Factorization Machine
- [ ] Learning to Recommend via Meta Parameter Partition
- [ ] Online continual learning with no task boundaries
- [ ] Solving Cold Start Problem in Recommendation with Attribute Graph Neural Networks
- [ ] Mixed Dimension Embedding with Application to Memory-Efficient Recommendation Systems
- [ ] Generalized Embedding Machines for Recommender Systems
- [ ] A Sparse Deep Factorization Machine for Efficient CTR prediction
- [ ] AutoEmb: Automated Embedding Dimensionality Search in Streaming Recommendations
- [ ] ReZero is All You Need: Fast Convergence at Large Depth
- [ ] Dual-attentional Factorization-Machines based Neural Network for User Response Prediction
- [ ] Deep Match to Rank Model for Personalized Click-Through Rate Prediction
- [ ] Sequential Advertising Agent with Interpretable User Hidden Intents
- [ ] A Dual Input-aware Factorization Machine for CTR Prediction
- [ ] Deep Collaborative Filtering Based on Outer Product
- [ ] MsFcNET: Multi-scale Feature-Crossing Attention Network for Multi-field Sparse Data
- [ ] Controllable Multi-Interest Framework for Recommendation
- [ ] MMCTR: A MULTI-TASK MODEL FOR SHORT VIDEO CTR PREDICTION WITH MULTI-MODAL VIDEO CONTENT FEATURES
- [ ] TRUNCATED SVD-BASED FEATURE ENGINEERING FOR SHORT VIDEO UNDERSTANDING AND RECOMMENDATION
- [ ] Recommending What Video to Watch Next: A Multitask Ranking System
- [ ] Model Ensemble for Click Prediction in Bing Search Ads
- [ ] Field-aware Probabilistic Embedding Neural Network for CTR Prediction
- [ ] FedCTR: Federated Native Ad CTR Prediction with Multi-Platform User Behavior Data
- [ ] AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction
- [ ] DeepLight: Deep Lightweight Feature Interactions for Accelerating CTR Predictions in Ad Serving
- [ ] MiNet: Mixed Interest Network for Cross-Domain Click-Through Rate Prediction
- [ ] Memory-efficient Embedding for Recommendations
- [ ] DNN2LR: Interpretation-inspired Feature Crossing for Real-world Tabular Data
- [ ] TFNet: Multi-Semantic Feature Interaction for CTR Prediction
- [ ] DCN-M: Improved Deep & Cross Network for Feature Cross Learning in Web-scale Learning to Rank Systems
- [ ] LT4REC: A Lottery Ticket Hypothesis Based Multi-task Practice for Video Recommendation System
- [ ] DS-FACTO: Doubly Separable Factorization Machines
- [ ] DEEP RELATIONAL FACTORIZATION MACHINES
- [ ] xDeepInt: a hybrid architecture for modeling the vector-wise and bit-wise feature interactions
- [ ] FIELD-EMBEDDED FACTORIZATION MACHINES FOR CLICK-THROUGH RATE PREDICTION
- [ ] Unbiased Ad Click Prediction for Position-aware Advertising Systems
- [ ] Compact and Computationally Efficient Representation of Deep Neural Networks
- [ ] Dot Product Matrix Compression for Machine Learning
Few-shot / multi-scale embeddings
- [ ] RaFM: Rank-Aware Factorization Machines
- [ ] Neural Input Search for Large Scale Recommendation Models
- [ ] A Meta-Learning Perspective on Cold-Start Recommendations for Items
- [ ] Automated Embedding Size Search in Deep Recommender Systems
- [ ] GMCM: Graph-based Micro-behavior Conversion Model for Post-click Conversion Rate Estimation
- [ ] GateNet: Gating-Enhanced Deep Network for Click-Through Rate Prediction
- [ ] Res-embedding for Deep Learning Based Click-Through Rate Prediction Modeling
- [ ] Task-distribution-aware Meta-learning for Cold-start CTR Prediction
pLTV
- [ ] Ad Recommendation Systems for Life-Time Value Optimization
- [ ] Customer Lifetime Value in Video Games Using Deep Learning and Parametric Models
- [ ] Automatic Representation for Lifetime Value Recommender Systems
- [ ] Customer Lifetime Value Prediction in Non-Contractual Freemium Settings: Chasing High-Value Users Using Deep Neural Networks and SMOTE
- [ ] Modeling and Application of Customer Lifetime Value in Online Retail
Multi-task learning
- [ ] Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate
- [ ] Predicting Different Types of Conversions with Multi-Task Learning in Online Advertising
- [ ] Deep Bayesian Multi-Target Learning for Recommender Systems
- [ ] A Causal Perspective to Unbiased Conversion Rate Estimation on Data Missing Not at Random
- [ ] MULTI-LOSS WEIGHTING WITH COEFFICIENT OF VARIATIONS
- [ ] Multi-Task Learning as Multi-Objective Optimization
- [ ] GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
- [ ] Efficient Continuous Pareto Exploration in Multi-Task Learning
- [ ] Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
- [ ] Learning to Compare: Relation Network for Few-Shot Learning
- [ ] An Overview of Multi-Task Learning in Deep Neural Network
- [ ] A Pareto-Efficient Algorithm for Multiple Objective Optimization in E-Commerce Recommendation
- [ ] Learning Task Grouping and Overlap in Multi-Task Learning
- [ ] Accelerating Matrix Factorization by Overparameterization
- [ ] Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations
- [ ] Rotograd: Dynamic Gradient Homogenization for Multi-Task Learning (https://arxiv.org/pdf/2103.02631.pdf)
Multi-task relatedness
- [ ] A Principled Approach for Learning Task Similarity in Multitask Learning
- [ ] Probabilistic Lipschitzness (PL) condition
Delayed feedback
- [ ] Modeling Delayed Feedback in Display Advertising
- [ ] A Nonparametric Delayed Feedback Model for Conversion Rate Prediction
- [ ] A Practical Framework of Conversion Rate Prediction for Online Display Advertising
- [ ] * Addressing Delayed Feedback for Continuous Training with Neural Networks in CTR prediction
- [ ] Unbiased Learning to Rank with Unbiased Propensity Estimation
- [ ] * Dual Learning Algorithm for Delayed Feedback in Display Advertising
- [ ] A Feedback Shift Correction in Predicting Conversion Rates under Delayed Feedback
- [ ] An Attention-based Model for CVR with Delayed Feedback via Post-Click Calibration
[2] Optimization Algorithms
Surveys
- [ ] An overview of gradient descent optimization algorithms
- [ ] A Survey of Optimization Methods from a Machine Learning Perspective
- [ ] Introduction to Online Convex Optimization (book)
First-order optimization
- [ ] Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
- [ ] RMSProp: Overview of mini-batch gradient descent
- [ ] ADADELTA: AN ADAPTIVE LEARNING RATE METHOD
- [ ] ADAM: A METHOD FOR STOCHASTIC OPTIMIZATION
- [ ] DECOUPLED WEIGHT DECAY REGULARIZATION
Second-order optimization
- [ ] Optimizing Neural Networks with Kronecker-factored Approximate Curvature
- [ ] Shampoo: Preconditioned Stochastic Tensor Optimization
- [ ] Second Order Optimization Made Practical
Cumulative regret minimization
- [ ] Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
- [x] Ad Click Prediction: a View from the Trenches
- [ ] *Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty
- [ ] Deep online learning via meta-learning: Continual adaptation for model-based RL
- [ ] Online Learning: A Comprehensive Survey
- [x] Follow-the-Regularized-Leader and Mirror Descent: Equivalence Theorems and L1 Regularization
- [x] Follow the Moving Leader in Deep Learning
- [ ] Online Meta-Learning
Variance reduction
- [x] *Lookahead Optimizer: k steps forward, 1 step back
- [ ] On the Ineffectiveness of Variance Reduced Optimization for Deep Learning
- [x] Accelerating Stochastic Gradient Descent using Predictive Variance Reduction
Gradient delay (stale gradients)
- [ ] Delay-Tolerant Algorithms for Asynchronous Distributed Online Learning
- [ ] *Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD
- [ ] DC-S3GD: Delay-Compensated Stale-Synchronous SGD for Large-Scale Decentralized Neural Network Training
- [ ] The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication
- [ ] Asynchronous Stochastic Gradient Descent with Delay Compensation
- [ ] An Attention-based Model for Conversion Rate Prediction with Delayed Feedback via Post-click Calibration
Others
- [ ] A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets
- [ ] Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter
- [ ] Natasha 2: Faster Non-Convex Optimization Than SGD
- [ ] Training Neural Networks for and by Interpolation (linear interpolation)
- [ ] *Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
- [ ] Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
- [ ] Follow the Leader: Theory and Applications (ppt)
- [ ] Stochastic Gradient Descent as Approximate Bayesian Inference
- [ ] Gradient descent with momentum — to accelerate or to super-accelerate?
- [ ] Visualizing the Loss Landscape of Neural Nets (understanding the loss landscape of neural networks)
- [ ] Adaptive Serverless Learning (decentralized SGD training; may offer some ideas)
- [ ] Error Compensated Distributed SGD Can Be Accelerated
- [ ] DECOUPLED WEIGHT DECAY REGULARIZATION (thoughts on weight decay vs. L2 regularization)
- [ ] * AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients (https://github.com/juntang-zhuang/Adabelief-Optimizer)
- [ ] FIXING WEIGHT DECAY REGULARIZATION IN ADAM
AUC maximization
- [ ] Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate
- [ ] Stochastic Proximal Algorithms for AUC Maximization
- [ ] Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks
- [ ] Online AUC maximization
- [ ] FAST OPTIMIZATION ALGORITHMS FOR AUC MAXIMIZATION
[3] Bayesian Inference (todo)
- [ ] Matchbox: Large Scale Online Bayesian Recommendations
- [ ] Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft’s Bing Search Engine
- [ ] Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks
- [x] PBODL : Parallel Bayesian Online Deep Learning for Click-Through Rate Prediction in Tencent Advertising System
[4] Feature Construction (todo)
- [ ] Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba
- [ ] Real-time Personalization using Embeddings for Search Ranking at Airbnb
[5] Image
- [ ] CNN features off-the-shelf: an astounding baseline for recognition
- [ ] ImageNet Classification with Deep Convolutional Neural Networks
- [ ] Deeply learned face representations are sparse, selective, and robust (PCA dimensionality reduction)
- [ ] Particular object retrieval with integral max-pooling of CNN activations
- [ ] Aggregating Deep Convolutional Features for Image Retrieval (SPoC)
- [ ] Deep Supervised Hashing for Fast Image Retrieval (DSH)
- [ ] Dimensionality reduction by learning an invariant mapping (Contrastive Loss)
- [ ] FaceNet: A Unified Embedding for Face Recognition and Clustering (Triplet Loss)
- [ ] Deep metric learning via lifted structured feature embedding (Lifted Structure Loss)
- [ ] Learning deep embeddings with histogram loss (Histogram Loss)
- [ ] Large-scale image retrieval with attentive deep local features (Spatial-wise Attention)
- [ ] Squeeze-and-excitation networks (SENET)
- [ ] SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning (SA+CA Attention)
[7] RTB (todo)
- [ ] Online Second Price Auction with Semi-bandit Feedback Under the Non-Stationary Setting
- [ ] Smart Targeting: A Relevance-driven and Configurable Targeting Framework for Advertising System
[8] Machine Learning Theory
- [ ] Optimization Problems for Machine Learning: A Survey
- [ ] Correct Normalization Matters: Understanding the Effect of Normalization On Deep Neural Network Models For CTR Prediction
- [ ] Why ResNet Works? Residuals Generalize (analysis of why residual networks work)
- [ ] Visualizing the Loss Landscape of Neural Nets (loss landscape visualization; https://github.com/tomgoldstein/loss-landscape)
- [ ] Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
- [ ] On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
- [ ] Entropy-SGD: Biasing Gradient Descent Into Wide Valleys
- [ ] Exploring Generalization in Deep Learning
- [ ] Interpreting neural network judgments via minimal, stable, and symbolic corrections
- [ ] Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
- [ ] Convergence Analysis of Two-layer Neural Networks with ReLU Activation
[] Cold Start
- [ ] Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation (Tencent; transfer learning for cold start)
[] Federated Learning
- [ ] TOWARDS FEDERATED LEARNING AT SCALE: SYSTEM DESIGN
- [ ] From Federated Learning to Fog Learning: Towards Large-Scale Distributed Machine Learning in Heterogeneous Wireless Networks
- [ ] How To Backdoor Federated Learning
- [ ] FedDistill: Making Bayesian Model Ensemble Applicable to Federated Learning
- [ ] Clustered Federated Learning: Model-Agnostic Distributed Multitask Optimization Under Privacy Constraints
[] Content Recommendation
- [ ] Deep Neural Networks for YouTube Recommendations
- [ ] Latent Cross: Making Use of Context in Recurrent Recommender Systems
[] Retrieval
- [ ] MIND: Multi-Interest Network with Dynamic Routing for Recommendation at Tmall
[] Uncertainty Estimation
- [ ] Simple and scalable predictive uncertainty estimation using deep ensembles
- [ ] Countdown Regression: Sharp and Calibrated Survival Predictions
- [ ] Probabilistic Forecasting with Spline Quantile Function RNNs
[] Ranking Loss
As a replacement for the cross-entropy (CE) loss, or fused with it; a sketch of the fused form follows this list.
- [ ] Improving Recommendation Quality in Google Drive
- [ ] Improving Deep Learning for Airbnb Search
- [ ] Learning to Rank using Gradient Descent
- [ ] BPR: Bayesian Personalized Ranking from Implicit Feedback
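A minimal sketch (not taken from any paper above) of how a pairwise BPR-style ranking term can be fused with the usual pointwise cross-entropy loss; the loss names, the `alpha` trade-off weight, and the toy batch are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(logits, labels, eps=1e-7):
    # Pointwise binary cross-entropy on click / conversion labels.
    p = sigmoid(logits)
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

def bpr_loss(pos_logits, neg_logits, eps=1e-7):
    # Pairwise BPR-style term: push positive scores above sampled negatives.
    return -np.mean(np.log(sigmoid(pos_logits - neg_logits) + eps))

def fused_loss(logits, labels, pos_logits, neg_logits, alpha=0.3):
    # The CE term keeps predictions calibrated; the ranking term improves ordering.
    # `alpha` is an illustrative trade-off weight, not taken from any cited paper.
    return (1 - alpha) * bce_loss(logits, labels) + alpha * bpr_loss(pos_logits, neg_logits)

# Toy usage: scores for a small batch of impressions plus (positive, negative) pairs.
logits = np.array([2.0, -1.0, 0.5, -0.3])
labels = np.array([1.0, 0.0, 1.0, 0.0])
pos, neg = logits[labels == 1], logits[labels == 0]
print(fused_loss(logits, labels, pos, neg))
```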
[] Debiasing
- [ ] Learning to rank with selection bias in personal search