contextual-bandits topic
LinUCB
Contextual bandit algorithm LinUCB (Linear Upper Confidence Bound), as proposed by Li, Chu, Langford and Schapire
sinkhorn-policy-gradient.pytorch
Code accompanying the paper "Learning Permutations with Sinkhorn Policy Gradient"
FairMachineLearning
Implementation of provably Rawlsian fair ML algorithms for contextual bandits.
blocks
Blocks World -- Simulator, Code, and Models (Misra et al. EMNLP 2017)
MiniVox
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".
banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
multi-armed-bandits-for-recommendation-systems
Implements basic and contextual MAB algorithms for recommendation systems
python-ranker
Contextual Multi-Armed Bandit Platform for Scoring, Ranking & Decisions
Neural-Thompson-Sampling
Study of the paper 'Neural Thompson Sampling' published in October 2020