We should provide some algorithms for Model-Free Reinforcement Learning:
This article starts with PPO.