Michael Hu

Results: 5 repositories owned by Michael Hu

alpha_zero

A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games

michaelnny

alphago

alphago-zero

alphazero

deep_rl_zoo

102 Stars

A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.

michaelnny

actor-critic

agent57

c51

deep-reinforcement-learning

muzero

A PyTorch implementation of DeepMind's MuZero agent

michaelnny

alphazero

model-based-rl

muzero

pytorch

Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructG...

michaelnny

4bit-fine-tune

instructgpt

llam2

ppo