Michael Hu

5 repositories owned by Michael Hu

alpha_zero

42 stars, 10 forks

A PyTorch implementation of DeepMind's AlphaZero agent that plays the board games Go and Gomoku.

deep_rl_zoo

102 stars, 9 forks

A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.

muzero

23 stars, 3 forks

A PyTorch implementation of DeepMind's MuZero agent

InstructLLaMA

41 stars, 9 forks

Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructG...

Llama3-FunctionCalling

22 stars, 1 fork

Fine-tunes the Llama3 model to support function calling.