Michael Hu

Results: 5 repositories owned by Michael Hu

alpha_zero

42 Stars, 10 Forks

A PyTorch implementation of DeepMind's AlphaZero agent that plays the board games Go and Gomoku

deep_rl_zoo

102 Stars, 9 Forks

A collection of Deep Reinforcement Learning algorithms implemented with PyTorch to solve Atari games and classic control tasks like CartPole, LunarLander, and MountainCar.

muzero

23 Stars, 3 Forks

A PyTorch implementation of DeepMind's MuZero agent

InstructLLaMA

55 Stars, 13 Forks, 55 Watchers

Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructG...

Llama3-FunctionCalling

46 Stars, 7 Forks, 46 Watchers

Fine-tunes the Llama3 model to support function calling