StarCraft II Imitation Learning
This repository provides code to train neural-network-based StarCraft II agents from human demonstrations. It emerged as a side product of my Master's thesis, where I investigated representation learning from demonstrations for task transfer in reinforcement learning.
The main features are:
- Behaviour cloning from StarCraft II replays
- Modular and extensible agents, inspired by the architecture of AlphaStar but using the feature-layer interface instead of the raw game interface
- Hierarchical configurations using Gin Config that provide a great degree of flexibility and configurability
- Pre-processing of large-scale replay datasets
- Multi-GPU training
- Playing against trained agents (Windows / Mac)
- Pretrained agents for the Terran vs Terran match-up
Table of Contents
Installation
Train your own agent
Play against trained agents
Download pre-trained agents
Installation
Requirements
- Python >= 3.6
- StarCraft II >= 3.16.1 (4.7.1 strongly recommended)
To install StarCraft II, you can follow the instructions at https://github.com/deepmind/pysc2#get-starcraft-ii.
On Linux: From the available versions, version 4.7.1 is strongly recommended. Other versions are not tested and might run into compatibility issues with this code or the PySC2 library. Also, replays are tied to the StarCraft II version in which they were recorded, and of all the binaries available, version 4.7.1 has the largest number of replays currently available through the Blizzard Game Data APIs.
On Windows/MacOS: The binaries for a certain game version will be downloaded automatically when opening a replay of that version via the game client.
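On Linux, a minimal install sketch could look like the following; the download URL and the EULA password are the ones published on the Blizzard / s2client-proto downloads page and may change over time:

```bash
# Download the Linux binary for StarCraft II 4.7.1 (see the s2client-proto "Linux Packages" page)
wget http://blzdistsc2-a.akamaihd.net/Linux/SC2.4.7.1.zip
# The archive is password protected; the password is documented on that page
unzip -P iagreetotheeula SC2.4.7.1.zip -d ~/
# PySC2 expects the game under ~/StarCraftII by default (override with the SC2PATH environment variable)
```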
Get the StarCraft II Maps
Download the ladder maps and extract them to the StarCraftII/Maps/ directory.
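As a sketch, assuming the Ladder 2019 Season 1 map pack from the s2client-proto "Map Packs" page contains the maps you need (pick a different pack if not; map packs use the same documented EULA password):

```bash
# Download an example ladder map pack (adjust the pack name to the maps you need)
wget http://blzdistsc2-a.akamaihd.net/MapPacks/Ladder2019Season1.zip
# Extract directly into the StarCraftII/Maps/ directory
unzip -P iagreetotheeula Ladder2019Season1.zip -d ~/StarCraftII/Maps/
```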
Get the Code
```bash
git clone https://github.com/metataro/sc2_imitation_learning.git
```
Install the Python Libraries
```bash
pip install -r requirements.txt
```
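A clean way to do this is inside a dedicated virtual environment; a minimal sketch, assuming a local Python 3.6+ interpreter:

```bash
cd sc2_imitation_learning
python3 -m venv .venv            # any Python >= 3.6 environment works
source .venv/bin/activate
pip install -r requirements.txt
```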
Train Your Own Agent
Download Replay Packs
There are replay packs available for direct download; however, a much larger number of replays can be downloaded via the Blizzard Game Data APIs.
Downloading StarCraft II replays from the Blizzard Game Data APIs is described here. For example, the following command will download all available replays of game version 4.7.1:
```bash
python -m scripts.download_replays \
    --key <API_KEY> \
    --secret <API_SECRET> \
    --version 4.7.1 \
    --extract \
    --filter_version sort
```
Prepare the Dataset
Having downloaded the replay packs, you can preprocess and combine them into a dataset as follows:
```bash
python -m scripts.build_dataset \
    --gin_file ./configs/1v1/build_dataset.gin \
    --replays_path ./data/replays/4.7.1/ \
    --dataset_path ./data/datasets/v1
```
Note that depending on the configuration, the resulting dataset may require a large amount of disk space (> 1 TB).
For example, the configuration defined in ./configs/1v1/build_dataset.gin results in a dataset of about 4.5 TB,
even though less than 5% of the 4.7.1 replays are used.
Run the Training
After preparing the dataset, you can run behaviour cloning training as follows:
```bash
python -m scripts.behaviour_cloning --gin_file ./configs/1v1/behaviour_cloning.gin
```
By default, the training will be parallelized across all available GPUs.
You can limit the number of used GPUs by setting the environment variable CUDA_VISIBLE_DEVICES.
The parameters in configs/1v1/behaviour_cloning.gin are optimized for a hardware setup with four NVIDIA GTX 1080 Ti GPUs
and 20 physical CPU cores (40 logical cores), where the training takes around one week to complete.
You may need to adjust these configurations to fit your hardware specifications.
Logs are written to a TensorBoard log file inside the experiment directory.
You can additionally enable logging to Weights & Biases by setting the --wandb_logging_enabled flag.
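For example, a run restricted to the first two GPUs with Weights & Biases logging enabled might look like this (the GPU ids are placeholders for your setup):

```bash
# Limit training to GPUs 0 and 1 and enable W&B logging
CUDA_VISIBLE_DEVICES=0,1 python -m scripts.behaviour_cloning \
    --gin_file ./configs/1v1/behaviour_cloning.gin \
    --wandb_logging_enabled
```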
Run the Evaluation
You can evaluate trained agents against the built-in A.I. as follows:
```bash
python -m scripts.evaluate --gin_file configs/1v1/evaluate.gin --logdir <EXPERIMENT_PATH>
```
Replace <EXPERIMENT_PATH> with the path to the experiment folder of the agent.
This will run the evaluation as configured in configs/1v1/evaluate.gin.
Again, you may need to adjust these configurations to fit your hardware specifications.
By default, all available GPUs will be used and the evaluators will be distributed evenly across them.
You can limit the number of used GPUs by setting the environment variable CUDA_VISIBLE_DEVICES.
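Analogously, a single-GPU evaluation run could look like this (the experiment path is a placeholder for your own run directory):

```bash
# Evaluate on GPU 0 only; replace the --logdir value with your experiment folder
CUDA_VISIBLE_DEVICES=0 python -m scripts.evaluate \
    --gin_file configs/1v1/evaluate.gin \
    --logdir ./experiments/my_behaviour_cloning_run
```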
Play Against Trained Agents
You can challenge yourself to play against trained agents.
First, start a game as a human player:
```bash
python -m scripts.play_agent_vs_human --human
```
Then, in a second console, let the agent join the game:
```bash
python -m scripts.play_agent_vs_human --agent_dir <SAVED_MODEL_PATH>
```
Replace <SAVED_MODEL_PATH> with the path to where the saved model is stored (e.g. /path/to/experiment/saved_model).
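For example, with the pre-trained 1v1/tvt_all_maps agent (see below) unpacked to ./pretrained/tvt_all_maps, the second command might look like this (the directory name is just an example of wherever you extracted the download):

```bash
# Hypothetical location of the extracted pre-trained agent's SavedModel
python -m scripts.play_agent_vs_human --agent_dir ./pretrained/tvt_all_maps/saved_model
```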
Download Pre-Trained Agents
There are pre-trained agents available for download:
https://drive.google.com/drive/folders/1PNhOYeA4AkxhTzexQc-urikN4RDhWEUO?usp=sharing
Agent 1v1/tvt_all_maps
Evaluation Results
The table below shows the win rates of the agent when evaluated in TvT against the built-in AI with randomly selected builds. The win rate for each map and difficulty level was determined over 100 evaluation matches.
| Map | Very Easy | Easy | Medium | Hard |
|---|---|---|---|---|
| KairosJunction | 0.86 | 0.27 | 0.07 | 0.00 |
| Automaton | 0.82 | 0.33 | 0.07 | 0.00 |
| Blueshift | 0.84 | 0.41 | 0.03 | 0.00 |
| CeruleanFall | 0.72 | 0.28 | 0.03 | 0.00 |
| ParaSite | 0.75 | 0.41 | 0.02 | 0.01 |
| PortAleksander | 0.72 | 0.34 | 0.05 | 0.00 |
| Stasis | 0.73 | 0.44 | 0.08 | 0.00 |
| Overall | 0.78 | 0.35 | 0.05 | ~ 0.00 |
Recordings
Video recordings of cherry-picked evaluation games:
- Midgame win vs easy A.I.
- Marine rush win vs easy A.I.
- Basetrade win vs hard A.I.
Training Data
| | |
|---|---|
| Matchups | TvT |
| Minimum MMR | 3500 |
| Minimum APM | 60 |
| Minimum duration | 30 |
| Maps | KairosJunction, Automaton, Blueshift, CeruleanFall, ParaSite, PortAleksander, Stasis |
| Episodes | 35'051 (102'792'317 timesteps) |
Interface
| | |
|---|---|
| Interface type | Feature layers |
| Dimensions | 64 x 64 (screen), 64 x 64 (minimap) |
| Screen features | visibility_map, player_relative, unit_type, selected, unit_hit_points_ratio, unit_energy_ratio, unit_density_aa |
| Minimap features | camera, player_relative, alerts |
| Scalar features | player, home_race_requested, away_race_requested, upgrades, game_loop, available_actions, unit_counts, build_queue, cargo, cargo_slots_available, control_groups, multi_select, production_queue |
Agent Architecture
