pgx
🎲 Vectorized RL game environments in JAX
PGX on TPUs seems to be slower than on CPUs. With a TPU v3-8, PGX achieves only 1638 steps/sec on the game of chess. **Minimal Reproducible Example** PGX...
### Problem Description In examples/alphazero/train.py, we compute `value_mask` as follows: https://github.com/sotetsuk/pgx/blob/87278d2d6e677fd87248c457207b59cfa42e578d/examples/alphazero/train.py#L179 The purpose is to avoid updating the critic network on incomplete trajectories, as is evident from the masking of value...
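To illustrate the general idea of masking value targets on incomplete trajectories, here is a minimal sketch in plain Python. It is not the code at the linked line: the function names `value_mask` and `masked_value_loss`, the list-based layout, and the squared-error loss are all assumptions made for this example.

```python
def value_mask(terminated):
    """Return a per-step mask for one trajectory (hypothetical helper).

    mask[t] is True only if the episode containing step t reaches a
    terminal step at some index >= t, i.e. its final outcome is known.
    """
    mask = [False] * len(terminated)
    seen_terminal = False
    # Walk backwards so each step learns whether a terminal step follows it.
    for t in reversed(range(len(terminated))):
        seen_terminal = seen_terminal or terminated[t]
        mask[t] = seen_terminal
    return mask


def masked_value_loss(predictions, targets, mask):
    """Mean squared error over the unmasked steps only (assumed loss)."""
    errors = [(p - y) ** 2 for p, y, m in zip(predictions, targets, mask) if m]
    # With no complete episode in the window there is nothing to train on.
    return sum(errors) / len(errors) if errors else 0.0


# Steps 0-2 form an episode that ends at step 2; steps 3-4 begin a new
# episode that has not finished inside this window.
terminated = [False, False, True, False, False]
mask = value_mask(terminated)
print(mask)
```

In this sketch, steps 3 and 4 are excluded from the critic loss because their episode's outcome is not yet known, while the policy loss could still use them.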
Added go_13x13 env. Did not add hosi_pos and tests.
Thank you for developing such a wonderful library. I always enjoy using it. ## Proposal I propose changing the description of the action `direction` in the shogi documentation from "`direction` **from** which the piece moves and" to "`direction` **to** which the piece moves and", as shown below. ### Before ``` There are `2187 = 81 x...
Hi, shouldn't the first observation in Kuhn Poker have all zeros in the betting part of the vector? For instance, if player 1 gets a Q, then the observation should...
A bit later than I meant to, but this addresses https://github.com/sotetsuk/pgx/issues/1059. Pretty straightforward change to equinox style NNs. Some minor speed differences that could be optimized (see https://github.com/patrick-kidger/equinox/issues/928, https://github.com/patrick-kidger/equinox/issues/926), but...