pgx

🎲 Vectorized RL game environments in JAX

Results 46 pgx issues

PGX on TPUs seems to be slower than on CPUs. With a TPU v3-8, PGX achieves only 1638 steps/sec on the game of chess. **Minimal Reproducible Example** PGX...

### Problem Description In examples/alphazero/train.py, we compute `value_mask` as follows: https://github.com/sotetsuk/pgx/blob/87278d2d6e677fd87248c457207b59cfa42e578d/examples/alphazero/train.py#L179 The purpose is to avoid updating the critic network on incomplete trajectories, as is evident from the masking of value...

Added go_13x13 env. Did not add hosi_pos or tests.

็ด ๆ™ดใ‚‰ใ—ใ„ใƒฉใ‚คใƒ–ใƒฉใƒชใ‚’ใ”้–‹็™บใ„ใŸใ ใใ€ใ‚ใ‚ŠใŒใจใ†ใ”ใ–ใ„ใพใ™ใ€‚ใ„ใคใ‚‚ๆฅฝใ—ใ‚“ใงไฝฟใ‚ใ›ใฆใ„ใŸใ ใ„ใฆใŠใ‚Šใพใ™ใ€‚ ## ๆๆกˆ ไปฅไธ‹ใฎ้€šใ‚Šใ€shogi ใƒ‰ใ‚ญใƒฅใƒกใƒณใƒˆใฎ action ใฎ direction ใฎ่ชฌๆ˜Žใ‚’ใ€ใ€Œ`direction` **from** which the piece moves andใ€ใ‹ใ‚‰ใ€Œ`direction` **to** which the piece moves andใ€ใซๅค‰ๆ›ดใ™ใ‚‹ใ“ใจใ‚’ๆๆกˆใ—ใพใ™ใ€‚ ### ๅค‰ๆ›ดๅ‰ ``` There are `2187 = 81 x...

Hi, shouldn't the first observation in Kuhn Poker have all zeros in the betting part of the vector? For instance, if player 1 gets a Q, then the observation should...

A bit later than I meant to, but this addresses https://github.com/sotetsuk/pgx/issues/1059. Pretty straightforward change to Equinox-style NNs. Some minor speed differences that could be optimized (see https://github.com/patrick-kidger/equinox/issues/928, https://github.com/patrick-kidger/equinox/issues/926), but...