pykokkos
TeamPolicy GPU parallel_for hangs after more than 5 consecutive calls.
I am running into an issue where calling pk.parallel_for with a TeamPolicy on a GPU more than 5 times in a row causes the code to hang.
Here is a minimal example.
```python
import cupy as cp
import pykokkos as pk


@pk.workunit
def work(team_member, view):
    j: int = team_member.league_rank()
    k: int = team_member.team_size()

    # Each thread in the team increments one element of the view.
    def inner(i: int):
        view[j * k + i] = view[j * k + i] + 1

    pk.parallel_for(pk.TeamThreadRange(team_member, k), inner)


def main():
    pk.set_default_space(pk.Cuda)
    a = cp.zeros(100)

    # Launch the same kernel repeatedly; the hang appears on the 6th launch.
    for r in range(6):
        pk.parallel_for("work", pk.TeamPolicy(50, 2), work, view=a)
        print(r)

    print(a)


main()
```
In the for loop in main(), the code runs to completion with range(5), but hangs with range(6).
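For reference, here is a host-only sketch of what I expect the kernel to compute. It uses plain Python with no pykokkos or GPU involved, and the names `reference_launch` and `expected` are only illustrative, not part of any library. It assumes each launch visits every (league rank, thread index) pair exactly once, which is how I understand the TeamPolicy(50, 2) launch above.

```python
LEAGUE_SIZE = 50  # first argument of pk.TeamPolicy(50, 2) in the reproducer
TEAM_SIZE = 2     # second argument of pk.TeamPolicy(50, 2)


def reference_launch(view):
    """Emulate one kernel launch serially on the host."""
    for j in range(LEAGUE_SIZE):      # stands in for team_member.league_rank()
        for i in range(TEAM_SIZE):    # stands in for the TeamThreadRange index
            view[j * TEAM_SIZE + i] = view[j * TEAM_SIZE + i] + 1


def expected(launches):
    """Return the view contents expected after `launches` consecutive launches."""
    view = [0.0] * (LEAGUE_SIZE * TEAM_SIZE)
    for _ in range(launches):
        reference_launch(view)
    return view


result = expected(6)
print(min(result), max(result))
```

Under that assumption, every element of the array should equal the number of launches, so after six launches I would expect all 100 entries to be 6 rather than a hang.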