
TeamPolicy GPU parallel_for hangs after more than 5 serial calls.

Open kennykos opened this issue 7 months ago • 5 comments

I am running into an issue where calling pk.parallel_for sequentially on a GPU more than 5 times in a row causes the code to hang.

Here is a minimal example.

import cupy as cp
import pykokkos as pk

@pk.workunit
def work(team_member, view):
    j: int = team_member.league_rank()
    k: int = team_member.team_size()

    def inner(i: int):
        view[j * k + i] = view[j * k + i] + 1

    pk.parallel_for(pk.TeamThreadRange(team_member, k), inner)

def main():
    pk.set_default_space(pk.Cuda)
    a = cp.zeros(100)
    for r in range(6):
        pk.parallel_for("work", pk.TeamPolicy(50, 2), work, view=a)
        print(r)
    print(a)

main()

On line 17, when the loop uses range(5), the code runs to completion, but when it uses range(6), the code hangs.
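For reference, here is a CPU-only sketch of what the view should contain if all launches completed. It does not use pykokkos or CuPy at all; `expected_view` is a hypothetical helper that just replays the index arithmetic from the workunit (`j * k + i`) in plain Python, assuming each launch runs every team exactly once.

```python
def expected_view(league_size, team_size, launches, n):
    """Plain-Python model of the reproducer (an assumption, not pykokkos code)."""
    view = [0.0] * n
    for _ in range(launches):              # one pass per pk.parallel_for call
        for j in range(league_size):       # stands in for team_member.league_rank()
            for i in range(team_size):     # stands in for the TeamThreadRange index
                view[j * team_size + i] += 1
    return view


# Same sizes as the example above: TeamPolicy(50, 2), 100 elements, 6 launches.
result = expected_view(league_size=50, team_size=2, launches=6, n=100)
print(result[:4], len(result))
```

Since 50 teams of 2 threads cover indices 0 through 99 exactly once per launch, every element should simply equal the number of launches that finished.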

kennykos, Jul 11 '24 15:07