
Intermittent `bifrost.linalg` test failures

Open jaycedowell opened this issue 3 years ago • 2 comments

Occasionally we see test failures on the self-hosted `bifrost.linalg` suite. Now that I'm looking for an example to point to, I cannot find one.

jaycedowell, Oct 17 '22 19:10

Here's one: https://github.com/ledatelescope/bifrost/pull/167#issuecomment-1152494636

jaycedowell, Oct 17 '22 19:10

I wonder if this is somehow related to #210. The only places where `BF_STATUS_UNSUPPORTED_SHAPE` can be thrown from a LinAlg call are in `linalg_kernels.cu`:

  • bf_cherk_N
  • bf_cgemm_TN_smallM_staticN_v2
  • bf_cgemm_TN_smallM

These are all fairly trivial, though: mostly value checks on the matrix shape. There are also a couple of comparisons of the batch size against the texture memory size that can throw this. It would be nice to know exactly which `BF_STATUS_UNSUPPORTED_SHAPE` check we are hitting.

jaycedowell, Jun 16 '23 20:06