
bug/jenkins: cuda memory error in nightly tests

Open hzhou opened this issue 4 years ago • 0 comments

GPU testing on Jenkins intermittently shows CUDA memory errors. For example, here is a failure from one of the nightly GPU tests (https://jenkins-pmrs.cels.anl.gov/view/yaksa/job/yaksa-nightly-gpu/lastCompletedBuild/testReport/):

test/pack/pack -datatype int -count 17 -seed 73 -iters 32768 -segments 1 -ordering normal -overlap none -num-threads 4
 Stack Trace

CUDA Error (yaksuri_cudai_event_query:src/backend/cuda/pup/yaksuri_cudai_event.c,65): an illegal memory access was encountered
lt-pack: test/pack/pack.c:135: runtest: Assertion `dbuf_h' failed.
CUDA Error (yaksuri_cudai_type_free_hook:src/backend/cuda/hooks/yaksuri_cudai_type_hooks.c,92): an illegal memory access was encountered
lt-pack: test/pack/pack.c:249: runtest: Assertion `rc == (0)' failed.
lt-pack: test/pack/pack.c:108: runtest: Assertion `sbuf_h' failed.
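
Since the failure is intermittent, a single run may well pass. Below is a minimal sketch of a helper that reruns the failing command until it exits with a nonzero status and returns the captured output. The names `run_until_failure` and `PACK_CMD` are hypothetical and not part of yaksa; the command line is copied from the test above, and the relative path assumes the script is started from the build directory.

```python
import os
import subprocess

# Command line copied from the failing nightly test above.
# Assumption: run from the yaksa build directory so this relative path resolves.
PACK_CMD = [
    "test/pack/pack", "-datatype", "int", "-count", "17", "-seed", "73",
    "-iters", "32768", "-segments", "1", "-ordering", "normal",
    "-overlap", "none", "-num-threads", "4",
]


def run_until_failure(cmd, max_runs=100, extra_env=None):
    """Run cmd up to max_runs times.

    Returns (run_number, CompletedProcess) for the first run that exits
    with a nonzero status (a failed assertion aborts the process, which
    also counts), or None if every run succeeded.
    """
    env = dict(os.environ)
    if extra_env:
        env.update(extra_env)
    for run_number in range(1, max_runs + 1):
        result = subprocess.run(cmd, capture_output=True, text=True, env=env)
        if result.returncode != 0:
            return run_number, result
    return None
```

For example, `run_until_failure(PACK_CMD)` returns the first failing run, whose `stderr` attribute holds the messages to compare with the trace above. Passing `extra_env={"CUDA_LAUNCH_BLOCKING": "1"}` makes kernel launches synchronous, which may help narrow down where the illegal access actually happens, though it can also change timing enough to hide the failure.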
      

hzhou, Jan 18 '21 14:01