Fatal Python error: Aborted
Hi,
I was trying to test https://github.com/NVIDIA/apex/tree/master/apex/contrib/openfold_triton with Triton, but I ran into the error below and cannot find a solution anywhere. I'd appreciate any pointers on what I might be doing wrong.
Fatal Python error: Aborted
Current thread 0x00007f7325dfa280 (most recent call first):
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/compiler.py", line 1006 in ttgir_to_llir
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/compiler.py", line 1554 in <lambda>
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/compiler.py", line 1621 in compile
File "<string>", line 41 in _attention_core
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 199 in run
File "/home/rui/code/test-fold-dev/test-fold/ops/attention/triton/mha_fwd.py", line 476 in attn_core_fwd
File "/home/rui/code/test-fold-dev/tests/unit/ops/attention/flash_attention_test.py", line 18 in test_flash_attention_fwd
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/python.py", line 194 in pytest_pyfunc_call
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/python.py", line 1792 in runtest
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 169 in pytest_runtest_call
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 262 in <lambda>
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 261 in call_runtest_hook
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 222 in call_and_report
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 133 in runtestprotocol
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/runner.py", line 114 in pytest_runtest_protocol
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 350 in pytest_runtestloop
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 325 in _main
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 271 in wrap_session
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/main.py", line 318 in pytest_cmdline_main
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_callers.py", line 77 in _multicall
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_manager.py", line 115 in _hookexec
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/pluggy/_hooks.py", line 493 in __call__
File "/home/rui/anaconda3/envs/test-fold1218/lib/python3.10/site-packages/_pytest/config/__init__.py", line 169 in main
File "/home/rui/.pycharm_helpers/pycharm/_jb_pytest_runner.py", line 60 in <module>
Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, matplotlib._c_internal_utils, PIL._imaging, matplotlib._path, kiwisolver._cext, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, psutil._psutil_linux, psutil._psutil_posix, lmdb.cpython, Bio.PDB.ccealign, Bio.SeqIO._twoBitIO, yaml._yaml, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pandas._libs.ops, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.indexing, pandas._libs.index, pandas._libs.internals, pandas._libs.join, pandas._libs.writers, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, PIL._imagingft, charset_normalizer.md, matplotlib._image, google._upb._message, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, 
scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.linalg._flinalg, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, 
scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._statlib, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont (total: 174)
Process finished with exit code 134 (interrupted by signal 6:SIGABRT)
My environment is just pip install torch==2.0.1 with CUDA 11.7. The test sample is the basic one, with query shape [256, 4, 256, 16].
I have run into this error before and could work around it by changing how the launch grid is defined: with a 1D grid the kernel compiles and runs fine, but with a 2D grid it aborts. That workaround gets complicated for attention kernels, though, so I was wondering what the correct fix would be.
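For reference, the 1D workaround amounts to launching one flat grid and recovering the two logical indices inside the kernel. Below is a minimal pure-Python sketch of that index arithmetic. The names `flattened_grid`, `split_program_id`, `num_blocks_m` and `num_batch_heads` are hypothetical and not taken from the openfold_triton code, and the sketch only illustrates the mapping; it does not explain why the 2D launch aborts.

```python
# Sketch of replacing a 2D launch grid (num_blocks_m, num_batch_heads)
# with an equivalent 1D grid. All names here are illustrative assumptions.

def flattened_grid(num_blocks_m, num_batch_heads):
    """Return a 1D grid with one program per (block, batch*head) pair."""
    return (num_blocks_m * num_batch_heads,)


def split_program_id(pid, num_blocks_m):
    """Recover the two logical indices from a flat program id.

    Inside a Triton kernel the same arithmetic would be applied to
    tl.program_id(0), e.g.:
        pid_m = pid % num_blocks_m
        pid_bh = pid // num_blocks_m
    """
    pid_bh, pid_m = divmod(pid, num_blocks_m)
    return pid_m, pid_bh


if __name__ == "__main__":
    grid = flattened_grid(num_blocks_m=4, num_batch_heads=3)
    for pid in range(grid[0]):
        print(pid, split_program_id(pid, num_blocks_m=4))
```

Every (pid_m, pid_bh) pair of the original 2D grid is visited exactly once, so the kernel body can stay the same apart from how it derives its two indices.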
Best.
Hi, sorry to hear you're encountering an issue.
Please try again with Triton built from source at the current head of the repository.
If that still does not work, please attach steps to reproduce, and someone might be able to have a look.
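When attaching reproduction steps, it usually helps to state the exact package versions alongside them. A minimal sketch using only the standard library is shown below. The helper name `collect_versions` is hypothetical, and the default package list (torch, triton) is an assumption based on the environment described above.

```python
# Sketch: collect version information to paste into a bug report.
# Uses only the standard library, so it runs even if a package is missing.
import platform
from importlib import metadata


def collect_versions(packages=("torch", "triton")):
    """Return a dict mapping each package name to its installed version."""
    info = {"python": platform.python_version()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

The CUDA toolkit and GPU model are not covered by this script and would need to be reported separately.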