ImportError: DLL load failed while importing slstm_HS64BS8NH1NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0: The specified module could not be found.
'verbose': True, 'with_cuda': True, 'extra_ldflags': ['-LD:\CUDA_manager\CUDA12.0\CUDA1\lib', '-lcublas'], 'extra_cflags': ['-DSLSTM_HIDDEN_SIZE=64', '-DSLSTM_BATCH_SIZE=8', '-DSLSTM_NUM_HEADS=1', '-DSLSTM_NUM_STATES=4', '-DSLSTM_DTYPE_B=float', '-DSLSTM_DTYPE_R=nv_bfloat16', '-DSLSTM_DTYPE_W=nv_bfloat16', '-DSLSTM_DTYPE_G=nv_bfloat16', '-DSLSTM_DTYPE_S=nv_bfloat16', '-DSLSTM_DTYPE_A=float', '-DSLSTM_NUM_GATES=4', '-DSLSTM_SIMPLE_AGG=true', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL_VALID=false', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL=0.0', '-DSLSTM_FORWARD_CLIPVAL_VALID=false', '-DSLSTM_FORWARD_CLIPVAL=0.0', '-U__CUDA_NO_HALF_OPERATORS', '-U__CUDA_NO_HALF_CONVERSIONS', '-U__CUDA_NO_BFLOAT16_OPERATORS', '-U__CUDA_NO_BFLOAT16_CONVERSIONS', '-U__CUDA_NO_BFLOAT162_OPERATORS__', '-U__CUDA_NO_BFLOAT162_CONVERSIONS__'], 'extra_cuda_cflags': ['-Xptxas="-v"', '-gencode', 'arch=compute_80,code=compute_80', '-res-usage', '--use_fast_math', '-O3', '-Xptxas -O3', '--extra-device-vectorization', '-DSLSTM_HIDDEN_SIZE=64', '-DSLSTM_BATCH_SIZE=8', '-DSLSTM_NUM_HEADS=1', '-DSLSTM_NUM_STATES=4', '-DSLSTM_DTYPE_B=float', '-DSLSTM_DTYPE_R=nv_bfloat16', '-DSLSTM_DTYPE_W=nv_bfloat16', '-DSLSTM_DTYPE_G=nv_bfloat16', '-DSLSTM_DTYPE_S=nv_bfloat16', '-DSLSTM_DTYPE_A=float', '-DSLSTM_NUM_GATES=4', '-DSLSTM_SIMPLE_AGG=true', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL_VALID=false', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL=0.0', '-DSLSTM_FORWARD_CLIPVAL_VALID=false', '-DSLSTM_FORWARD_CLIPVAL=0.0', '-U__CUDA_NO_HALF_OPERATORS', '-U__CUDA_NO_HALF_CONVERSIONS', '-U__CUDA_NO_BFLOAT16_OPERATORS', '-U__CUDA_NO_BFLOAT16_CONVERSIONS', '-U__CUDA_NO_BFLOAT162_OPERATORS__', '-U__CUDA_NO_BFLOAT162_CONVERSIONS__']}
Using C:\Users\a\AppData\Local\torch_extensions\torch_extensions\Cache\py39_cu118 as PyTorch extensions root...
Creating extension directory C:\Users\a\AppData\Local\torch_extensions\torch_extensions\Cache\py39_cu118\slstm_HS64BS8NH1NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0...
Detected CUDA files, patching ldflags
Emitting ninja build file C:\Users\a\AppData\Local\torch_extensions\torch_extensions\Cache\py39_cu118\slstm_HS64BS8NH1NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0\build.ninja...
Building extension module slstm_HS64BS8NH1NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.11.1.git.kitware.jobserver-1
Loading extension module slstm_HS64BS8NH1NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0...
Traceback (most recent call last):
File "C:\Users\a\Downloads\xlstm-main\xlstm-main\experiments\main.py", line 155, in
I think sLSTM is not working. I can run parity_xlstm01.yaml, but 11 does not work.
Have you solved it?
Have you tried with the updated conda environment?
I have the same question.
How can this problem be addressed?
Using another GitHub repo in place of this one may be a good option.
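One thing that may be worth checking, going only by the log above: the extension cache directory is named py39_cu118, while extra_ldflags points at a CUDA12.0 lib directory. On Windows, "DLL load failed ... The specified module could not be found" usually means a DLL that the compiled extension depends on could not be located, and since Python 3.8 the PATH variable is no longer searched for such dependencies. Below is a minimal diagnostic sketch, not a confirmed fix. The helper names are hypothetical, the version parsing assumes the directory naming seen in the log, and the CUDA bin path in the usage note is an assumption that has to be adapted to the local install.

```python
import os
import re
from typing import Optional


def cuda_tag_from_cache_dir(path: str) -> Optional[str]:
    """Return the CUDA tag (for example '118') from a cache dir name ending in _cuNNN."""
    match = re.search(r"_cu(\d+)", path)
    return match.group(1) if match else None


def cuda_tag_from_toolkit_path(path: str) -> Optional[str]:
    """Return the CUDA tag (for example '120') from a path part like CUDA12.0 or CUDA/v11.8."""
    match = re.search(r"CUDA[\\/_-]?v?(\d+)\.(\d+)", path, re.IGNORECASE)
    return match.group(1) + match.group(2) if match else None


def register_cuda_dll_dir(cuda_bin: str) -> bool:
    """Add cuda_bin to the DLL search path (Windows, Python 3.8+). Returns True if added."""
    # os.add_dll_directory only exists on Windows; elsewhere this is a no-op.
    if hasattr(os, "add_dll_directory") and os.path.isdir(cuda_bin):
        os.add_dll_directory(cuda_bin)
        return True
    return False


if __name__ == "__main__":
    # Both paths are copied from the log in this thread.
    cache_dir = r"C:\Users\a\AppData\Local\torch_extensions\torch_extensions\Cache\py39_cu118"
    ldflags_dir = r"D:\CUDA_manager\CUDA12.0\CUDA1\lib"

    torch_tag = cuda_tag_from_cache_dir(cache_dir)
    toolkit_tag = cuda_tag_from_toolkit_path(ldflags_dir)
    if torch_tag != toolkit_tag:
        print(f"Possible mismatch: PyTorch build cu{torch_tag}, toolkit path suggests cu{toolkit_tag}")
```

If the tags differ, it may help to point the build at a toolkit that matches the PyTorch build (the value of `torch.version.cuda` shows which one that is), or to call `register_cuda_dll_dir(...)` with the toolkit's bin directory before the extension is imported, for example `register_cuda_dll_dir(r"D:\CUDA_manager\CUDA12.0\CUDA1\bin")`. That bin path is a guess based on the lib path in the log.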