MIOpen frugally-deep 0.16.0 appears to break kernel/model files

I recently updated my AI workflow to ROCm 6.3.2 on Arch Linux, and found that some PyTorch operations were crashing with "MIOpen Error: tensor_shape_variable needs to be an array". With a bit of debugging, I was able to narrow it down to fdeep::internal::create_tensor_shape_variable_offset getting an incorrect parameter. I looked through the frugally-deep source and the model file it was loading, and noticed that fdeep was looking for batch_shape, while the model file used batch_input_shape.

This change in 0.16.0 appears to be causing this specific issue: https://github.com/Dobiasd/frugally-deep/commit/a60717c21188b710f56c457370b6f05cc70af435#diff-a674970aa0b9e26d68cc8783ce1aa3f82425780a062969020febb6fda1371701L500-R507 The change modified the expected key from batch_input_shape to batch_shape. However, after fixing that, I found that inbound_nodes now has a significantly different structure as well, also shown in the above commit. I'm not well versed in the inner workings of this stuff, but I'm guessing there's a new file format with TensorFlow 2.16.1 that breaks the old files, and fdeep's update changes it to use that format instead.

The files src/kernels/gfx9[08|0a|42].tn.model will need to be updated to this new format to support frugally-deep 0.16.0 when built with MIOPEN_ENABLE_AI_KERNEL_TUNING (which is default). I'd update it myself in a PR if I knew the format, and was confident it fixed the issue without causing problems, but that is not the case.
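One way to check which key spelling a given model file uses is a small recursive scan. This is only a sketch: it assumes the .tn.model files are plain JSON (as frugally-deep model files generally are), and the helper names `count_shape_keys` and `describe_model_file` are made up for illustration. It does not look at inbound_nodes, so a file that passes this check may still be incompatible.

```python
import json
from collections import Counter

# Key names taken from the report above; the traversal makes no assumption
# about where in the JSON tree they appear.
OLD_KEY = "batch_input_shape"
NEW_KEY = "batch_shape"


def count_shape_keys(node, counts=None):
    """Recursively count occurrences of the old and new key in parsed JSON."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in (OLD_KEY, NEW_KEY):
                counts[key] += 1
            count_shape_keys(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_shape_keys(item, counts)
    return counts


def describe_model_file(path):
    """Load a JSON model file and report which key spelling it contains."""
    with open(path, encoding="utf-8") as handle:
        counts = count_shape_keys(json.load(handle))
    if counts[OLD_KEY] and not counts[NEW_KEY]:
        return "uses batch_input_shape (pre-0.16.0 spelling)"
    if counts[NEW_KEY] and not counts[OLD_KEY]:
        return "uses batch_shape (0.16.0 spelling)"
    return "mixed or neither key found"
```

Calling `describe_model_file("src/kernels/gfx908.tn.model")` from a MIOpen checkout would then show which spelling that file carries.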

Mar 08 '25 13:03 MCJack123

I'm also getting this error: "MIOpen Error: tensor_shape_variable needs to be an array"

I get the error whenever using torch.nn.functional.conv2d. Here's an output with MIOpen logging turned on when running a minimal program to trigger the error:

MIOpen(HIP): Info [get_device_name] Raw device name: gfx1102
MIOpen(HIP): Info [Handle] stream: 0, device_id: 0
MIOpen(HIP): Info [get_device_name] Raw device name: gfx1102
MIOpen(HIP): Info [SetStream] stream: 0, device_id: 0
MIOpen(HIP): miopenStatus_t miopenCreateTensorDescriptor(miopenTensorDescriptor_t *){
MIOpen(HIP):    tensorDesc = 0x7ffc6f739908
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetTensorDescriptor(miopenTensorDescriptor_t, miopenDataType_t, int, const int *, const int *){
MIOpen(HIP):    tensorDesc = {}, {}, packed,
MIOpen(HIP):    dataType = 1
MIOpen(HIP):    nbDims = 4
MIOpen(HIP):    dim.values = { 32 4 32 32 }
MIOpen(HIP):    stride.values = { 4096 1024 32 1 }
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenCreateTensorDescriptor(miopenTensorDescriptor_t *){
MIOpen(HIP):    tensorDesc = 0x56a9c37153c0
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetTensorDescriptor(miopenTensorDescriptor_t, miopenDataType_t, int, const int *, const int *){
MIOpen(HIP):    tensorDesc = {}, {}, packed,
MIOpen(HIP):    dataType = 1
MIOpen(HIP):    nbDims = 4
MIOpen(HIP):    dim.values = { 32 4 3 3 }
MIOpen(HIP):    stride.values = { 36 9 3 1 }
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenCreateTensorDescriptor(miopenTensorDescriptor_t *){
MIOpen(HIP):    tensorDesc = 0x7e3015452807
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetTensorDescriptor(miopenTensorDescriptor_t, miopenDataType_t, int, const int *, const int *){
MIOpen(HIP):    tensorDesc = {}, {}, packed,
MIOpen(HIP):    dataType = 1
MIOpen(HIP):    nbDims = 4
MIOpen(HIP):    dim.values = { 32 32 30 30 }
MIOpen(HIP):    stride.values = { 28800 900 30 1 }
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenCreateConvolutionDescriptor(miopenConvolutionDescriptor_t *){
MIOpen(HIP):    convDesc = 0x100
MIOpen(HIP): }
MIOpen(HIP): Info [] MIOPEN_FIND_MODE = DYNAMIC_HYBRID(5)
MIOpen(HIP): miopenStatus_t miopenInitConvolutionNdDescriptor(miopenConvolutionDescriptor_t, int, const int *, const int *, const int *, miopenConvolutionMode_t){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {1, 1},
MIOpen(HIP):    spatialDim = 2
MIOpen(HIP):    pads = { 0 0 }
MIOpen(HIP):    strides = { 1 1 }
MIOpen(HIP):    dilations = { 1 1 }
MIOpen(HIP):    c_mode = 0
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetConvolutionGroupCount(miopenConvolutionDescriptor_t, int){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {1, 1},
MIOpen(HIP):    groupCount = 1
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetConvolutionAttribute(miopenConvolutionDescriptor_t, const miopenConvolutionAttrib_t, const int){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {1, 1},
MIOpen(HIP):    attr = 1
MIOpen(HIP):    value = 0
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenConvolutionForwardGetWorkSpaceSize(miopenHandle_t, const miopenTensorDescriptor_t, const miopenTensorDescriptor_t, const miopenConvolutionDescriptor_t, const miopenTensorDescriptor_t, size_t *){
MIOpen(HIP):    handle = stream: 0, device_id: 0
MIOpen(HIP):    wDesc = {32, 4, 3, 3}, {36, 9, 3, 1}, packed,
MIOpen(HIP):    xDesc = {32, 4, 32, 32}, {4096, 1024, 32, 1}, packed,
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {1, 1},
MIOpen(HIP):    yDesc = {32, 32, 30, 30}, {28800, 900, 30, 1}, packed,
MIOpen(HIP): }
MIOpen(HIP): Info [AmdRocmMetadataVersionDetect] ROCm MD version AMDHSA_COv3, HIP version 6.3.42134, MIOpen version 3.3.0.d22d5a13f-dirty
MIOpen(HIP): Info2 [GetWorkSpaceSize]
MIOpen(HIP): Info [GetSolutions]
MIOpen(HIP): Info [IsNetworkedFilesystem] Filesystem type at '"/home/neil//.config/miopen/"' is: 0xef53 'EXT2/3/4_SUPER_MAGIC'
MIOpen(HIP): Info2 [GetLibPath] Lib Path: "/opt/rocm/lib/libMIOpen.so.1.0"
MIOpen(HIP): Info2 [GetInstalledPathFile] inexact find database search
MIOpen(HIP): Info2 [GetInstalledPathFile] Iterating over find db directory "/opt/rocm/share/miopen/db"
MIOpen(HIP): Info [Measure] ReadonlyRamDb::Prefetch time: 5e-05 ms
MIOpen(HIP): Info [Prefetch] File is unreadable: "/home/neil//.config/miopen/gfx1102_16.HIP.3_3_0_d22d5a13f-dirty.ufdb.txt"
MIOpen(HIP): Info [Measure] RamDb::Prefetch time: 0.00856 ms
MIOpen(HIP): Info2 [FindRecordUnsafe] Looking for key 4-32-32-3x3-32-30-30-32-0x0-1x1-1x1-0-NCHW-FP32-F in cache for file "/home/neil//.config/miopen/gfx1102_16.HIP.3_3_0_d22d5a13f-dirty.ufdb.txt"
MIOpen(HIP): Info2 [FindRecord] Looking for key 4-32-32-3x3-32-30-30-32-0x0-1x1-1x1-0-NCHW-FP32-F in file ""
MIOpen(HIP): Info2 [Measure] Db::FindRecord time: 0.02485 ms
MIOpen Error: tensor_shape_variable needs to be an array
MIOpen(HIP): miopenStatus_t miopenFindConvolutionForwardAlgorithm(miopenHandle_t, const miopenTensorDescriptor_t, const void *, const miopenTensorDescriptor_t, const void *, const miopenConvolutionDescriptor_t, const miopenTensorDescriptor_t, void *, const int, int *, miopenConvAlgoPerf_t *, void *, size_t, bool){
MIOpen(HIP):    handle = stream: 0, device_id: 0
MIOpen(HIP):    xDesc = {32, 4, 32, 32}, {4096, 1024, 32, 1}, packed,
MIOpen(HIP):    x = 0x7e2e66801200
MIOpen(HIP):    wDesc = {32, 4, 3, 3}, {36, 9, 3, 1}, packed,
MIOpen(HIP):    w = 0x7e2e66800000
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {1, 1},
MIOpen(HIP):    yDesc = {32, 32, 30, 30}, {28800, 900, 30, 1}, packed,
MIOpen(HIP):    y = 0x7e2d5d800000
MIOpen(HIP):    requestAlgoCount = 1
MIOpen(HIP):    returnedAlgoCount = 32764
MIOpen(HIP):    perfResults =
MIOpen(HIP):    workSpace = nullptr
MIOpen(HIP):    workSpaceSize = 0
MIOpen(HIP):    exhaustiveSearch = 0
MIOpen(HIP): }
MIOpen(HIP): Info [FindConvFwdAlgorithm] requestAlgoCount = 1, workspace = 0
MIOpen(HIP): Info [GetSolutions]
MIOpen(HIP): Info2 [FindRecordUnsafe] Looking for key 4-32-32-3x3-32-30-30-32-0x0-1x1-1x1-0-NCHW-FP32-F in cache for file "/home/neil//.config/miopen/gfx1102_16.HIP.3_3_0_d22d5a13f-dirty.ufdb.txt"
MIOpen(HIP): Info2 [FindRecord] Looking for key 4-32-32-3x3-32-30-30-32-0x0-1x1-1x1-0-NCHW-FP32-F in file ""
MIOpen(HIP): Info2 [Measure] Db::FindRecord time: 0.025221 ms
MIOpen Error: tensor_shape_variable needs to be an array
MIOpen(HIP): miopenStatus_t miopenDestroyConvolutionDescriptor(miopenConvolutionDescriptor_t){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {1, 1},
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenDestroyTensorDescriptor(miopenTensorDescriptor_t){
MIOpen(HIP):    tensorDesc = {32, 4, 3, 3}, {36, 9, 3, 1}, packed,
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenDestroyTensorDescriptor(miopenTensorDescriptor_t){
MIOpen(HIP):    tensorDesc = {32, 32, 30, 30}, {28800, 900, 30, 1}, packed,
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenDestroyTensorDescriptor(miopenTensorDescriptor_t){
MIOpen(HIP):    tensorDesc = {32, 4, 32, 32}, {4096, 1024, 32, 1}, packed,
MIOpen(HIP): }
Traceback (most recent call last):
  File "/home/neil/trigger_error.py", line 7, in <module>
    result = F.conv2d(input, weight)
RuntimeError: miopenStatusUnknownError

Here's the minimal program:

import torch
from torch.nn import functional as F

weight = torch.randn(32, 4, 3, 3).cuda()
input = torch.randn(32, 4, 32, 32).cuda()

result = F.conv2d(input, weight)
print(f"{result.shape=} {result.dtype=} {result.device=}")
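For anyone wanting to capture a similar log, a sketch of turning on MIOpen logging from Python follows. It relies on the MIOPEN_ENABLE_LOGGING and MIOPEN_LOG_LEVEL environment variables; the level value 6 is an assumption chosen because the log above contains Info2 lines, and the helper names are hypothetical. The variables are set for a child process so they are in place before MIOpen is loaded.

```python
import os
import subprocess
import sys


def miopen_logging_env(level=6, base=None):
    """Return a copy of the environment with MIOpen logging switched on."""
    env = dict(os.environ if base is None else base)
    env["MIOPEN_ENABLE_LOGGING"] = "1"    # log API calls and their arguments
    env["MIOPEN_LOG_LEVEL"] = str(level)  # 6 is assumed to include Info2 lines
    return env


def run_with_logging(script, level=6):
    """Run a Python script in a child process with MIOpen logging enabled."""
    return subprocess.run(
        [sys.executable, script],
        env=miopen_logging_env(level),
        check=False,  # the script is expected to fail; do not raise here
    )
```

With the program above saved as trigger_error.py, `run_with_logging("trigger_error.py")` should print the MIOpen log followed by the traceback.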

I'll try parsing the findings of the OP and see if I can get it working. If I do I'll report back.

Mar 10 '25 01:03 ghost

I can reproduce this issue.

MIOpenDriver might be a more convenient way to reproduce this issue, as shown in #3597
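As a rough sketch, the convolution from the PyTorch log earlier in this thread could be turned into a MIOpenDriver command line like this. The flag letters are the ones recalled from MIOpenDriver's conv options and should be checked against `MIOpenDriver conv --help`; the function name is hypothetical, and this is not necessarily the exact command used in #3597.

```python
import shlex


def miopen_driver_conv_args(n, c, h, w, k, fy, fx,
                            pad=(0, 0), stride=(1, 1), dilation=(1, 1)):
    """Build a MIOpenDriver argument list for a forward 2D convolution."""
    flags = [
        ("-n", n), ("-c", c), ("-H", h), ("-W", w),   # input N, C, H, W
        ("-k", k), ("-y", fy), ("-x", fx),            # output channels, filter H, W
        ("-p", pad[0]), ("-q", pad[1]),               # padding H, W
        ("-u", stride[0]), ("-v", stride[1]),         # stride H, W
        ("-l", dilation[0]), ("-j", dilation[1]),     # dilation H, W
        ("-F", 1),                                    # forward pass only
    ]
    args = ["MIOpenDriver", "conv"]
    for flag, value in flags:
        args += [flag, str(value)]
    return args


# Shapes from the log: input {32, 4, 32, 32}, weights {32, 4, 3, 3}.
command = miopen_driver_conv_args(n=32, c=4, h=32, w=32, k=32, fy=3, fx=3)
print(shlex.join(command))
```

The printed line can be pasted into a shell, which avoids needing PyTorch to trigger the error.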

Mar 11 '25 11:03 IMbackK

I can confirm this issue stems from frugally-deep 0.16+; building MIOpen against frugally-deep 0.15.20 avoids it.
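A minimal sketch for checking whether an installed frugally-deep version falls in the affected range, assuming 0.16.0 is the first affected release as reported in this thread. The function names are hypothetical, and the version string would come from the package manager (for example the output of `pacman -Q frugally-deep`).

```python
import re


def parse_version(text):
    """Extract the leading dotted version, e.g. '0.16.2-1' -> (0, 16, 2).

    A pacman epoch prefix such as '1:' is not handled.
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_affected(fdeep_version, first_bad="0.16.0"):
    """Return True if fdeep_version is at or above the first affected release."""
    have, bad = parse_version(fdeep_version), parse_version(first_bad)
    width = max(len(have), len(bad))
    # Pad with zeros so that '0.16' compares equal to '0.16.0'.
    have += (0,) * (width - len(have))
    bad += (0,) * (width - len(bad))
    return have >= bad
```

For example, `is_affected("0.16.2-1")` is true while `is_affected("0.15.20")` is false.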

Mar 11 '25 16:03 IMbackK

I can confirm that compiling with

    -D MIOPEN_ENABLE_AI_KERNEL_TUNING=Off
    -D MIOPEN_ENABLE_AI_IMMED_MODE_FALLBACK=Off

and local/frugally-deep 0.16.2-1 allows use of conv2d from pytorch.

I think the issue here is Arch updating past what MIOpen supports. If anybody on Arch is reading this and wants a quick and dirty fix:

1 - clone https://gitlab.archlinux.org/archlinux/packaging/packages/miopen-hip
2 - open PKGBUILD and find

  local cmake_args=(
     -Wno-dev
    -G Ninja
    -B build
    -S "$pkgname"
    -D CMAKE_CXX_FLAGS="${CXXFLAGS} -fcf-protection=none -DNDEBUG"
    -D CMAKE_INSTALL_PREFIX=/opt/rocm
    -D CMAKE_BUILD_TYPE=None
    -D MIOPEN_BACKEND=HIP
    -D HALF_INCLUDE_DIR="$srcdir/deps/usr/include"
    -D CMAKE_PREFIX_PATH="$srcdir/deps/usr/lib/cmake"
  )
  cmake "${cmake_args[@]}"
  cmake --build build

change it to

  local cmake_args=(
     -Wno-dev
    -G Ninja
    -B build
    -S "$pkgname"
    -D CMAKE_CXX_FLAGS="${CXXFLAGS} -fcf-protection=none -DNDEBUG"
    -D CMAKE_INSTALL_PREFIX=/opt/rocm
    -D CMAKE_BUILD_TYPE=None
    -D MIOPEN_BACKEND=HIP
    -D HALF_INCLUDE_DIR="$srcdir/deps/usr/include"
    -D CMAKE_PREFIX_PATH="$srcdir/deps/usr/lib/cmake"
    -D MIOPEN_ENABLE_AI_KERNEL_TUNING=Off
    -D MIOPEN_ENABLE_AI_IMMED_MODE_FALLBACK=Off
  )
  cmake "${cmake_args[@]}"
  cmake --build build -j 4

3 - run "makepkg" in the directory with PKGBUILD (might have to pacman -S some build deps)
4 - go make some coffee while it compiles
5 - pacman -U miopen-hip-6.3.2-1-x86_64.pkg.tar.zst

Mar 13 '25 02:03 ghost

I'll second that @sakura-nyaa's patch does stop it from breaking, but it does so by simply not compiling the part of MIOpen that uses frugally-deep. I don't need that part, but others may need to be more wary.

I was following along getting this branch/fork of ctranslate2 set up, and was seeing the same error.

Mar 13 '25 05:03 lcarsos

I've had success using this patched version of the model. This is not a permanent solution: it doesn't include other architectures (the other ones needing a patch are CDNA2/3, not desktop cards), but it works on my RX 7900 GRE system.

Mar 27 '25 04:03 MCJack123

When following the Introduction to Keras for engineers (full source code) I still get this error with ROCm 6.4.0 on Arch Linux:

Epoch 1/20
MIOpen Error: tensor_shape_variable needs to be an array
MIOpen Error: tensor_shape_variable needs to be an array
Traceback (most recent call last):
  File "/home/sepp/src/ml-keras-test/./ml-test.py", line 134, in <module>
    model.fit(
    ~~~~~~~~~^
        x_train,
        ^^^^^^^^
    ...<4 lines>...
        callbacks=callbacks,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib/python3.13/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
RuntimeError: Exception encountered when calling Conv2D.call().

miopenStatusUnknownError

Arguments received by Conv2D.call():
  • inputs=torch.Tensor(shape=torch.Size([128, 28, 28, 1]), dtype=float32)

@adityas-amd is there a specific reason why this was closed?

May 15 '25 07:05 sebastian-de

@stellaraccident @adityas-amd Pinning the dependency to a specific version in your builds should not be considered a fix for a compatibility issue; it also does absolutely nothing for distro packages.

May 15 '25 07:05 IMbackK

> @stellaraccident @adityas-amd Pinning the dependency to a specific version in your builds should not be considered a fix for a compatibility issue; it also does absolutely nothing for distro packages.

I don't disagree. From before my time, but I don't see how we can ship software around this dep at all. Some people thought that using a third party library to interpret unstable tensorflow binaries was a thing to do. I think we need to either remove/replace the features that rely on this or completely vendor the library. In practice, there is no way to support what this is trying to do.

May 15 '25 07:05 stellaraccident

I fully agree that the way this selection logic is implemented is bad; however, it's quite useful. I think the right course of action is indeed to vendor the library to solve the immediate problem and to later replace the whole thing with something based on onnx or so.

May 15 '25 08:05 IMbackK

> I fully agree that the way this selection logic is implemented is bad; however, it's quite useful. I think the right course of action is indeed to vendor the library to solve the immediate problem and to later replace the whole thing with something based on onnx or so.

Probably full vendoring if the feature is to be kept as is. Since this library is part of rocm core, we have to be careful about what it depends on. Particularly full runtimes like onnx typically depend on miopen for kernels.

May 15 '25 14:05 stellaraccident

I also experience this when trying to use whisper with pytorch-rocm

╭─oleg at oleg-pc in /mnt/HDD/Oleg/Documents/Whisper on main✘✘✘ 25-05-17 - 13:34:49
╰─⠠⠵ python whisper-test.py        
MIOpen(HIP): miopenStatus_t miopenCreateTensorDescriptor(miopenTensorDescriptor_t *){
MIOpen(HIP):    tensorDesc = 0x5f06e43cc0e0
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetTensorDescriptor(miopenTensorDescriptor_t, miopenDataType_t, int, const int *, const int *){
MIOpen(HIP):    tensorDesc = {}, {}, packed, 
MIOpen(HIP):    dataType = 0
MIOpen(HIP):    nbDims = 4
MIOpen(HIP):    dim.values = { 1 128 1 3000 }
MIOpen(HIP):    stride.values = { 384000 3000 3000 1 }
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenCreateTensorDescriptor(miopenTensorDescriptor_t *){
MIOpen(HIP):    tensorDesc = 0x7fff5ed095f8
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetTensorDescriptor(miopenTensorDescriptor_t, miopenDataType_t, int, const int *, const int *){
MIOpen(HIP):    tensorDesc = {}, {}, packed, 
MIOpen(HIP):    dataType = 0
MIOpen(HIP):    nbDims = 4
MIOpen(HIP):    dim.values = { 1280 128 1 3 }
MIOpen(HIP):    stride.values = { 384 3 3 1 }
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenCreateTensorDescriptor(miopenTensorDescriptor_t *){
MIOpen(HIP):    tensorDesc = 0x7fff5ed09608
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetTensorDescriptor(miopenTensorDescriptor_t, miopenDataType_t, int, const int *, const int *){
MIOpen(HIP):    tensorDesc = {}, {}, packed, 
MIOpen(HIP):    dataType = 0
MIOpen(HIP):    nbDims = 4
MIOpen(HIP):    dim.values = { 1 1280 1 3000 }
MIOpen(HIP):    stride.values = { 3840000 3000 3000 1 }
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenCreateConvolutionDescriptor(miopenConvolutionDescriptor_t *){
MIOpen(HIP):    convDesc = 0x7fff5ed09708
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenInitConvolutionNdDescriptor(miopenConvolutionDescriptor_t, int, const int *, const int *, const int *, miopenConvolutionMode_t){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 0}, {1, 1}, {1, 1}, 
MIOpen(HIP):    spatialDim = 2
MIOpen(HIP):    pads = { 0 1 }
MIOpen(HIP):    strides = { 1 1 }
MIOpen(HIP):    dilations = { 1 1 }
MIOpen(HIP):    c_mode = 0
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetConvolutionGroupCount(miopenConvolutionDescriptor_t, int){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 1}, {1, 1}, {1, 1}, 
MIOpen(HIP):    groupCount = 1
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenSetConvolutionAttribute(miopenConvolutionDescriptor_t, const miopenConvolutionAttrib_t, const int){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 1}, {1, 1}, {1, 1}, 
MIOpen(HIP):    attr = 1
MIOpen(HIP):    value = 0
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenConvolutionForwardGetWorkSpaceSize(miopenHandle_t, const miopenTensorDescriptor_t, const miopenTensorDescriptor_t, const miopenConvolutionDescriptor_t, const miopenTensorDescriptor_t, size_t *){
MIOpen(HIP):    handle = stream: 0, device_id: 0
MIOpen(HIP):    wDesc = {1280, 128, 1, 3}, {384, 3, 3, 1}, packed, 
MIOpen(HIP):    xDesc = {1, 128, 1, 3000}, {384000, 3000, 3000, 1}, packed, 
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 1}, {1, 1}, {1, 1}, 
MIOpen(HIP):    yDesc = {1, 1280, 1, 3000}, {3840000, 3000, 3000, 1}, packed, 
MIOpen(HIP): }
MIOpen Error: tensor_shape_variable needs to be an array
MIOpen(HIP): miopenStatus_t miopenFindConvolutionForwardAlgorithm(miopenHandle_t, const miopenTensorDescriptor_t, const void *, const miopenTensorDescriptor_t, const void *, const miopenConvolutionDescriptor_t, const miopenTensorDescriptor_t, void *, const int, int *, miopenConvAlgoPerf_t *, void *, size_t, bool){
MIOpen(HIP):    handle = stream: 0, device_id: 0
MIOpen(HIP):    xDesc = {1, 128, 1, 3000}, {384000, 3000, 3000, 1}, packed, 
MIOpen(HIP):    x = 0x7ec447f10400
MIOpen(HIP):    wDesc = {1280, 128, 1, 3}, {384, 3, 3, 1}, packed, 
MIOpen(HIP):    w = 0x7ec4c0600000
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 1}, {1, 1}, {1, 1}, 
MIOpen(HIP):    yDesc = {1, 1280, 1, 3000}, {3840000, 3000, 3000, 1}, packed, 
MIOpen(HIP):    y = 0x7ec44fa00000
MIOpen(HIP):    requestAlgoCount = 1
MIOpen(HIP):    returnedAlgoCount = 32767
MIOpen(HIP):    perfResults = 
MIOpen(HIP):    workSpace = nullptr
MIOpen(HIP):    workSpaceSize = 0
MIOpen(HIP):    exhaustiveSearch = 0
MIOpen(HIP): }
MIOpen Error: tensor_shape_variable needs to be an array
MIOpen(HIP): miopenStatus_t miopenDestroyConvolutionDescriptor(miopenConvolutionDescriptor_t){
MIOpen(HIP):    convDesc = conv2d, miopenConvolution, miopenPaddingDefault, {0, 1}, {1, 1}, {1, 1}, 
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenDestroyTensorDescriptor(miopenTensorDescriptor_t){
MIOpen(HIP):    tensorDesc = {1280, 128, 1, 3}, {384, 3, 3, 1}, packed, 
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenDestroyTensorDescriptor(miopenTensorDescriptor_t){
MIOpen(HIP):    tensorDesc = {1, 1280, 1, 3000}, {3840000, 3000, 3000, 1}, packed, 
MIOpen(HIP): }
MIOpen(HIP): miopenStatus_t miopenDestroyTensorDescriptor(miopenTensorDescriptor_t){
MIOpen(HIP):    tensorDesc = {1, 128, 1, 3000}, {384000, 3000, 3000, 1}, packed, 
MIOpen(HIP): }
Traceback (most recent call last):
  File "/mnt/HDD/Oleg/Documents/Whisper/whisper-test.py", line 5, in <module>
    result = model.transcribe("test.ogg",word_timestamps=True)
  File "/usr/lib/python3.13/site-packages/whisper/transcribe.py", line 146, in transcribe
    _, probs = model.detect_language(mel_segment)
               ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/lib/python3.13/site-packages/whisper/decoding.py", line 52, in detect_language
    mel = model.encoder(mel)
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/lib/python3.13/site-packages/whisper/model.py", line 193, in forward
    x = F.gelu(self.conv1(x))
               ~~~~~~~~~~^^^
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/lib/python3.13/site-packages/torch/nn/modules/conv.py", line 375, in forward

May 17 '25 10:05 RevengeRip

Hi, Arch Linux package maintainer here! I've rebuilt miopen with pinned frugally-deep dependency. Please test miopen-hip-6.4.1-2 and tell me if it fixes your issues.

Jun 27 '25 13:06 tpkessler

> Hi, Arch Linux package maintainer here! I've rebuilt miopen with pinned frugally-deep dependency. Please test miopen-hip-6.4.1-2 and tell me if it fixes your issues.

Well, I don't get this particular error anymore, although whisper still doesn't work

Memory access fault by GPU node-1 (Agent handle: 0x555a3c54ef20) on address 0x7f702e223000. Reason: Page not present or supervisor privilege.
GPU core dump created: gpucore.23973
[1]    23973 IOT instruction (core dumped)  python whisper-test.py

Jun 27 '25 15:06 RevengeRip

> Hi, Arch Linux package maintainer here! I've rebuilt miopen with pinned frugally-deep dependency. Please test miopen-hip-6.4.1-2 and tell me if it fixes your issues.

For me it's fixed - I can now use Keras without problems. Thanks for publishing the rebuilt package so fast!

Jun 27 '25 15:06 sebastian-de

Can confirm it's at least ~fixed~ a workaround for the conv2d problem.

Jul 02 '25 01:07 yuheho7749

Please do not call this fixed if you just installed the new Arch Linux package. That package now includes a workaround, but the underlying issue is not fixed.

Jul 02 '25 06:07 IMbackK

This issue has been migrated to: https://github.com/ROCm/rocm-libraries/issues/889

Jul 28 '25 20:07 assistant-librarian[bot]

Imported to ROCm/rocm-libraries

Jul 28 '25 20:07 ammallya
