[Issue]: supermassive RAM usage with TensileCreateLibrary
Problem Description
Building hipBLASLt requires Tensile, more precisely TensileCreateLibrary.py.
Even with 64 GB of RAM, zswap enabled, 32 GB of swap, and the build limited to only ONE core, the script gets killed by the OOM killer!
I sincerely think there might be a way not to load all the data at once while TensileCreateLibrary runs.
Either that, or it is leaking memory, since its memory usage climbs steadily until the process is killed!
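To illustrate what "not loading all the data at once" would mean, here is a minimal, generic Python sketch. It is not taken from TensileCreateLibrary.py: the functions load_all, load_lazily and peak_bytes are hypothetical, and the 1 KiB records are only stand-ins for whatever data the script really holds. It compares peak traced memory when every record is materialised up front versus when records are yielded one at a time.

```python
import tracemalloc


def load_all(n):
    # Eager (hypothetical): build every record before any is processed.
    return [bytes(1024) for _ in range(n)]


def load_lazily(n):
    # Lazy (hypothetical): yield one record at a time, so earlier
    # records can be freed as soon as the consumer moves on.
    for _ in range(n):
        yield bytes(1024)


def peak_bytes(make_records, n):
    # Consume all records and report (bytes processed, peak traced memory).
    tracemalloc.start()
    total = 0
    for record in make_records(n):
        total += len(record)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak


eager_total, eager_peak = peak_bytes(load_all, 2000)
lazy_total, lazy_peak = peak_bytes(load_lazily, 2000)
print(f"eager peak: {eager_peak} bytes, lazy peak: {lazy_peak} bytes")
```

Both variants process the same amount of data, but the lazy one keeps only a record or two alive at any time. A real leak would look different: the peak would keep growing even with lazy loading, which is why a tracemalloc comparison like this could help tell the two cases apart.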
Operating System
NixOS 25.05 (experimental branch)
CPU
AMD Ryzen 9 5950X
GPU
AMD Radeon RX 6700 XT + AMD Radeon RX Vega 64
ROCm Version
ROCm 6.3.3
ROCm Component
Tensile
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response
Hi @aviallon. Internal ticket has been created to investigate your issue. Thanks!
I think this should be transferred to the hipBLASLt issue tracker: although hipBLASLt reuses the name Tensile, it ships its own self-contained version.
@aviallon this is a known issue that we are working on. We will follow up over the next few weeks once we have the relevant changes in.
@aviallon we've had some changes that improve memory usage over the last month or so. All changes should be in develop.
This issue has been migrated to: https://github.com/ROCm/rocm-libraries/issues/316
Closing the issue in this repo. Please refer to the migrated issue for updates.