[Issue]: `hipMalloc()` allocations over 4GB redirected to shared GPU memory

Open 7shi opened this issue 1 year ago • 0 comments

Problem Description

While testing memory allocation with hipMalloc() in a HIP environment, I observed unexpected behavior when allocating large blocks of memory.

Key Observations:

  1. Allocations up to 4GB appear to use GPU VRAM as expected.
  2. Attempts to allocate more than 4GB in a single hipMalloc() call result in the allocation being redirected to shared GPU memory instead of dedicated VRAM.

Implications:

This behavior could significantly impact performance for applications requiring large memory allocations, as shared GPU memory is typically slower than dedicated VRAM.
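
One way to check the second observation from inside a program, rather than from a system monitor, is to compare the free device memory reported by hipMemGetInfo() before and after an allocation. The following is a minimal sketch and not part of the original report: the helper names `likelyInDedicatedVram` and `probeAllocation` and the 64MB slack are hypothetical, and it assumes that hipMemGetInfo() tracks dedicated VRAM on this Windows setup, which has not been verified here.

```cpp
#include <cstddef>

// Hypothetical helper: true when the drop in reported free memory accounts
// for (almost) the whole request, i.e. the allocation appears to have come
// out of the pool that hipMemGetInfo() reports. The slack is an assumption.
bool likelyInDedicatedVram(std::size_t freeBefore, std::size_t freeAfter,
                           std::size_t requested,
                           std::size_t slack = std::size_t{64} << 20) {
    if (freeAfter > freeBefore) return false;  // free memory grew: inconclusive
    std::size_t drop = freeBefore - freeAfter;
    return drop + slack >= requested;
}

// The HIP-dependent part is compiled only when the HIP headers are present.
#if defined(__has_include)
#if __has_include(<hip/hip_runtime.h>)
#include <hip/hip_runtime.h>
#include <iostream>

// Hypothetical probe: allocates `bytes`, compares free memory before and
// after, frees the block, and returns the helper's verdict.
bool probeAllocation(std::size_t bytes) {
    std::size_t freeBefore = 0, freeAfter = 0, total = 0;
    if (hipMemGetInfo(&freeBefore, &total) != hipSuccess) return false;

    void* ptr = nullptr;
    hipError_t status = hipMalloc(&ptr, bytes);
    if (status != hipSuccess) {
        std::cout << "hipMalloc failed: " << hipGetErrorString(status) << std::endl;
        return false;
    }

    bool inVram = hipMemGetInfo(&freeAfter, &total) == hipSuccess &&
                  likelyInDedicatedVram(freeBefore, freeAfter, bytes);
    (void)hipFree(ptr);
    return inVram;
}
#endif
#endif
```

If `probeAllocation` returned true for a 3GB request and false for a 5GB request, that would be consistent with the observations above, but it would not by itself show where the larger allocation was placed.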

Operating System

Windows 11 Home (10.0.22631)

CPU

AMD Ryzen 5 5600X 6-Core Processor

GPU

AMD Radeon RX 7600 XT

ROCm Version

ROCm 6.1.0

ROCm Component

HIP

Steps to Reproduce

I used a C++ program that allocates memory with hipMalloc(), starting from 512MB and increasing in 512MB steps up to 4GB and beyond (the upper limit is set by maxAllocationMB, which is 4096 in the listing below). The program reports whether each allocation succeeds and pauses after each one so that the type of memory used can be observed.

#include <hip/hip_runtime.h>
#include <chrono>   // std::chrono::seconds
#include <cstdint>  // uint64_t
#include <iostream>
#include <vector>
#include <string>
#include <thread>

int main() {
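    hipDevice_t device;
    hipDeviceProp_t props;
    // Query the current device and its properties; stop early if either call fails.
    if (hipGetDevice(&device) != hipSuccess ||
        hipGetDeviceProperties(&props, device) != hipSuccess) {
        std::cout << "Failed to query HIP device" << std::endl;
        return 1;
    }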

    uint64_t totalVRAM = static_cast<uint64_t>(props.totalGlobalMem);
    size_t totalVRAM_MB = static_cast<size_t>(totalVRAM >> 20);
    size_t maxAllocationMB = 4096;

    std::cout << "Total VRAM: " << totalVRAM_MB << "MB" << std::endl;
    std::cout << "Will try to allocate up to: " << maxAllocationMB << "MB" << std::endl;

    for (size_t i = 512; i <= maxAllocationMB; i += 512) {
        void* ptr = nullptr;
        // i is a size in MB; shifting left by 20 converts it to bytes.
        hipError_t status = hipMalloc(&ptr, i << 20);
        if (status != hipSuccess) {
            std::cout << "Allocation failed: " << hipGetErrorString(status) << std::endl;
            break;
        }
        std::cout << "Allocated " << i << "MB" << std::endl;
        if (i < maxAllocationMB) {
            std::this_thread::sleep_for(std::chrono::seconds(5));
        } else {
            // Pause before freeing memory
            std::cout << "Press Enter to free memory and exit..." << std::endl;
            std::string dummy;
            std::getline(std::cin, dummy);
        }
        hipFree(ptr);
    }
}

Result: image
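
The byte counts passed to hipMalloc() in the program above come from `i << 20`. As a sanity check on that arithmetic, the sketch below (not part of the original report; `mbToBytes`, `exceeds32Bit`, and `allocationScheduleMB` are hypothetical names) reproduces the 512MB schedule and shows that 4096MB is exactly 2^32 bytes, one more than the largest 32-bit unsigned value. This only illustrates where the 4GB boundary sits numerically; it makes no claim about why the runtime behaves differently around it.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Convert a size in MB to bytes using 64-bit arithmetic.
constexpr std::uint64_t mbToBytes(std::uint64_t mb) { return mb << 20; }

// True when the byte count no longer fits in a 32-bit unsigned integer.
constexpr bool exceeds32Bit(std::uint64_t bytes) {
    return bytes > std::numeric_limits<std::uint32_t>::max();
}

// Sizes in MB that a loop like the one in the reproduction visits:
// stepMB, 2*stepMB, ... up to and including maxMB.
std::vector<std::uint64_t> allocationScheduleMB(std::uint64_t stepMB,
                                                std::uint64_t maxMB) {
    std::vector<std::uint64_t> sizes;
    if (stepMB == 0) return sizes;  // avoid an endless loop on a zero step
    for (std::uint64_t mb = stepMB; mb <= maxMB; mb += stepMB) {
        sizes.push_back(mb);
    }
    return sizes;
}
```

With a step of 512 and a maximum of 4096, the schedule has eight entries, and only the last one (4096MB) reaches the 2^32-byte mark.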

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

hipconfig --full

HIP version  : 6.1.40252-53f3e11ac

== hipconfig
HIP_PATH     : C:/Program Files/AMD/ROCm/6.1/
ROCM_PATH    : /opt/rocm
HIP_COMPILER : clang
HIP_PLATFORM : amd
HIP_RUNTIME  : rocclr
CPP_CONFIG   :  -D__HIP_PLATFORM_HCC__= -D__HIP_PLATFORM_AMD__= -I"C:/Program Files/AMD/ROCm/6.1//include" -I"C:\Program Files\AMD\ROCm\6.1\lib\clang\19"

== hip-clang
HIP_CLANG_PATH   : C:/Program Files/AMD/ROCm/6.1//bin
clang version 19.0.0git ([email protected]:Compute-Mirrors/llvm-project b3dbdf4f03718d63a3292f784216fddb3e73d521)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\Program Files\AMD\ROCm\6.1\\bin
AOMP-18.0-12 (http://github.com/ROCm-Developer-Tools/aomp):
 Source ID:18.0-12-ce1873ac686bb90ddec72bb99889a4e80e2de382
  LLVM version 19.0.0git
  Optimized build.
  Default target: x86_64-pc-windows-msvc
  Host CPU: znver3

  Registered Targets:
    amdgcn - AMD GCN GPUs
    r600   - AMD GPUs HD2XXX-HD6XXX
    x86    - 32-bit X86: Pentium-Pro and above
    x86-64 - 64-bit X86: EM64T and AMD64
hip-clang-cxxflags :  -O3 
hip-clang-ldflags  : --driver-mode=g++ -O3 -fuse-ld=lld --ld-path="C:\Program Files\AMD\ROCm\6.1\bin/lld-link.exe" -Llib --hip-link 

=== Environment Variables
PATH=C:\Program Files\AMD\ROCm\6.1\bin;C:\Program Files\ninja;C:\VulkanSDK\1.3.296.0\Bin;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files\Git\cmd;C:\Program Files\CMake\bin;C:\Program Files\dotnet\;C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\;C:\Strawberry\c\bin;C:\Strawberry\perl\site\bin;C:\Strawberry\perl\bin;C:\Users\7shi\.cargo\bin;C:\Users\7shi\AppData\Local\Microsoft\WindowsApps;;C:\Users\7shi\AppData\Local\Programs\Ollama;C:\Users\7shi\AppData\Local\Programs\Microsoft VS Code\bin;C:\Users\7shi\.dotnet\tools
HIPCC="C:\Program Files\AMD\ROCm\6.1\bin\hipcc"
HIPCONFIG="C:\Program Files\AMD\ROCm\6.1\bin\hipconfig"
HIP_PATH=C:\Program Files\AMD\ROCm\6.1\
HIP_PATH_61=C:\Program Files\AMD\ROCm\6.1\

== Windows Display Drivers
Hostname     : rx7600xt
Advanced Micro Devices, Inc.  C:\Windows\System32\DriverStore\FileRepository\u0407465.inf_amd64_25af0ae2eb807985\B407987\atidx9loader64.dll,C:\Windows\System32\DriverStore\FileRepository\u0407465.inf_amd64_25af0ae2eb807985\B407987\amdxx64.dll,C:\Windows\System32\DriverStore\FileRepository\u0407465.inf_amd64_25af0ae2eb807985\B407987\amdxx64.dll,C:\Windows\System32\DriverStore\FileRepository\u0407465.inf_amd64_25af0ae2eb807985\B407987\amdxc64.dll  AMD Radeon RX 7600 XT  

7shi, Oct 20 '24 17:10