
[Issue]: ROCm 6.4.0 `hipMemRelease` not releasing physical memory

Open kliuae opened this issue 8 months ago • 3 comments

Problem Description

In ROCm 6.4.0, calling hipMemRelease does not appear to release the physical memory allocated on the GPU. Both hipMemGetInfo and rocm-smi report that the memory is still held by the process after the release call.

Using the base Docker image rocm/dev-ubuntu-22.04:6.4-complete, the following test script adapted from the official HIP virtual memory example demonstrates the issue:

#include <hip/hip_runtime.h>
#include <iostream>

#define ROUND_UP(SIZE, GRANULARITY) ((1 + (SIZE) / (GRANULARITY)) * (GRANULARITY))

#define HIP_CHECK(expression)              \
{                                          \
    const hipError_t err = expression;     \
    if(err != hipSuccess){                 \
        std::cerr << "HIP error: "         \
            << hipGetErrorString(err)      \
            << " at " << __LINE__ << "\n"; \
    }                                      \
}

void stall(unsigned long long target) {
    volatile unsigned long long count = 0;
    for (unsigned long long i = 0; i < target; ++i) {
        count += i;
        count %= 100000;
    }
}

void print_gpu_memory_usage() {
    size_t free_mem = 0, total_mem = 0;
    HIP_CHECK(hipMemGetInfo(&free_mem, &total_mem));
    std::cout << "GPU memory usage -- Total: " 
              << total_mem / (1024*1024) << "MB, Used: "
              << (total_mem - free_mem) / (1024*1024) << "MB, Free: " 
              << free_mem / (1024*1024) << "MB" << std::endl;
}

int main() {

    int currentDev = 0;

    // Step 1: Check virtual memory management support on device 0
    int vmm = 0;
    HIP_CHECK(
        hipDeviceGetAttribute(
            &vmm, hipDeviceAttributeVirtualMemoryManagementSupported, currentDev
        )
    );

    std::cout << "Virtual memory management support value: " << vmm << std::endl;

    if (vmm == 0) {
        std::cout << "GPU 0 doesn't support virtual memory management.\n";
        return 0;
    }

    // Size of memory to allocate
    unsigned long long size = 10000000000; // 10 GB

    // Step 2: Allocate physical memory
    hipMemGenericAllocationHandle_t allocHandle;
    hipMemAllocationProp prop = {};
    prop.type = hipMemAllocationTypePinned;
    prop.location.type = hipMemLocationTypeDevice;
    prop.location.id = currentDev;
    size_t granularity = 0;
    HIP_CHECK(
        hipMemGetAllocationGranularity(
            &granularity,
            &prop,
            hipMemAllocationGranularityMinimum));
    unsigned long long padded_size = ROUND_UP(size, granularity);
    HIP_CHECK(hipMemCreate(&allocHandle, padded_size, &prop, 0));

    // Step 3: Reserve a virtual memory address range
    void* virtualPointer = nullptr;
    HIP_CHECK(hipMemAddressReserve(&virtualPointer, padded_size, granularity, nullptr, 0));

    // Step 4: Map the physical memory to the virtual address range
    HIP_CHECK(hipMemMap(virtualPointer, padded_size, 0, allocHandle, 0));

    // Step 5: Set memory access permission for pointer
    hipMemAccessDesc accessDesc = {};
    accessDesc.location.type = hipMemLocationTypeDevice;
    accessDesc.location.id = currentDev;
    accessDesc.flags = hipMemAccessFlagsProtReadWrite;

    HIP_CHECK(hipMemSetAccess(virtualPointer, padded_size, &accessDesc, 1));

    // Step 6: Perform memory operation
    int value = 42;
    HIP_CHECK(hipMemcpy(virtualPointer, &value, sizeof(int), hipMemcpyHostToDevice));

    int result = 1;
    HIP_CHECK(hipMemcpy(&result, virtualPointer, sizeof(int), hipMemcpyDeviceToHost));

    std::cout << "Allocated" << std::endl;
    print_gpu_memory_usage();

    // Step 7: Cleanup
    std::cout << "Unmapping..." << std::endl;
    HIP_CHECK(hipMemUnmap(virtualPointer, padded_size));
    print_gpu_memory_usage();
    // stall(1000000000ULL); // Uncomment to add buffer to observe memory usage in rocm-smi
    std::cout << "Releasing..." << std::endl;
    HIP_CHECK(hipMemRelease(allocHandle));
    print_gpu_memory_usage();
    // stall(1000000000ULL); // Uncomment to add buffer to observe memory usage in rocm-smi
    std::cout << "Freeing..." << std::endl;
    HIP_CHECK(hipMemAddressFree(virtualPointer, padded_size));
    print_gpu_memory_usage();
    // stall(1000000000ULL); // Uncomment to add buffer to observe memory usage in rocm-smi

    return 0;
}

Output on ROCm 6.4.0 (with issue):

Virtual memory management support value: 1
Allocated
GPU memory usage -- Total: 196592MB, Used: 11208MB, Free: 185384MB
Unmapping...
GPU memory usage -- Total: 196592MB, Used: 11208MB, Free: 185384MB
Releasing...
GPU memory usage -- Total: 196592MB, Used: 11208MB, Free: 185384MB
Freeing...
GPU memory usage -- Total: 196592MB, Used: 11208MB, Free: 185384MB

This behavior differs from older ROCm versions, which release the memory after the hipMemRelease call.

Expected behavior (observed on ROCm 6.3.1 and 6.3.4):

Virtual memory management support value: 1
Allocated
GPU memory usage -- Total: 196592MB, Used: 10220MB, Free: 186372MB
Unmapping...
GPU memory usage -- Total: 196592MB, Used: 10220MB, Free: 186372MB
Releasing...
GPU memory usage -- Total: 196592MB, Used: 682MB, Free: 195910MB
Freeing...
GPU memory usage -- Total: 196592MB, Used: 682MB, Free: 195910MB

Operating System

Ubuntu 22.04.5 LTS (Jammy Jellyfish)

CPU

AMD EPYC 9654 96-Core Processor

GPU

AMD Instinct MI300X

ROCm Version

ROCm 6.4.0

ROCm Component

No response

Steps to Reproduce

  1. Launch the docker image:
docker run -it \
   --network=host \
   --group-add=video \
   --ipc=host \
   --cap-add=SYS_PTRACE \
   --security-opt seccomp=unconfined \
   --device /dev/kfd \
   --device /dev/dri \
   rocm/dev-ubuntu-22.04:6.4-complete \
   bash
  2. Create a file example.cpp with the sample code above, and compile it with:
hipcc example.cpp -o example
  3. Run with:
./example

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

kliuae avatar Apr 25 '25 09:04 kliuae

Hi @kliuae. An internal ticket has been created to investigate this issue. Thanks!

ppanchad-amd avatar Apr 25 '25 14:04 ppanchad-amd

@kliuae On my side, with rocm 6.4-complete, I can reproduce the error.

 $ ./example 
Virtual memory management support value: 1
Allocated
GPU memory usage -- Total: 196048MB, Used: 73012MB, Free: 123036MB
Unmapping...
GPU memory usage -- Total: 196048MB, Used: 73012MB, Free: 123036MB
Releasing...
GPU memory usage -- Total: 196048MB, Used: 73012MB, Free: 123036MB
Freeing...
GPU memory usage -- Total: 196048MB, Used: 73012MB, Free: 123036MB

YangWang92 avatar Apr 28 '25 06:04 YangWang92

Hi @kliuae. Tested this in the CLR amd-staging branch and it is working properly. Can you please confirm? Thanks!

ppanchad-amd avatar Jun 10 '25 19:06 ppanchad-amd

Hi @ppanchad-amd, the recent ROCm 6.4.2 release fixed this issue; confirmed working in Docker containers. Thanks for your help.

kliuae avatar Jul 22 '25 10:07 kliuae