hipMallocManaged(hipMemAttachHost) yields hipErrorInvalidValue

Open BryantLam opened this issue 3 years ago • 5 comments

What is the functional and performance expectation from Managed Memory support?

ROCm 3.9. The following code uses hipMallocManaged(..., hipMemAttachHost) and fails at runtime with hipErrorInvalidValue.

(The default flag value of hipMemAttachGlobal appears to work.)

#include <cstdio>
#include <cstdlib>
#include <hip/hip_runtime.h>

#define CHECK(err) do { \
  hipError_t e = err; \
  if (e != hipSuccess) { \
    fprintf(stderr, #err " = %s\n", hipGetErrorString(e)); \
    exit(EXIT_FAILURE); \
  } \
} while (0)

int main() {
  void *mem = NULL;
  size_t sz = 1024;

  CHECK(hipMallocManaged(&mem, sz, hipMemAttachHost));
  CHECK(hipFree(mem));

  return 0;
}

Environment

ROCm 3.9.0 HIP version: 3.9.20412-6d111f85

Behavior observed on gfx803 and gfx900 CentOS Linux 7.9

BryantLam, May 06 '21 00:05

The only flag supported for hipMallocManaged right now is hipMemAttachGlobal. @gandryey might have more information on this.

satyanveshd, May 06 '21 13:05

[AMD Public Use]

The hipMallocManaged implementation is currently a dumb implementation, since a proper implementation requires driver support. The driver support for this feature and a more full-fledged hipMallocManaged implementation are currently being tested and should be available in a future ROCm release.

mangupta, May 06 '21 14:05

Currently, managed memory is a sysmem (system memory) allocation, hence the performance will be low. Proper support will be available for some ASICs (gfx803 and gfx900 aren't in the current list) in the upcoming releases, but performance may still require more tuning. Handling frequent GPU page faults isn't free. hipMemAttachHost should have the same behavior as hipMemAttachGlobal in the future, because the implementation plans to have DevAttrConcurrentManagedAccess always true.

gandryey, May 06 '21 14:05

Dumb question: what causes the low performance from using sysmem alloc? I think it's due to lack of zero-copy support?

Thanks for the update. What's the minimum asic version for proper support? (Or a rule of thumb for the preliminary list.)

BryantLam, May 06 '21 16:05

@BryantLam Sorry for the lack of response. Do you still need assistance with this ticket? Thanks!

ppanchad-amd, Apr 30 '24 18:04

@BryantLam Closing issue. Please re-open if you still need assistance with this ticket. Thanks!

ppanchad-amd, May 29 '24 17:05