[arcane, accelerator] Use `malloc` for allocating NVIDIA GPU memory
The goal is to exploit ATS (Address Translation Services) on NVIDIA Grace Hopper or HMM-enabled (Heterogeneous Memory Management) systems.
This PR adds a new ARCANE_CUDA_ALLOC_ATS CMake option, which is disabled by default.
We kept all the prefetch machinery, but most of it should be removed because there is no longer any data "migration".
We do not use cudaMemAdvise_v2 to choose memory placement and leave that to the system. If page migration is not enabled, this can lead to performance penalties because the data location is chosen through a first-touch policy.
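A minimal sketch of the idea, not the code in this PR: on an ATS or HMM system the GPU can dereference ordinary system pointers, so the allocator reduces to `malloc`/`free` and the prefetch hook has nothing left to migrate. The class name `AtsAllocatorSketch`, its three methods, and the helper `firstTouchOnHost` are hypothetical. No CUDA call is made, so the example builds with a plain C++ compiler.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical allocator: the name and interface are illustrative only.
class AtsAllocatorSketch
{
 public:
  // Plain system allocation. On an ATS/HMM system a kernel could
  // use the returned pointer directly (assumption, not shown here).
  void* allocate(std::size_t nb_byte)
  {
    if (nb_byte == 0)
      return nullptr;
    void* p = std::malloc(nb_byte);
    if (!p)
      throw std::bad_alloc();
    return p;
  }

  void deallocate(void* p) { std::free(p); }

  // Nothing to migrate explicitly: placement is left to the system.
  void prefetch(void*, std::size_t) {}
};

// Allocates a buffer, lets the CPU touch every page first, and
// returns how many bytes read back with the expected value.
std::size_t firstTouchOnHost(std::size_t nb_byte)
{
  AtsAllocatorSketch alloc;
  unsigned char* p = static_cast<unsigned char*>(alloc.allocate(nb_byte));
  if (!p)
    return 0;
  std::memset(p, 0x2A, nb_byte); // first touch happens on the CPU
  alloc.prefetch(p, nb_byte);    // no-op in this sketch
  std::size_t nb_ok = 0;
  for (std::size_t i = 0; i < nb_byte; ++i)
    if (p[i] == 0x2A)
      ++nb_ok;
  alloc.deallocate(p);
  return nb_ok;
}
```

With a first-touch policy, the `memset` above would place the pages in CPU memory. That is the situation where a kernel reading the same buffer may pay a penalty if page migration is not enabled.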
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 69.66%. Comparing base (a9a09e3) to head (0fa2411). Report is 5 commits behind head on main.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1576      +/-   ##
==========================================
- Coverage   69.66%   69.66%   -0.01%
==========================================
  Files        2247     2247
  Lines      160512   160512
  Branches    18493    18493
==========================================
- Hits       111825   111817       -8
- Misses      42021    42027       +6
- Partials     6666     6668       +2
Thanks for the PR.
I think it is better to activate this allocator with an environment variable instead of a compilation flag. This will allow us to change the allocator dynamically without recompiling (see class CommonUnifiedMemoryAllocatorWrapper in this file as an example).
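As a rough illustration of this suggestion, the sketch below picks the allocator at startup from an environment variable. It is an assumption-laden sketch, not how CommonUnifiedMemoryAllocatorWrapper works: the `AllocatorKind` enum and both functions are hypothetical, and reusing the name ARCANE_CUDA_ALLOC_ATS for the variable, with `1` meaning "enabled", is only a guess at a possible convention.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical set of allocation strategies.
enum class AllocatorKind
{
  UnifiedMemory, // default behaviour
  SystemMalloc   // malloc-based allocation relying on ATS/HMM
};

// Pure parsing helper, kept separate so it can be tested
// without touching the process environment.
AllocatorKind parseAllocatorKind(const char* value)
{
  if (value && std::string(value) == "1")
    return AllocatorKind::SystemMalloc;
  return AllocatorKind::UnifiedMemory;
}

// Reads the variable once at startup. The variable name is an
// assumption borrowed from the CMake option of this PR.
AllocatorKind selectAllocatorFromEnv()
{
  return parseAllocatorKind(std::getenv("ARCANE_CUDA_ALLOC_ATS"));
}
```

An unset or unrecognized value falls back to the default, so existing runs would keep their current behaviour unless the variable is set explicitly.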