DearPL

4 comments by DearPL

> It doesn't result in a speedup, as `memcpy` is memory-bound.
>
> ```
> Test                 Size(B)  Avg.Time(us)
> gdr_copy_to_mapping  1        0.2038
> gdr_copy_to_mapping  2        0.1940
> gdr_copy_to_mapping  4        0.1860
> ...
> ```

@drossetti BTW, is there a formula for calculating this? Otherwise it would have to depend on experimental values measured on different HW configurations.

@drossetti That would be a lot of work, and in theory the CPU's operating frequency and workload would also need to be considered. Experiments show that the CPU's operating frequency is a key influence...

The VMM cases are already included in the sanity test cases; you can run them if your CUDA version is > 11.0.
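For anyone unsure whether their toolkit qualifies, here is a minimal sketch of the version check, not the project's actual gating logic. It assumes the version is given as the integer CUDA reports through the `CUDA_VERSION` macro or `cudaRuntimeGetVersion` (1000 * major + 10 * minor, so 11.2 is `11020`). The helper names `cuda_version_tuple` and `vmm_cases_runnable` are hypothetical, and the strict "greater than 11.0" comparison just follows the comment above literally.

```python
from typing import Tuple


def cuda_version_tuple(encoded: int) -> Tuple[int, int]:
    """Decode CUDA's integer version (1000 * major + 10 * minor)."""
    major = encoded // 1000
    minor = (encoded % 1000) // 10
    return major, minor


def vmm_cases_runnable(encoded: int) -> bool:
    # Assumption: mirrors the "> 11.0" condition from the comment literally;
    # the real test suite may use a different threshold.
    return cuda_version_tuple(encoded) > (11, 0)


if __name__ == "__main__":
    # Sample encoded versions, chosen only for illustration.
    for encoded in (10020, 11000, 11020, 12040):
        major, minor = cuda_version_tuple(encoded)
        print(f"CUDA {major}.{minor}: VMM cases runnable = {vmm_cases_runnable(encoded)}")
```

If the intended condition turns out to be "11.0 or newer", changing the comparison to `>=` is the only adjustment needed.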