
PaRSEC memory advice preferred device and parsec_get_best_device failure

Open therault opened this issue 1 year ago • 0 comments

Describe the bug

If we call parsec_advise_data_on_device(dta, did, PARSEC_DEV_DATA_ADVICE_PREFERRED_DEVICE) on a parsec_data_t *dta that belongs to a data collection (as returned by rank_of), and that data is read directly (direct memory access) by a task (A <- A(m, n) in JDF), then parsec_get_best_device() ignores the preferred device for A, because this_task->data[i].source_repo_entry is NULL for this data.

We used if( NULL == this_task->data[i].source_repo_entry ) continue; in parsec_get_best_device() to detect that a data input comes from NEW, but that does not work in all cases: the test also succeeds when the data comes directly from the data collection rather than from another task.
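The ambiguity can be reproduced in a small standalone model. The sketch below does not use PaRSEC itself: toy_data_t, toy_flow_t and both toy_best_device_* functions are hypothetical stand-ins, and the idea that a NEW flow has no input data while a direct data-collection read does is an assumption made for illustration, not a statement about the actual runtime.

```c
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the real PaRSEC definitions. */
typedef struct {
    int preferred_device;          /* -1 means no advice was given */
} toy_data_t;

typedef struct {
    void       *source_repo_entry; /* non-NULL only when a predecessor task produced the data */
    toy_data_t *data_in;           /* assumption: NULL for NEW, set for a data-collection read */
} toy_flow_t;

/* Mirrors the check quoted above: every flow without a repo entry is skipped,
 * so a direct read from the data collection is treated like NEW. */
static int toy_best_device_repo_check(const toy_flow_t *flows, int nb_flows)
{
    for (int i = 0; i < nb_flows; i++) {
        if (NULL == flows[i].source_repo_entry) continue;
        if (NULL != flows[i].data_in && flows[i].data_in->preferred_device >= 0)
            return flows[i].data_in->preferred_device;
    }
    return -1; /* no preferred device found */
}

/* Hypothetical alternative: skip only flows that have no input data at all,
 * so advice attached to a directly read tile is still honoured. */
static int toy_best_device_data_check(const toy_flow_t *flows, int nb_flows)
{
    for (int i = 0; i < nb_flows; i++) {
        if (NULL == flows[i].data_in) continue;
        if (flows[i].data_in->preferred_device >= 0)
            return flows[i].data_in->preferred_device;
    }
    return -1; /* no preferred device found */
}
```

For a flow with source_repo_entry == NULL and data_in pointing at a tile advised to device 1, the first function finds no preferred device and the second one returns device 1. Whether data_in is a reliable discriminator in parsec_get_best_device() would need to be checked against the runtime.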

A workaround would be to add a READ_A() task, but that is not satisfying either.

To Reproduce

Try https://github.com/ICLDisco/dplasma/pull/69 and run testing_dpotrf with one or two GPUs. Between two runs, the number of kernels executed on the GPUs changes, because the advice set in the warmup is ignored.

Expected behavior

parsec_get_best_device() should be able to differentiate between a NEW and a flow, and it should not ignore the preferred device for the flows.

therault, Mar 17 '23 21:03