nv-dlasalle
This appears to be fixed by https://github.com/dmlc/dgl/pull/3119, but: 1. the docs need to be updated, and 2. we should add a unit test to ensure batch information continues to be...
> Hi @nv-dlasalle, was your test result of "Avg epoch time: 6.580887830257415" obtained on a DGX-A100?

This was on a DGX-V100, but this PR has changed quite a bit...
@mufeili This is a proof-of-concept showing a single mini-batch running across multiple GPUs (and the advantages of doing so). A technical report still needs to be created. Before it could be...
@yaox12 Can you add the CUDA versions?
Is this issue present when `num_workers=0`?
@jermainewang HugeCTR doesn't currently expose this via PyTorch. However, I think this only uses a handful of C++ files from it, so alternatively we could include just the needed files in...
@yaox12 Do you know why PyTorch 1.12.1 would cause this? This looks like an issue with a forked CUDA context, not related to the TensorAdapter (which has issues when we...
I can reproduce the issue with PyTorch 1.9.0 if I delete the TensorAdapter shared library. I wonder if this is related to #4135?
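For context on the forked-CUDA-context failure mode: CUDA does not support using a context inherited across `fork()`, so any process that has already touched the GPU must start its worker processes with the `spawn` start method (or defer all CUDA initialization until after workers exist). A minimal stdlib-only sketch of that rule; the helper names are illustrative, not a DGL or PyTorch API:

```python
import multiprocessing as mp

def pick_start_method(cuda_initialized: bool) -> str:
    # A forked child inherits the parent's CUDA context, which CUDA
    # does not support and which typically surfaces as opaque errors
    # inside the worker. "spawn" starts a fresh interpreter instead.
    return "spawn" if cuda_initialized else "fork"

def make_worker_context(cuda_initialized: bool):
    # get_context returns a context whose Process/Queue classes use the
    # chosen start method, without changing the global default.
    return mp.get_context(pick_start_method(cuda_initialized))
```

With `num_workers > 0`, PyTorch's DataLoader uses the platform default start method, which is `fork` on Linux; that is why the failure disappears at `num_workers=0` and why keeping CUDA uninitialized until after worker creation sidesteps it.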
`sample_blocks` is the generic method for any sampling that outputs MFGs (e.g., it would also apply to RandomWalks). Could we generalize the `sample_neighbors` interface so that if `fanout` is a list...
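One way to read that proposal: when `fanout` is a list, a single call performs one sampling round per entry and returns one MFG-like block per layer, seeding each outer layer with the neighbors sampled for the inner one. A toy, DGL-free sketch under that assumption (adjacency dicts stand in for a DGLGraph, and both function signatures are illustrative):

```python
import random

def sample_neighbors(adj, seeds, fanout):
    """Sample up to `fanout` neighbors for each seed node.

    `adj` is a toy adjacency dict {node: [neighbors]}.
    """
    frontier = {}
    for v in seeds:
        nbrs = adj.get(v, [])
        k = min(fanout, len(nbrs))
        frontier[v] = random.sample(nbrs, k)
    return frontier

def sample_blocks(adj, seeds, fanouts):
    """If `fanouts` is a list, produce one block per layer, working
    from the output layer inward as multi-layer neighbor sampling does."""
    blocks = []
    for fanout in reversed(fanouts):
        frontier = sample_neighbors(adj, seeds, fanout)
        # Prepend so blocks end up ordered input layer -> output layer.
        blocks.insert(0, frontier)
        # Sampled neighbors become the seeds of the next (outer) layer.
        seeds = sorted({u for nbrs in frontier.values() for u in nbrs})
    return blocks
```

The same loop structure would accommodate other MFG-producing strategies by swapping out the per-layer frontier function.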
How will we differentiate a `sample_blocks` for different sampling types?
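One common answer is class-based dispatch: each sampling strategy is its own sampler class, and `sample_blocks` is the shared entry point the dataloader calls, so the loading side never needs to know which strategy it is driving. A hedged sketch of that shape (class and method names are illustrative, not a committed API):

```python
class BlockSampler:
    """Shared interface: the dataloader only ever calls sample_blocks."""

    def sample_blocks(self, g, seed_nodes):
        raise NotImplementedError

class NeighborSampler(BlockSampler):
    """Multi-layer neighbor sampling: one block per fanout entry."""

    def __init__(self, fanouts):
        self.fanouts = fanouts

    def sample_blocks(self, g, seed_nodes):
        # Placeholder body: a real implementation would sample a
        # frontier per layer; here we only record strategy + parameter.
        return [("neighbor", fanout) for fanout in self.fanouts]

class RandomWalkSampler(BlockSampler):
    """Random-walk-based sampling: blocks derived from walks."""

    def __init__(self, walk_length):
        self.walk_length = walk_length

    def sample_blocks(self, g, seed_nodes):
        return [("random_walk", self.walk_length)]

def collate(sampler, g, seed_nodes):
    # The dataloader side stays strategy-agnostic.
    return sampler.sample_blocks(g, seed_nodes)
```

Differentiation then lives in which sampler object the user constructs, not in the method name.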