Xin Yao
I agree with you. @SSARCandy This is my implementation, any advice?

```python
def coral_loss(source, target):
    d = source.size(1)
    ns, nt = source.size(0), target.size(0)
    # source covariance
    tmp_s = torch.ones((1, ns))...
```
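The snippet above gets cut off; for completeness, here is a sketch of how the rest of such an implementation could look, following the Deep CORAL definition `||C_S - C_T||_F^2 / (4 * d^2)` (a reconstruction under that assumption, not necessarily the exact code from the comment; CPU tensors assumed, the device question comes up below):

```python
import torch

def coral_loss(source, target):
    # source, target: 2-D feature matrices of shape (ns, d) and (nt, d)
    d = source.size(1)
    ns, nt = source.size(0), target.size(0)

    # source covariance: C_S = (D_S^T D_S - (1^T D_S)^T (1^T D_S) / ns) / (ns - 1)
    tmp_s = torch.ones((1, ns)) @ source
    cs = (source.t() @ source - (tmp_s.t() @ tmp_s) / ns) / (ns - 1)

    # target covariance, same formula
    tmp_t = torch.ones((1, nt)) @ target
    ct = (target.t() @ target - (tmp_t.t() @ tmp_t) / nt) / (nt - 1)

    # squared Frobenius norm of the covariance difference, scaled by 4 * d^2
    loss = (cs - ct).pow(2).sum() / (4 * d * d)
    return loss
```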
@redhat12345 My code is based on `PyTorch>=0.4`, in which `torch.tensor` and `Variable` are merged together.
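For instance (a minimal illustration of that merge, not code from this thread):

```python
import torch

# Since PyTorch 0.4, plain tensors track gradients directly;
# wrapping them in Variable is no longer needed.
x = torch.randn(3, 4, requires_grad=True)
y = (x * 2).sum()
y.backward()
print(x.grad.shape)  # torch.Size([3, 4])
```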
@redhat12345 if the `source` and `target` are CUDA tensors, then `torch.ones((1, ns))` should be `torch.ones((1, ns)).cuda()`, and likewise for the `nt` one. I have tried this loss and found...
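A small sketch of an alternative to the manual `.cuda()` calls (assuming PyTorch >= 0.4; the function name is just illustrative): create the auxiliary ones tensors on the same device and dtype as the inputs, so the loss runs on both CPU and GPU without edits:

```python
import torch

def coral_loss_device_safe(source, target):
    # Same loss as above, but the helper tensors follow the device/dtype of
    # the inputs, so no explicit .cuda() calls are needed.
    d = source.size(1)
    ns, nt = source.size(0), target.size(0)
    tmp_s = torch.ones((1, ns), device=source.device, dtype=source.dtype) @ source
    cs = (source.t() @ source - (tmp_s.t() @ tmp_s) / ns) / (ns - 1)
    tmp_t = torch.ones((1, nt), device=target.device, dtype=target.dtype) @ target
    ct = (target.t() @ target - (tmp_t.t() @ tmp_t) / nt) / (nt - 1)
    return (cs - ct).pow(2).sum() / (4 * d * d)
```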
> @yaox12 I saw you've closed PR4384. Is this still an issue or you will find other resolutions? Either Quan's suggestion or specifying `-DCUDA_ARCH_NAME=All` should work.
Cannot reproduce. Can you share more env information, e.g., OS, RAM, etc.?
Not yet. cc @chang-l @TristonC
How many GPUs did you use? Have you changed any args such as `--graph-device` or `--data-device`?
Can you try adding `--shm-size=64g` (large enough to store the whole graph) to your `docker run` command?
This line of code in the dataloader creates a shared-memory array for shuffling. https://github.com/dmlc/dgl/blob/5ba5106acab6a642e9b790e5331ee519112a5623/python/dgl/dataloading/dataloader.py#L146-L149 When `len(train_seeds)` > 8M, the shared tensor exceeds Docker's default shm size...
According to the comments, the shared tensor is used for `persistent_workers=True` (or `num_workers > 0`, I think?). We can change the code to use shared tensors only when these conditions...
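As a rough sketch of that idea (using plain PyTorch's `Tensor.share_memory_()` as a stand-in for the dataloader's internal helper, so the function name and signature here are hypothetical):

```python
import torch

def maybe_share_indices(indices: torch.Tensor, num_workers: int = 0,
                        persistent_workers: bool = False) -> torch.Tensor:
    # Hypothetical sketch: only move the shuffling indices into shared memory
    # (which lives in /dev/shm and hits Docker's shm-size limit) when worker
    # processes actually need to see them.
    if num_workers > 0 or persistent_workers:
        return indices.share_memory_()
    return indices
```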