Lu Lu
There is another PR, https://github.com/lululxvi/deepxde/pull/703, for parallelism via Horovod. Does Horovod support Paddle? If so, then we can make parallelism work for all backends via Horovod.
Please resolve the conflict.
@pescap Check this.
> @lululxvi I have no idea why docs building failed https://readthedocs.org/projects/deepxde/builds/19911895/

Fixed.
The implementation here seems different from the docs: https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/06_distributed_training/data_parallel/principle_and_demo_cn.html For example, the docs also modify the optimizer via `fleet.distributed_optimizer(optimizer)`, etc.
> > The implementation here seems different from the docs https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/06_distributed_training/data_parallel/principle_and_demo_cn.html
> >
> > For example, the docs also modify the optimizer `fleet.distributed_optimizer(optimizer)`, etc.
>
> Done

If `fleet.distributed_optimizer`, then why...
Another question: is the random seed different for different processors? If so, that would also make the number of training points differ between GPUs... You can check my...
> Seed in different processors should be same as I do not add rank in seed...

Can you confirm this is true for both python random and numpy random?
Please check this https://github.com/lululxvi/deepxde/pull/1205#issuecomment-1517761124, which shows that different processors have different random seeds. Also try the following code:

```python
import random
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = ...
```
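To make the expected invariant concrete: if every rank seeds its generator with the same value (no rank offset), every rank must draw the identical sequence, so all GPUs sample the same training points. Here is a minimal single-process sketch of that check using only the standard library (the real verification has to run under `mpirun` with mpi4py; the same idea applies to `numpy.random.seed`):

```python
import random


def draws_for_rank(seed, n=5):
    """Simulate one MPI rank: seed a private generator and draw n numbers."""
    rng = random.Random(seed)  # per-"process" generator, seeded identically
    return [rng.random() for _ in range(n)]


# Four simulated ranks, all seeded with the same value:
ranks = [draws_for_rank(seed=42) for _ in range(4)]

# Same seed on every rank => identical sequences on every rank.
assert all(r == ranks[0] for r in ranks)

# A rank-dependent seed (e.g. seed + rank) would break this:
assert draws_for_rank(seed=42) != draws_for_rank(seed=43)
```

If the actual MPI run shows differing sequences across ranks, the seed is being perturbed per process somewhere (e.g. by the launcher or by rank-dependent initialization), which would explain the differing numbers of training points.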
> Maybe this way?
>
> ```python
> for i in range(10):
>     dWdri = dde.grad.jacobian(W, x, i=i, j=0)
>     if i == 0:
>         dWdr = dWdri
>     else:
>         dWdr = torch.concat([dWdr, dWdri], axis=1)
> ```

Yes, just use a for loop.
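For readers without DeepXDE or PyTorch at hand, the loop-and-concatenate pattern above can be sketched in plain NumPy. The `column` helper below is a hypothetical stand-in for `dde.grad.jacobian(W, x, i=i, j=0)`, which returns one `(n, 1)` column of the Jacobian; the loop assembles the columns into an `(n, 10)` matrix exactly as in the quoted snippet:

```python
import numpy as np


def column(i, n=4):
    # Hypothetical stand-in for dde.grad.jacobian(W, x, i=i, j=0):
    # returns one (n, 1) column filled with i for illustration.
    return np.full((n, 1), float(i))


# Same loop-and-concatenate pattern as the quoted snippet:
dWdr = None
for i in range(10):
    dWdri = column(i)
    dWdr = dWdri if dWdr is None else np.concatenate([dWdr, dWdri], axis=1)

assert dWdr.shape == (4, 10)  # 10 columns, one per output component
```

Initializing with `None` and checking it avoids special-casing `i == 0` inside the loop body; either style gives the same result.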