supervised-reptile
Parallel version
Any plans to release a multi-GPU version of this? It looks like we should be able to run the meta_batch_size iterations of the outer loop in reptile.train_step in parallel on separate GPUs.
(I may have a shot at implementing it if there are no plans ATM, and if you think it'd give a nontrivial speedup.)
~ Ben
Reptile is definitely simple to scale across multiple machines, since each machine just has to run a separate inner loop and then average the parameters at the end. One thing about this implementation that makes it a bit tricky to parallelize is that Adam's parameters are updated sequentially for each task in the meta-batch. That is almost certainly unimportant, though, and there are other reasonable ways to do it.
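For concreteness, here's a minimal sketch of what that data-parallel outer step could look like. It uses plain SGD inner loops (sidestepping the sequential Adam state mentioned above) and threads standing in for separate GPUs or machines. The names sample_task, inner_loop, and parallel_train_step are toy placeholders, not functions from this repo, and the quadratic loss is just a stand-in for a real task.

```python
# Minimal sketch of a data-parallel Reptile outer step (not this repo's code).
# Assumptions: plain SGD inner loops instead of Adam, and a toy quadratic task.

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def sample_task(rng):
    """Toy task: fit a random target vector (stand-in for a real task sampler)."""
    target = rng.normal(size=4)
    def grad(params):
        # Gradient of the loss 0.5 * ||params - target||^2.
        return params - target
    return grad

def inner_loop(init_params, task_grad, inner_steps=8, inner_lr=0.1):
    """Run one task's inner loop; in a real setup this would run on its own GPU."""
    params = init_params.copy()
    for _ in range(inner_steps):
        params -= inner_lr * task_grad(params)
    return params

def parallel_train_step(params, meta_batch_size=4, meta_lr=0.5, seed=0):
    """One Reptile outer step: run the inner loops in parallel, then average."""
    rng = np.random.default_rng(seed)
    tasks = [sample_task(rng) for _ in range(meta_batch_size)]
    with ThreadPoolExecutor(max_workers=meta_batch_size) as pool:
        finals = list(pool.map(lambda g: inner_loop(params, g), tasks))
    # Reptile meta-update: move the initialization toward the mean of the
    # per-task adapted parameters.
    mean_final = np.mean(finals, axis=0)
    return params + meta_lr * (mean_final - params)

if __name__ == "__main__":
    params = np.zeros(4)
    for step in range(10):
        params = parallel_train_step(params, seed=step)
    print(params)
```

Since each inner loop only reads the shared initialization and returns its own adapted parameters, the workers need no communication until the final averaging, which is why the multi-machine version stays simple.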