How can I do physical mini-batch training with Gluon?
Hi Mu, I read the tutorials, examples, and source code of Gluon. It's very interesting and easy to use. My question is how to do mini-batching with dynamic computation graphs such as TreeLSTM. I don't think the MXNet tree_lstm example really does mini-batching physically: it calls forward() and backward() for every instance separately and only applies the accumulated gradients to the weights after batch_size instances, so in the core engine the actual CPU/GPU computation is not parallelized across the mini-batch. Do I understand that right? Does MXNet have a plan for batching dynamic computation graphs, like TensorFlow Fold?
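For reference, this is roughly the pattern I mean (a minimal sketch, not the actual tree_lstm example code; the Dense block stands in for a per-example model, and grad_req='add' is assumed for gradient accumulation):

```python
import mxnet as mx
from mxnet import autograd, gluon

# Placeholder per-example model; imagine a TreeLSTM-like Block that can only
# process one example per call.
model = gluon.nn.Dense(2)
model.initialize()
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()

# grad_req='add' makes backward() accumulate gradients across calls
# instead of overwriting them.
for p in model.collect_params().values():
    p.grad_req = 'add'

trainer = gluon.Trainer(model.collect_params(), 'adam', {'learning_rate': 1e-3})

batch_size = 16
batch = [(mx.nd.random.uniform(shape=(1, 8)), mx.nd.array([1]))
         for _ in range(batch_size)]

# Per-example forward/backward: the engine executes batch_size tiny graphs,
# so the CPU/GPU never sees one large batched computation.
for x, y in batch:
    with autograd.record():
        loss = loss_fn(model(x), y)
    loss.backward()

# One parameter update for the whole batch, then reset the accumulated grads.
trainer.step(batch_size)
for p in model.collect_params().values():
    p.zero_grad()
```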
Dynamic batching is not supported right now. That is, you cannot write a forward function that processes a single example at a time and then let the system automatically batch multiple examples to improve performance.
It is highly doable with the MXNet engine, but we don't have a concrete plan to support this feature.
The reason is that we are not yet convinced by the research results.
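To make the distinction concrete, this is the kind of per-example forward a system like TensorFlow Fold batches automatically (a hypothetical sketch with made-up Tree and TreeComposer classes, not an MXNet API): the computation is built by recursing over each tree, so every example yields a different graph and the examples cannot simply be stacked along a batch dimension.

```python
import mxnet as mx
from mxnet import gluon

class Tree:
    def __init__(self, value, children=()):
        self.value = value          # e.g. a token index at the leaves
        self.children = children

class TreeComposer(gluon.nn.Block):
    def __init__(self, hidden=8):
        super().__init__()
        self.embed = gluon.nn.Embedding(100, hidden)
        self.compose = gluon.nn.Dense(hidden, activation='tanh')

    def forward(self, tree):
        # The recursion follows the structure of this particular tree,
        # so each example builds its own computation graph.
        if not tree.children:
            return self.embed(mx.nd.array([tree.value]))
        child_states = [self.forward(c) for c in tree.children]
        # Child-sum style composition, just to keep the sketch arity-free.
        return self.compose(mx.nd.add_n(*child_states))

model = TreeComposer()
model.initialize()
out = model(Tree(1, (Tree(2), Tree(3))))   # one tree at a time
```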