Yanping Huang

9 comments by Yanping Huang

We currently don't have anybody working on this. It would be great if you could help us by working on this and submitting a PR. Let us know if you...

No, it's neither supported nor tested. To enable fp16, set p.fprop_dtype = tf.float16 in train(cls) and task(cls) under lingvo/tasks/lm/params/one_billion_wds.py. You may also need to convert input paddings to tf.float16...

Sorry for the delayed reply. I was away on vacation. Thank you very much for your interest in GPipe. 1. Re-compute is implemented in just one line: https://github.com/tensorflow/lingvo/blob/46324be3ac7faa12663337624326238e65a2e57c/lingvo/core/recurrent.py#L947 2. We...
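To illustrate what re-compute means in general, here is a minimal sketch in plain Python. It is not the Lingvo code at the link above. The toy layer y = tanh(w * x), the function names, and the segment_size parameter are all assumptions made for this example. The forward pass keeps only the input of each segment, and the backward pass recomputes the activations inside a segment just before backpropagating through it.

```python
import math


def layer_forward(w, x):
    """Toy layer used for illustration only: y = tanh(w * x)."""
    return math.tanh(w * x)


def layer_backward(w, x, grad_out):
    """Return (grad wrt x, grad wrt w) for y = tanh(w * x)."""
    y = math.tanh(w * x)
    local = (1.0 - y * y) * grad_out
    return local * w, local * x


def forward_with_checkpoints(weights, x, segment_size):
    """Run the chain of layers, storing only the input of each segment."""
    checkpoints = []
    for i, w in enumerate(weights):
        if i % segment_size == 0:
            checkpoints.append(x)
        x = layer_forward(w, x)
    return x, checkpoints


def backward_with_recompute(weights, checkpoints, segment_size, grad_out):
    """Backpropagate segment by segment, recomputing activations on demand."""
    grads_w = [0.0] * len(weights)
    for seg in reversed(range(len(checkpoints))):
        start = seg * segment_size
        end = min(start + segment_size, len(weights))
        # Recompute the input of every layer in this segment from its checkpoint.
        inputs = [checkpoints[seg]]
        for i in range(start, end - 1):
            inputs.append(layer_forward(weights[i], inputs[-1]))
        # Backpropagate through the segment using the recomputed inputs.
        for i in reversed(range(start, end)):
            grad_out, grads_w[i] = layer_backward(
                weights[i], inputs[i - start], grad_out)
    return grad_out, grads_w
```

With 5 layers and segment_size = 2, only 3 inputs are stored during the forward pass instead of 5, at the price of roughly one extra forward pass during the backward pass.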

Our code has been open sourced. The instructions to run the GShard dense transformer on GCP TPUs are described here: https://github.com/tensorflow/lingvo/tree/master/lingvo/tasks/lm

We will post better instructions for running GPipe in the near future. An example of running GPipe is provided in the comments here: https://github.com/tensorflow/lingvo/blob/master/lingvo/tasks/lm/params/one_billion_wds.py#L180. Once you have modified the OneBWdsGPipeTransformer hparams,...

Is it still an open issue?

We haven't tested it on GPUs yet. It depends on the XLA support for GPUs.

(1) The cost of each layer is estimated by the FPropMeta function. See examples in core/layers.py. Example: https://github.com/tensorflow/lingvo/blob/464c4386a05d108056becb106b2b827df968b615/lingvo/core/layers.py#L2772 (2) That was to create local copies on the GPUs in the...
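As a rough illustration of how per-layer cost estimates could be used, the sketch below splits a sequence of layers into contiguous groups so that the most expensive group is as cheap as possible. It is an assumption-laden example, not the Lingvo partitioner: partition_by_cost is a hypothetical helper, and the integer costs are made-up stand-ins for whatever FPropMeta would report for each layer.

```python
def partition_by_cost(costs, num_partitions):
    """Split layers into at most num_partitions contiguous groups.

    costs: non-negative integer cost per layer (hypothetical values).
    Returns a list of groups, each a list of layer indices, chosen so that
    the largest total cost of any group is minimized.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if not costs:
        return []

    def split(budget):
        # Greedily fill each group until adding a layer would exceed budget.
        groups, current, total = [], [], 0
        for i, cost in enumerate(costs):
            if current and total + cost > budget:
                groups.append(current)
                current, total = [], 0
            current.append(i)
            total += cost
        groups.append(current)
        return groups

    # Binary search for the smallest budget that needs few enough groups.
    lo, hi = max(costs), sum(costs)
    while lo < hi:
        mid = (lo + hi) // 2
        if len(split(mid)) <= num_partitions:
            hi = mid
        else:
            lo = mid + 1
    return split(lo)


# Six layers with made-up costs, split across three devices.
print(partition_by_cost([4, 2, 2, 4, 6, 6], 3))  # [[0, 1, 2], [3, 4], [5]]
```

The groups are kept contiguous because pipeline stages run consecutive layers. The binary search works because the greedy split never needs more groups when the budget grows.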

Please provide more details for your problem description. Maybe a figure?