
Mesh TensorFlow: Model Parallelism Made Easier

Results 99 mesh issues

Thank you for your great work. I'm curious about MOE-Transformer's static graph construction. > Q: When there are 1024 experts and the switch gating method is used, you need to build...

Hi, I want to use the 'L-BFGS' optimizer available in tf.contrib.opt.ScipyOptimizerInterface (or tfp.optimizer.lbfgs_minimize) with Mesh TensorFlow. Is there a direct way to use it?

Let's say I have the following mesh structure:
flags.DEFINE_string('mesh_shape', 'rows:3, columns:4', 'mesh shape')
flags.DEFINE_integer('image_nx_block', 3, 'The number of x blocks.')
flags.DEFINE_integer('image_ny_block', 4, 'The number of y blocks.')
flags.DEFINE_string('layout', 'image_nx_block:rows, image_ny_block:columns', 'layout...
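To make the mesh shape and layout strings in the issue above concrete, here is a minimal pure-Python sketch of what they imply. It does not call Mesh TensorFlow; `parse_pairs` and `per_device_size` are hypothetical helper names introduced only for illustration, and the assumption is that each tensor dimension named in the layout is split evenly across the mesh dimension it maps to.

```python
def parse_pairs(spec):
    """Parse a string such as 'a:b, c:d' into a list of (name, value) pairs."""
    pairs = []
    for item in spec.split(","):
        name, value = item.strip().split(":")
        pairs.append((name.strip(), value.strip()))
    return pairs


# Values copied from the flags quoted in the issue.
mesh_shape = {name: int(size) for name, size in parse_pairs("rows:3, columns:4")}
layout = dict(parse_pairs("image_nx_block:rows, image_ny_block:columns"))
tensor_dims = {"image_nx_block": 3, "image_ny_block": 4}

# Total number of processors in the mesh is the product of its dimension sizes.
num_devices = 1
for size in mesh_shape.values():
    num_devices *= size


def per_device_size(dim_name):
    """Size of one tensor dimension on each processor (hypothetical helper)."""
    mesh_dim = layout.get(dim_name)
    if mesh_dim is None:
        # Dimension is not in the layout, so it is assumed to be replicated.
        return tensor_dims[dim_name]
    size, mesh_size = tensor_dims[dim_name], mesh_shape[mesh_dim]
    if size % mesh_size != 0:
        raise ValueError(f"{dim_name}={size} is not divisible by {mesh_dim}={mesh_size}")
    return size // mesh_size


print(num_devices)
print({name: per_device_size(name) for name in tensor_dims})
```

Under these assumptions the 3 x 4 mesh has 12 processors, and each processor holds exactly one x block and one y block.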

Add original AI2 version of c4 v3.0.1, ND3 deduplicated with param = 0.8, and LM1B, Wiki40B, and lm_first_len512 versions of original AI2 C4 and ND3 deduped AI2 C4 for evaluation.

cla: yes

My objective is to take a generic NN architecture and feed it to Mesh. Since the Mesh API has support for lowering the graph to TensorFlow by using mtf.lowering, I...

Add `cast` preprocessor and add tasks for inference prompts for deduplication project.

cla: yes

We tried to run Mesh-TensorFlow to train T5 on GPUs following the instructions on T5's repository, but the training is extremely slow.
> global_step/sec: 0.0467347
> examples/sec: 0.186939
The training...

The following two comments seem wrong; they need to be switched. https://github.com/tensorflow/mesh/blob/4e07d5e7186626dbc56f5a6d63c5dc259f9eb9d8/mesh_tensorflow/transformer/moe.py#L423 https://github.com/tensorflow/mesh/blob/4e07d5e7186626dbc56f5a6d63c5dc259f9eb9d8/mesh_tensorflow/transformer/moe.py#L434

Allow init_from_checkpoint to accept a list of pairs, so as to enable initialization of multiple variables in the graph from the same variable in the checkpoint.
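The motivation for a list of pairs can be sketched without TensorFlow. The snippet below is an illustration only, not the change itself: `init_from_pairs` is a hypothetical stand-in, plain dicts stand in for the checkpoint and the graph variables, and the variable names are made up. It assumes the alternative is a dict keyed by checkpoint variable name, which can hold only one target per key, whereas a list of pairs can repeat the same checkpoint variable.

```python
def init_from_pairs(checkpoint, graph_vars, pairs):
    """Copy checkpoint values into graph variables (hypothetical stand-in).

    pairs is a list of (checkpoint_name, graph_variable_name) tuples, so the
    same checkpoint_name may appear more than once.
    """
    for ckpt_name, var_name in pairs:
        if ckpt_name not in checkpoint:
            raise KeyError(f"checkpoint has no variable named {ckpt_name!r}")
        if var_name not in graph_vars:
            raise KeyError(f"graph has no variable named {var_name!r}")
        # Copy the value so each graph variable owns its own storage.
        graph_vars[var_name] = list(checkpoint[ckpt_name])
    return graph_vars


# Made-up names: one checkpoint variable initializes two graph variables.
checkpoint = {"shared/embedding": [0.1, 0.2, 0.3]}
graph_vars = {"encoder/embedding": None, "decoder/embedding": None}
pairs = [
    ("shared/embedding", "encoder/embedding"),
    ("shared/embedding", "decoder/embedding"),
]

init_from_pairs(checkpoint, graph_vars, pairs)
print(graph_vars)
```

Writing the same mapping as a dict literal keyed by `"shared/embedding"` would silently keep only the last target, which is the limitation a list of pairs avoids.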

cla: yes

Allow disabling the automatic save on shutdown. The automatic save is bad for mesh-tensorflow, where the variables to be saved haven't been updated since the previous checkpoint was written.

cla: yes