Mesh TensorFlow: Model Parallelism Made Easier

Results 99 mesh issues

Splitting tokens when routing

cla: no

MODE models with heterogeneous expert width

cla: no

Option to use mtf.Print to log which tokens are sent to which experts when run on CPU.

cla: yes

Minor comment fix to refer to the correct argument name.

cla: yes

Hi, I'm wondering how I might freeze the token embedding layers in Unitransformer implementations. All references online seem to point to Keras rather than to implementations with mesh. https://github.com/tensorflow/mesh/blob/52a2332c3bb0aa5898caba7efecc8cfa0486276e/mesh_tensorflow/transformer/transformer.py#L697 Thank you.

Hello everyone, I was wondering if we could add an option when getting the prediction such that instead of having only the most likely one among the explored beams, it...

Currently, only the postprocessed model outputs are written out into a file suffixed with "predictions". This outputs an additional file suffixed with "outputs" that stores the raw model outputs, without...

cla: yes

Will there be any future plans to allow users to add Custom Tensorflow Hooks such as `tf.estimator.LoggingTensorHook` to enable custom functions during the training/eval loop such as passing back metrics...