mesh issues

Splitting tokens when routing

2

copybara-service[bot]

cla: no

MODE models with heterogeneous expert width

1

copybara-service[bot]

cla: no

Option to use mtf.Print to log which tokens are sent to which experts when run on CPU.

copybara-service[bot]

cla: yes

Minor comment fix to refer to the correct argument name.

copybara-service[bot]

cla: yes

Fix some example code in readme for einsum operation

2

baragona

cla: yes

How to freeze embedding layers

Hi, I'm wondering how I might freeze the token embedding layers in Unitransformer implementations. All references online seem to point to Keras rather than to implementations with mesh. https://github.com/tensorflow/mesh/blob/52a2332c3bb0aa5898caba7efecc8cfa0486276e/mesh_tensorflow/transformer/transformer.py#L697 Thank you

lintangsutawika

Beam search

Hello everyone, I was wondering if we could add an option when getting the prediction such that instead of having only the most likely one among the explored beams, it...

antonio-mastropaolo

Output raw model outputs during eval

Currently, only the postprocessed model outputs are written out into a file suffixed with "predictions". This outputs an additional file suffixed with "outputs" that stores the raw model outputs, without...

craffel

cla: yes

Save scores lazily.

copybara-service[bot]

cla: yes

Ability to add Custom TensorFlow Hooks

Will there be any future plans to allow users to add custom TensorFlow hooks such as `tf.estimator.LoggingTensorHook` to enable custom functions during the training/eval loop such as passing back metrics...

trisongz
