mesh issues

mixed precision support on GPUs

Hi, to speed up training on V100 GPUs, I'd like to run Mesh TensorFlow using mixed precision. While TensorFlow has an easy-to-use [automatic mixed precision](https://www.tensorflow.org/api_docs/python/tf/train/experimental/enable_mixed_precision_graph_rewrite) feature, it requires...

LiweiPeng

Capture performance profile using Tensorboard

I would like to debug the training/fine-tuning performance of the mesh transformer on CPU/GPU. Is it possible to capture a performance profile using TensorBoard? If so, is there an example or tutorial that...

mcompute

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process

When I ran `mnist.py`, the call `os.remove(zipped_filepath)` in the `download` function of `mnist_dataset.py` failed with a PermissionError. Changing the code to the following might work: ` try: os.remove(zipped_filepath)...

samaritanhu
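The workaround in the issue above is truncated, so the following is only a sketch of one way such a guard could look, not the poster's actual patch. The helper name `remove_with_retry` and its `attempts` and `delay` parameters are hypothetical. On Windows, `WinError 32` surfaces in Python as `PermissionError` when another handle still holds the file open, so the sketch retries a few times and then gives up without raising.

```python
import os
import time


def remove_with_retry(path, attempts=3, delay=0.1):
    """Try to delete `path`; return True if it is gone, False if it stays locked.

    Hypothetical helper for illustration only; it is not part of mesh.
    """
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            # Nothing left to delete, so treat this as success.
            return True
        except PermissionError:
            # Another process or handle may still hold the file open.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

A caller could log a warning when the function returns `False` and leave the zipped file in place, since a leftover archive does not affect the extracted data.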

SelfAttention & EncDecAttention in mesh transformer allow different values for query, key, value

This paper [Low-Rank Bottleneck in Multi-head Attention Models](https://arxiv.org/pdf/2002.07028.pdf) suggests that we could fix the head size and keep hidden size unchanged. Could you support setting `d_k`, `d_q`, `d_v` independently instead...

desperadoola
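To make the request above concrete, here is a small shape-bookkeeping sketch in plain Python. It is not Mesh TensorFlow code, and the function names are hypothetical. It assumes dot-product attention, where queries and keys must share one dimension, so `d_q` is folded into `d_k` while `d_v` stays separate. The point is that the head size can be held fixed while the number of heads changes, without touching the hidden size `d_model`.

```python
def projection_shapes(d_model, num_heads, d_k, d_v):
    """Weight shapes for multi-head attention with a decoupled head size.

    Hypothetical helper; queries and keys share d_k (an assumption of
    dot-product attention), values use d_v.
    """
    return {
        "W_q": (d_model, num_heads, d_k),
        "W_k": (d_model, num_heads, d_k),
        "W_v": (d_model, num_heads, d_v),
        "W_o": (num_heads, d_v, d_model),
    }


def num_params(shapes):
    """Total number of weights across all projection matrices."""
    total = 0
    for shape in shapes.values():
        size = 1
        for dim in shape:
            size *= dim
        total += size
    return total


# Conventional choice: head size tied to d_model / num_heads (512 / 8 = 64).
tied = projection_shapes(d_model=512, num_heads=8, d_k=64, d_v=64)

# Decoupled choice: same hidden size, twice the heads, head size held at 64.
fixed = projection_shapes(d_model=512, num_heads=16, d_k=64, d_v=64)
```

With the head size decoupled, doubling the number of heads doubles the projection parameters instead of shrinking each head.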

Could you please set to False the default value of ignore_comments?

Could you please set to `False` the default value of `ignore_comments`? https://github.com/tensorflow/mesh/blob/7de6e9bc9e362d082b0d8e4b04be321a25b6f0a6/mesh_tensorflow/transformer/utils.py#L766 I'm using T5 and it took me a while to find out why some of the lines in...

rodrigo-eai

jinoobaek-qz
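The excerpt above is cut off, so the exact symptom is not stated. As an illustration only, the sketch below assumes that an `ignore_comments` flag drops every input line beginning with `#`. The function `filter_inputs` is a hypothetical stand-in written for this example, not the code in `utils.py`.

```python
def filter_inputs(lines, ignore_comments=True):
    """Hypothetical stand-in for a loader that has an ignore_comments flag.

    Assumption: a "comment" is any line that starts with "#".
    """
    stripped = [line.rstrip("\n") for line in lines]
    if ignore_comments:
        return [line for line in stripped if not line.startswith("#")]
    return stripped


sample = [
    "translate: hello\n",
    "# this line is real input, not a comment\n",
    "translate: bye\n",
]

kept_by_default = filter_inputs(sample)
kept_when_disabled = filter_inputs(sample, ignore_comments=False)
```

Under that assumption, a default of `True` silently shortens the input list, which would make predictions misalign with the source file.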

Regarding data and model parallelism of mnist python code in examples

I have made changes to `mnist.py` in the examples section, as documented on GitHub, to achieve data parallelism and model parallelism. I have...

Raviteja1996
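As a rough illustration of the two kinds of parallelism mentioned above, the sketch below computes the per-device slice of a tensor when named tensor dimensions are mapped onto named mesh dimensions. It is plain Python, not the Mesh TensorFlow API. The function `per_device_shape`, the mesh names `b1` and `b2`, and the tensor sizes are all assumptions chosen for the example.

```python
def per_device_shape(tensor_shape, mesh_shape, layout):
    """Return the local shape each device holds under a given layout.

    tensor_shape: dict of tensor dimension name -> size
    mesh_shape:   dict of mesh dimension name -> number of devices
    layout:       dict of tensor dimension name -> mesh dimension name
    Hypothetical helper for illustration only.
    """
    local = {}
    for name, size in tensor_shape.items():
        mesh_dim = layout.get(name)
        # Dimensions absent from the layout are replicated, not split.
        parts = mesh_shape[mesh_dim] if mesh_dim is not None else 1
        if size % parts != 0:
            raise ValueError(f"{name}={size} is not divisible by {parts}")
        local[name] = size // parts
    return local


mesh_shape = {"b1": 2, "b2": 2}  # four devices arranged 2 x 2
activations = {"batch": 128, "hidden": 1024}

# Splitting only the batch dimension corresponds to data parallelism.
data_parallel = per_device_shape(activations, mesh_shape, {"batch": "b1"})

# Splitting the hidden dimension as well adds model parallelism.
both = per_device_shape(
    activations, mesh_shape, {"batch": "b1", "hidden": "b2"}
)
```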

Question on params['context']

Non autoregressive Predict and Evaluate doesn’t Work

Distributed Mesh-TF

Fix running transformer in TPU v2-8
