TPU support for tensorflow-io

Open pshiko opened this issue 5 years ago • 13 comments

When I create a BigQueryClient instance within the scope of TPUStrategy, I get the following error.

NotFoundError: 'IO>BigQueryClient' is neither a type of a primitive operation nor a 
name of a function registered in binary running on n-c6d1a32d-w-0. Make sure 
the operation or function is registered in the binary running in this process.

Module versions:
tensorflow 2.0
tensorflow_io 0.10.0

The code to reproduce this is here: https://gist.github.com/pshiko/1309ddac8b90f7449e4c22ad4d9f294a

Is this a bug, or a limitation of BigQueryClient? Also, is there any other way to read data directly from BigQuery?

If it's just a bug, I'd like to fix it, but I don't understand what causes it.

pshiko avatar Jan 04 '20 12:01 pshiko

This is a limitation of the TPU platform. Internally, TPUs interface with a dedicated VM that doesn't have TF.IO installed (mostly for security reasons). We are working to change that.

vlasenkoalexey avatar Jan 08 '20 20:01 vlasenkoalexey

OK, understood! BigQuery integration is essential for training on very large datasets with TPUs. I look forward to this problem being solved.

pshiko avatar Jan 09 '20 11:01 pshiko

Is there any update on this issue? Also, where should I check for updates?

pshiko avatar Jan 24 '20 09:01 pshiko

I'll follow up on this and provide an update.

vlasenkoalexey avatar Jan 30 '20 23:01 vlasenkoalexey

It turns out that there is a significant amount of work we need to do internally to support the BigQuery reader on TPUs. At the same time, the TPU team is updating its internal infrastructure to work with a single VM, which should also make it possible to use external packages such as tensorflow-io. The current, very rough estimate for that work is the end of Q3.

vlasenkoalexey avatar Mar 30 '20 22:03 vlasenkoalexey

Is there any update on this issue?

pshiko avatar Jul 13 '21 21:07 pshiko

Google recently announced Cloud TPU VMs, where your code runs directly on the machine hosting the TPUs: https://cloud.google.com/blog/products/compute/introducing-cloud-tpu-vms

The BQ reader should therefore work with this setup; feel free to try it out: https://cloud.google.com/tpu/docs/tensorflow-quickstart-tpu-vm

I haven't had a chance to try it myself.

vlasenkoalexey avatar Jul 14 '21 15:07 vlasenkoalexey

Oh! That is great! Thanks for sharing the information! Now I can do what I want to do!

pshiko avatar Jul 14 '21 21:07 pshiko

@vlasenkoalexey

I tried to install tensorflow-io on the TPU VM, but I couldn't find a version of tensorflow-io that matches the TensorFlow installed there. Is there a version of tensorflow-io that matches the special version of TensorFlow installed on the TPU VM? (The special version of TensorFlow exists in /usr/share/tpu/*.whl on the TPU VM, but I don't know how it was built.)

pshiko avatar Nov 28 '21 09:11 pshiko

You specify which TensorFlow version to use when you create a TPU VM, via the version argument of gcloud alpha compute tpus tpu-vm create. Then find the corresponding version of TF.IO in this table https://github.com/tensorflow/io#tensorflow-version-compatibility and install it with the command pip install --no-deps tensorflow-io==0.xx.xx.

I haven't had a chance to test this myself; I hope it is binary compatible and just works.
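
The version lookup described above could be sketched roughly as follows. This is a minimal illustration, not an official tool: the helper name tfio_install_command and the TFIO_FOR_TF table are made up for this example, and the table holds only the two TensorFlow / tensorflow-io pairings that appear in this thread. The complete table is in the tensorflow/io README linked above.

```python
# Partial, illustrative mapping of TensorFlow "major.minor" to a
# tensorflow-io release. Only the pairings mentioned in this thread are
# listed; consult the README compatibility table for everything else.
TFIO_FOR_TF = {
    "2.0": "0.10.0",
    "2.6": "0.21.0",
}


def tfio_install_command(tf_version: str) -> str:
    """Return the pip command for the tensorflow-io release matching tf_version.

    Hypothetical helper: it only builds the command string, it does not run pip.
    """
    major_minor = ".".join(tf_version.split(".")[:2])
    if major_minor not in TFIO_FOR_TF:
        raise ValueError(
            f"no tensorflow-io version listed for TensorFlow {tf_version}"
        )
    tfio_version = TFIO_FOR_TF[major_minor]
    # --no-deps keeps pip from replacing the TensorFlow build already
    # installed on the TPU VM.
    return " ".join(
        ["pip", "install", "--no-deps", f"tensorflow-io=={tfio_version}"]
    )


print(tfio_install_command("2.6.0"))
```

Matching on major.minor only is an assumption made for brevity; the README table is the authority on which patch releases pair up.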

vlasenkoalexey avatar Nov 29 '21 06:11 vlasenkoalexey

I finally got to test it myself. Unfortunately, it doesn't quite work out of the box yet, because the TensorFlow pre-installed on TPU VMs is built with the -D_GLIBCXX_USE_CXX11_ABI=1 flag and is not compatible with tensorflow-io. I filed a bug to get it fixed.
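
One way to see which ABI setting an installed TensorFlow was built with is to inspect its compile flags. The sketch below is illustrative only: cxx11_abi is a made-up helper name, and it assumes the flag shows up in the list returned by tf.sysconfig.get_compile_flags(). A result of 1 would correspond to the build described above.

```python
from typing import Iterable, Optional

ABI_PREFIX = "-D_GLIBCXX_USE_CXX11_ABI="


def cxx11_abi(flags: Iterable[str]) -> Optional[int]:
    """Return the _GLIBCXX_USE_CXX11_ABI value found in flags, or None if absent."""
    for flag in flags:
        if flag.startswith(ABI_PREFIX):
            return int(flag[len(ABI_PREFIX):])
    return None


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        # Nothing to inspect on machines without TensorFlow.
        print("TensorFlow is not installed; nothing to check")
    else:
        # get_compile_flags() returns the flags needed to build against
        # this TensorFlow installation, as a list of strings.
        print("TensorFlow CXX11 ABI:", cxx11_abi(tf.sysconfig.get_compile_flags()))
```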

In the meantime, I was able to get it working by rebuilding TensorFlow with TPU support for the "v2-nightly20210914" TPU software version.

Here is the whl file: https://drive.google.com/file/d/16fvQVZBkCvzCzV0Kayfs--cM8lnDBm2B/view?usp=sharing

This whl will only work on TPU software version "v2-nightly20210914".

Since the binary also has to be compatible with libtpu.so, I built it from the tf2.6 branch. Before installing this whl, first uninstall the tf-nightly and keras packages.

Then you can install tensorflow-io with: pip install --no-deps tensorflow-io==0.21.0

After the compatibility bug is fixed, it should work out of the box.
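
For reference, the order of the workaround steps above can be written out as a dry run. This is only a sketch: install_plan is a hypothetical helper, the wheel filename is a placeholder for wherever the downloaded whl was saved, and the -y flag (skip pip's confirmation prompt) is an addition not mentioned above. The code prints the commands and does not execute them.

```python
from typing import List


def install_plan(tf_wheel_path: str, tfio_version: str = "0.21.0") -> List[List[str]]:
    """Return the pip commands for the workaround, in the order they should run."""
    return [
        # 1. Remove the pre-installed packages that conflict with the rebuilt wheel.
        ["pip", "uninstall", "-y", "tf-nightly", "keras"],
        # 2. Install the rebuilt TensorFlow wheel.
        ["pip", "install", tf_wheel_path],
        # 3. Install tensorflow-io without letting pip pull in another TensorFlow.
        ["pip", "install", "--no-deps", f"tensorflow-io=={tfio_version}"],
    ]


# "tensorflow-custom.whl" is a placeholder path, not the real file name.
for command in install_plan("tensorflow-custom.whl"):
    print(" ".join(command))
```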

vlasenkoalexey avatar Dec 08 '21 00:12 vlasenkoalexey

Thanks @vlasenkoalexey for the update and great work! 👍

yongtang avatar Dec 09 '21 06:12 yongtang

Is it possible to do this using the v2-alpha TPUs?

nikhilanayak avatar Dec 28 '21 03:12 nikhilanayak