
Custom Op TPU

Open bhack opened this issue 4 years ago • 13 comments

Can we add something related to TPU to the example? There was a FAQ entry about creating custom ops for TPU: https://cloud.google.com/tpu/docs/faq

bhack avatar Apr 04 '20 11:04 bhack

Had a brief chat with @frankchn and it sounds like custom ops are not supported on TPU yet. See https://cloud.google.com/tpu/docs/tpus#when_to_use_tpus

''' Cloud TPUs are not suited to the following workloads: ... Neural network workloads that contain custom TensorFlow operations written in C++. Specifically, custom operations in the body of the main training loop are not suitable for TPUs. '''

cc: @frankchn @jhseu

yifeif avatar Apr 21 '20 01:04 yifeif

In the TPU FAQ I see

How can I write a custom op for Compute Engine?

TensorFlow ops that run on Compute Engine are implemented in XLA HLO, a language for defining high-level tensor ops using a small set of low-level functions. XLA is included in TensorFlow's open source release, so it is technically possible to write your op in HLO. The majority of existing implementations can be found in the tf2xla directory. XLA only allows for execution of a limited set of tensor ops on the TPU, not arbitrary C++ or Python code. Most common tensor ops that can be implemented in HLO have already been written.
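As background, the XLA HLO path the FAQ describes can be observed from Python: tf.function(jit_compile=True) together with experimental_get_compiler_ir (a TF 2.x API) dumps the HLO that XLA would compile for a standard op. This is only a sketch showing that built-in ops already lower to HLO via their tf2xla kernels; it is not a way to register new HLO ops.

import tensorflow as tf

@tf.function(jit_compile=True)
def f(a, b):
  return tf.add(a, b)  # a standard op with an existing tf2xla (HLO) kernel

x = tf.ones((2, 2))
# Dump the (unoptimized) HLO module XLA would compile for this function.
print(f.experimental_get_compiler_ir(x, x)(stage='hlo'))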

bhack avatar Apr 21 '20 09:04 bhack

While it is technically possible to write an XLA HLO op and get it to run on TPUs, we currently don't expose any way to load arbitrary user written HLO ops onto the TPU system itself. This may change in future releases, but we don't have anything to announce today.

frankchn avatar Apr 21 '20 10:04 frankchn

@frankchn OK. Can you reach anyone internally to fix the FAQ? Because with that text it seems there is currently "an undocumented" path to build such custom ops.

bhack avatar Apr 21 '20 10:04 bhack

Yup, working on it. Thanks for bringing that to our attention!

frankchn avatar Apr 21 '20 13:04 frankchn

Thanks

bhack avatar Apr 21 '20 13:04 bhack

Is there a way to specify a fallback option for TPUs? I have an optimized custom op for CPUs and GPUs, but I want my custom op to be able to run, even inefficiently, on TPUs. The op can be expressed with standard TF operations, so I'm just looking for a way to register a Python function as the TPU implementation of the op. Is this possible?

orsharir avatar Aug 31 '20 19:08 orsharir

@orsharir Would it be possible to encapsulate the op you want in a Python function, and then switch between your custom op implementation and the default TF op implementation using a flag?

import os
import tensorflow as tf

def my_custom_op(input1, input2):
  # On TPU, fall back to the equivalent built-in TF op; otherwise use the custom kernel.
  if os.environ.get('use_tpu'):
    return tf.add(input1, input2)
  else:
    return custom_op(input1, input2)  # the optimized CPU/GPU custom op
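A variant of the same idea that detects TPU devices at runtime instead of reading an environment variable; this is only a sketch (tf.config.list_logical_devices is a standard TF 2.x query, custom_op stands in for the user's optimized kernel, and TPU devices only appear after the TPU system has been initialized):

import tensorflow as tf

def my_custom_op(input1, input2):
  # TPU logical devices are only visible after connecting to / initializing the TPU system.
  if tf.config.list_logical_devices('TPU'):
    return tf.add(input1, input2)   # plain-TF fallback that runs on TPU
  return custom_op(input1, input2)  # hypothetical optimized CPU/GPU kernel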

frankchn avatar Aug 31 '20 22:08 frankchn

@frankchn

we currently don't expose any way to load arbitrary user written HLO ops onto the TPU system itself.

This sounds to me like you could load a custom XLA HLO op on TPU without modifying the tpulib on the TPU system. Let's assume that I have SSH access to the TPU system itself thanks to the JAX TPU beta; can I then write a custom op?

XMaster96 avatar Apr 07 '21 08:04 XMaster96

@XMaster96 Unfortunately not, because the underlying TPU ISA and the associated tools you would need to write an XLA op aren't exposed, even with the TPU VM preview.

frankchn avatar Apr 07 '21 17:04 frankchn

@frankchn OK, thanks. So I was thinking a bit: I can't write a custom op myself, but can I at least load a custom op that someone else has written? And do I even need the VM preview to do so? I am really not sure, but I might be able to request a custom op.

XMaster96 avatar Apr 17 '21 10:04 XMaster96

You can load custom ops that someone else (or you) has written as long as they are CPU custom ops (or you build them into a custom TF build). I don't think anyone outside of Google can write XLA custom ops that run on the TPU.
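For reference, loading a prebuilt CPU custom op kernel looks roughly like the sketch below; tf.load_op_library is the standard loading API, while the .so path and op name here are placeholders:

import tensorflow as tf

# Load the compiled kernel library (path is a placeholder).
custom_module = tf.load_op_library('./_my_custom_op.so')

# Ops defined in the library become attributes of the returned module.
result = custom_module.my_custom_op(tf.constant([1.0, 2.0]))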

frankchn avatar Apr 17 '21 17:04 frankchn

@bhack Since JAX already has Pallas for writing TPU kernels, is there any plan for a similar feature in TensorFlow?
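For context, a minimal Pallas kernel looks roughly like the sketch below (assumed jax.experimental.pallas API; the kernel just adds one element-wise, and block/grid details are omitted):

import jax
import jax.numpy as jnp
from jax.experimental import pallas as pl

def add_one_kernel(x_ref, o_ref):
  # Read the input block and write the result block.
  o_ref[...] = x_ref[...] + 1.0

@jax.jit
def add_one(x):
  return pl.pallas_call(
      add_one_kernel,
      out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
  )(x)

print(add_one(jnp.zeros((8, 128), jnp.float32)))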

edwardyehuang avatar Jan 14 '24 08:01 edwardyehuang