text StateBasedSentenceBreaker Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string

StateBasedSentenceBreaker's break_sentences fails when run on a GPU with the error: Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string. The traceback points to the map_fn call at line 120 of state_based_sentence_breaker_op.py.

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on a mobile device: n/a
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.7.1
Python version: 3.8.10
Bazel version (if compiling from source): n/a
GCC/Compiler version (if compiling from source): n/a
CUDA/cuDNN version: 11.2 / 8.1
GPU model and memory: NVIDIA GeForce RTX 3090 - 24G & NVIDIA GeForce GTX 1080 8G
Exact command to reproduce:

import tensorflow as tf
import tensorflow_text as text


@tf.function(input_signature=[tf.TensorSpec(shape=(None,), dtype=tf.string, name='doc')])
def split_sentences(doc):
    splitter = text.StateBasedSentenceBreaker()
    return splitter.break_sentences(doc)

doc = tf.constant(["Hello this is sentence 1. This is sentence 2."])
print(split_sentences(doc))

Other info/logs: Traceback:

Traceback (most recent call last):
  File "<path>/.config/JetBrains/PyCharm2021.1/scratches/scratch_30.py", line 11, in <module>
    print(split_sentences(doc))
  File "<path>/venv/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "<path>/venv/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 58, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) INVALID_ARGUMENT:  During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
	 [[node map/while/TensorArrayV2Write/TensorListSetItem
 (defined at <path>/venv/lib/python3.8/site-packages/tensorflow_text/python/ops/state_based_sentence_breaker_op.py:120)
]]
	 [[Assert_2/AssertGuard/pivot_f/_99/_355]]
  (1) INVALID_ARGUMENT:  During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
	 [[node map/while/TensorArrayV2Write/TensorListSetItem
 (defined at <path>/venv/lib/python3.8/site-packages/tensorflow_text/python/ops/state_based_sentence_breaker_op.py:120)
]]
0 successful operations.
0 derived errors ignored. [Op:__inference_split_sentences_909]

Errors may have originated from an input operation.
Input Source operations connected to node map/while/TensorArrayV2Write/TensorListSetItem:
In[0] map/while/Placeholder_1:	
In[1] map/while/Placeholder:	
In[2] map/while/RaggedTensorToVariant:

Operation defined at: (most recent call last)
>>>   File "<path>/.config/JetBrains/PyCharm2021.1/scratches/scratch_30.py", line 11, in <module>
>>>     print(split_sentences(doc))
>>> 
>>>   File "<path>/.config/JetBrains/PyCharm2021.1/scratches/scratch_30.py", line 8, in split_sentences
>>>     return splitter.break_sentences(doc)
>>> 
>>>   File "<path>/venv/lib/python3.8/site-packages/tensorflow_text/python/ops/state_based_sentence_breaker_op.py", line 61, in break_sentences
>>>     results, _, _ = self.break_sentences_with_offsets(doc)
>>> 
>>>   File "<path>/venv/lib/python3.8/site-packages/tensorflow_text/python/ops/state_based_sentence_breaker_op.py", line 120, in break_sentences_with_offsets
>>>     fragment_text = map_fn.map_fn(
>>> 

Input Source operations connected to node map/while/TensorArrayV2Write/TensorListSetItem:
In[0] map/while/Placeholder_1:	
In[1] map/while/Placeholder:	
In[2] map/while/RaggedTensorToVariant:

Operation defined at: (most recent call last)
>>>   File "<path>/.config/JetBrains/PyCharm2021.1/scratches/scratch_30.py", line 11, in <module>
>>>     print(split_sentences(doc))
>>> 
>>>   File "<path>/.config/JetBrains/PyCharm2021.1/scratches/scratch_30.py", line 8, in split_sentences
>>>     return splitter.break_sentences(doc)
>>> 
>>>   File "<path>/venv/lib/python3.8/site-packages/tensorflow_text/python/ops/state_based_sentence_breaker_op.py", line 61, in break_sentences
>>>     results, _, _ = self.break_sentences_with_offsets(doc)
>>> 
>>>   File "<path>/venv/lib/python3.8/site-packages/tensorflow_text/python/ops/state_based_sentence_breaker_op.py", line 120, in break_sentences_with_offsets
>>>     fragment_text = map_fn.map_fn(
>>> 

Function call stack:
split_sentences -> map_while_body_853 -> split_sentences -> map_while_body_853

Jul 21 '22 21:07 HarryDunham

Similar issues include: https://github.com/tensorflow/tensorflow/issues/47325 https://github.com/TensorSpeech/TensorFlowASR/issues/71

Jul 21 '22 21:07 HarryDunham

I can also confirm that pinning the call to the CPU:

with tf.device("/CPU:0"):
    doc = tf.constant(["Hello this is sentence 1. This is sentence 2."])
    print(split_sentences(doc))

yields

<tf.RaggedTensor [[b'Hello this is sentence 1.', b'This is sentence 2.']]>

without issue. However, this workaround does not let the function run on a GPU, which is the goal.

Jul 22 '22 12:07 HarryDunham

All of the string ops run on the CPU. GPUs and TPUs are not set up to handle strings. TF should run these preprocessing ops on the CPU, then run the main model ops on the GPU using the results.
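A minimal sketch of that split, assuming only core TensorFlow is installed. The function name count_words is hypothetical, and tf.strings.split stands in for the sentence breaker so the example does not depend on tensorflow_text. The string ops are pinned to the CPU explicitly, and the numeric op on their result is left to TensorFlow's default placement.

```python
import tensorflow as tf


@tf.function(input_signature=[tf.TensorSpec(shape=(None,), dtype=tf.string, name='doc')])
def count_words(doc):
    # Hypothetical helper for illustration; not part of tensorflow_text.
    # Pin the string preprocessing to the CPU explicitly.
    with tf.device("/CPU:0"):
        words = tf.strings.split(doc)   # tf.RaggedTensor of tf.string
        counts = words.row_lengths()    # int64 word count per document
    # The numeric op below is not pinned, so TensorFlow chooses its device
    # (possibly a GPU when one is available).
    return tf.cast(counts, tf.float32)


doc = tf.constant(["Hello this is sentence 1. This is sentence 2."])
word_counts = count_words(doc)
print(word_counts)
```

The same pattern should apply to the reproduction script: wrapping only the break_sentences call in tf.device("/CPU:0") inside the tf.function keeps the strings on the host while leaving later numeric ops free to run elsewhere.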

Jul 26 '22 03:07 broken

@broken Thank you for the insight. Does "TF should be running" imply that the onus is on TensorFlow, rather than on the user, to avoid these types of errors? That is, manually specifying CPU execution as in the example above shouldn't be required of the user, and should instead be handled "automagically" behind the scenes within TF?

Aug 02 '22 19:08 HarryDunham

Yes. If TF cannot find a GPU kernel for a particular op, it will fall back to the CPU.

I just ran your sample code above with GPU available and didn't run into the same error. Are you trying to force TF to use the GPU in some way?
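One way to check whether placement is being forced is to inspect the device configuration before running the function. The snippet below is a diagnostic sketch using only core TensorFlow calls; the variable names are illustrative, and the log output will differ between machines.

```python
import tensorflow as tf

# Log the device chosen for each op; call this as early as possible.
tf.debugging.set_log_device_placement(True)

# GPUs that TensorFlow can see on this machine (empty list if none).
gpus = tf.config.list_physical_devices("GPU")
print("Visible GPUs:", gpus)

# Whether TensorFlow may move an op to another device when the requested
# device has no kernel for it.
soft_placement = tf.config.get_soft_device_placement()
print("Soft device placement enabled:", soft_placement)

# A string op with no explicit device request; the placement log shows
# where it actually ran.
upper = tf.strings.upper(tf.constant(["hello"]))
print(upper)
```

If the placement log shows string ops being assigned to a GPU, an enclosing tf.device("/GPU:0") scope or a disabled soft placement setting would be worth ruling out.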

Aug 04 '22 05:08 broken