deep-learning-with-python-notebooks
Chapter 11, Part 1: TextVectorization with output_mode="tf_idf"
The 24th code cell raises the following error when run on a PC with Anaconda (Python 3.8.5 + TensorFlow 2.6 or 2.7), while it runs fine on Google Colab.
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-25-6747a8415a37> in <module>
----> 1 text_vectorization.adapt(text_only_train_ds)
2
3 tfidf_2gram_train_ds = train_ds.map(lambda x, y: (text_vectorization(x), y))
4 tfidf_2gram_val_ds = val_ds.map(lambda x, y: (text_vectorization(x), y))
5 tfidf_2gram_test_ds = test_ds.map(lambda x, y: (text_vectorization(x), y))
~\anaconda3\lib\site-packages\keras\engine\base_preprocessing_layer.py in adapt(self, data, batch_size, steps)
242 with data_handler.catch_stop_iteration():
243 for _ in data_handler.steps():
--> 244 self._adapt_function(iterator)
245 if data_handler.should_sync:
246 context.async_wait()
...
When output_mode is anything other than "tf_idf", everything works. The error first occurred with TF 2.6, and it still occurs with TF 2.7.
text_vectorization = TextVectorization(
    ngrams=2,
    max_tokens=20000,
    output_mode="tf_idf",
)
The Python environment is as follows:
OS: Windows 11 (no WSL2), Anaconda + TF 2.6 or 2.7
I am wondering WHY!
Many thanks in advance for any help.
I was wrong about Google Colab. The same error occurs there too. (I don't know why it worked yesterday; perhaps I only believed it did.)
I think I found the main reason for the confusion.
Whether the error occurs depends on the GPU support.
- with GPU support: Yes, the error occurs.
- without GPU support: No error.
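To tell which of the two cases applies on a given machine, it may help to check whether TensorFlow can see a GPU at all. The following is a minimal sketch using `tf.config.list_physical_devices`; the variable names `gpus` and `gpu_in_use` are only illustrative and do not come from the notebook.

```python
import tensorflow as tf

# List the GPUs TensorFlow can see. An empty list means every op runs on the CPU,
# which is the configuration reported above as error-free.
gpus = tf.config.list_physical_devices("GPU")
gpu_in_use = len(gpus) > 0

print("TensorFlow version:", tf.__version__)
print("GPUs visible to TensorFlow:", gpus)
print("GPU in use:", gpu_in_use)
```

If the list is non-empty, the environment matches the "with GPU support" case described above.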
Hello, I've observed the same error, on GPU only, in the cell with the code from Listing 11.11 (training and testing the TF-IDF bigram model).
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No, using code from Deep Learning with Python, 2nd edition, Listing 11.11 Training and testing the TF-IDF bigram model
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
Linux Ubuntu 20.04, release='5.4.0-90-generic', version='#101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021', machine='x86_64', processor='x86_64'
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: NA
- TensorFlow installed from (source or binary): binary: https://conda.anaconda.org/conda-forge/linux-64/tensorflow-base-2.6.2-cuda112py37h8d33417_2.tar.bz2 https://conda.anaconda.org/conda-forge/linux-64/tensorflow-estimator-2.6.2-cuda112py37h474db6c_2.tar.bz2 https://conda.anaconda.org/conda-forge/linux-64/tensorflow-2.6.2-cuda112py37h474db6c_2.tar.bz2 https://conda.anaconda.org/conda-forge/linux-64/tensorflow-gpu-2.6.2-cuda112py37h0bbbad9_2.tar.bz2
- TensorFlow version (use command below): unknown 2.6.2
- Python version: Python 3.7.12
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: 11.2/8201
- GPU model and memory: GeForce GTX 1660 SUPER, 5944MiB
You can collect some of this information using our environment capture script. You can also obtain the TensorFlow version with:
- TF 1.0:
python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
- TF 2.0:
python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
Describe the current behavior
Error message:
/tmp/ipykernel_901495/519244469.py in <module>
----> 1 text_vectorization.adapt(text_only_train_ds)
2
3 tfidf_2gram_train_ds = train_ds.map(
4 lambda x, y: (text_vectorization(x), y),
5 num_parallel_calls=4)
see full log below
Describe the expected behavior
The code should run without errors.
- Do you want to contribute a PR? (yes/no): no
- Briefly describe your candidate solution(if contributing):
Standalone code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem. If possible, please share a link to Colab/Jupyter/any notebook.
Original code from repo
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
- GPU run log: chapter11_part01_introduction_reprex.md
- CPU run produces no error
I just ran into similar issues running on Google Colab with a GPU.
Stacktrace:
/usr/local/lib/python3.7/dist-packages/keras/engine/base_preprocessing_layer.py in adapt(self, data, batch_size, steps)
242 with data_handler.catch_stop_iteration():
243 for _ in data_handler.steps():
--> 244 self._adapt_function(iterator)
245 if data_handler.should_sync:
246 context.async_wait()
/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py in error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
57 ctx.ensure_initialized()
58 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 59 inputs, attrs, num_outputs)
60 except core._NotOkStatusException as e:
61 if name is not None:
InvalidArgumentError: 2 root error(s) found.
(0) INVALID_ARGUMENT: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
[[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]
[[Func/map/while/body/_1/input/_50/_58]]
(1) INVALID_ARGUMENT: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
[[{{node map/TensorArrayUnstack/TensorListFromTensor/_42}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_adapt_step_77044]
Function call stack:
adapt_step -> adapt_step
I just stumbled over the same problem when trying to run the code in a JupyterLab notebook on a Jetson Nano (TF 2.7).
I managed to get it to work there by specifying the CPU as the device for the adapt operation:
with tf.device("cpu"):
    text_vectorization.adapt(text_only_train_ds)
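For anyone who wants to try this workaround outside the notebook, here is a self-contained sketch. It assumes TF 2.6 or later (where `TextVectorization` is available under `tensorflow.keras.layers`), and it replaces the notebook's IMDB data with three made-up sentences, so `texts` and `first_batch` are hypothetical names. The device string is written as "/CPU:0" here; the layer settings are the ones from the original cell.

```python
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization

# Toy stand-in for the notebook's text_only_train_ds (hypothetical data, not IMDB).
texts = tf.constant([
    "the movie was great",
    "the movie was terrible",
    "great acting",
])
text_only_train_ds = tf.data.Dataset.from_tensor_slices(texts).batch(2)

# Same layer configuration as in the notebook cell that fails on GPU.
text_vectorization = TextVectorization(
    ngrams=2,
    max_tokens=20000,
    output_mode="tf_idf",
)

# Pin the adapt step to the CPU so the string tensors are never copied to the GPU.
with tf.device("/CPU:0"):
    text_vectorization.adapt(text_only_train_ds)

# Apply the layer inside a tf.data pipeline, as the notebook does with train_ds.map(...).
tfidf_ds = text_only_train_ds.map(text_vectorization)
first_batch = next(iter(tfidf_ds))
print(first_batch.shape, first_batch.dtype)
```

Only the `adapt` call is wrapped in the device scope; the model itself can still train on the GPU afterwards, since the vectorized datasets contain float tensors rather than strings.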