[Nano] `RuntimeError: Inter op parallelism cannot be modified after initialization` when importing `Model`

Open Oscilloscope98 opened this issue 1 year ago • 2 comments

The error occurs when importing Model from bigdl.nano.tf.keras after TensorFlow has already been used, for example to create datasets.

Example problematic code:

import tensorflow as tf
import tensorflow_datasets as tfds

def create_datasets(img_size, batch_size):
    (train_ds, test_ds), info = tfds.load('imagenette/320px-v2',
                                          data_dir='/tmp/data',
                                          split=['train', 'validation'],
                                          with_info=True,
                                          as_supervised=True)
    
    num_classes = info.features['label'].num_classes
    
    def preprocessing(img, label):
        return tf.image.resize(img, (img_size, img_size)), \
               tf.one_hot(label, num_classes)

    train_ds = train_ds.repeat().map(preprocessing).batch(batch_size)
    test_ds = test_ds.map(preprocessing).batch(batch_size)
    return train_ds, test_ds, info

train_ds, test_ds, ds_info = create_datasets(img_size=224, batch_size=32)

from bigdl.nano.tf.keras import Model # <= error occurs here

Error messages:

RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_31179/1368016327.py in <module>
     22 train_ds, test_ds, ds_info = create_datasets(img_size=224, batch_size=32)
     23 
---> 24 from bigdl.nano.tf.keras import Model # <= error occurs here

~/miniconda3/envs/temp-tf/lib/python3.7/site-packages/bigdl/nano/tf/__init__.py in <module>
     20     import tensorflow as tf
     21     if "NANO_TF_INTER_OP" in os.environ:
---> 22         tf.config.threading.set_inter_op_parallelism_threads(int(os.environ["NANO_TF_INTER_OP"]))
     23     else:
     24         warnings.warn("NANO_TF_INTER_OP not found the in os.environ, "

~/miniconda3/envs/temp-tf/lib/python3.7/site-packages/tensorflow/python/framework/config.py in set_inter_op_parallelism_threads(num_threads)
    146     num_threads: Number of parallel threads
    147   """
--> 148   context.context().inter_op_parallelism_threads = num_threads
    149 
    150 

~/miniconda3/envs/temp-tf/lib/python3.7/site-packages/tensorflow/python/eager/context.py in inter_op_parallelism_threads(self, num_threads)
   1749     if self._context_handle is not None:
   1750       raise RuntimeError(
-> 1751           "Inter op parallelism cannot be modified after initialization.")
   1752 
   1753     self._inter_op_parallelism_threads = num_threads

RuntimeError: Inter op parallelism cannot be modified after initialization.

Environment:

bigdl-nano                   2.1.0b20220918
intel_tensorflow             2.7.0
tensorflow-datasets          4.6.0
tensorflow-estimator         2.7.0
tensorflow-io-gcs-filesystem 0.27.0
tensorflow-metadata          1.10.0

Similar errors occur when importing bigdl.nano.tf.keras.layers.Embedding, bigdl.nano.tf.optimizers.SparseAdam, etc.

Please refer here for more information: https://github.com/intel-analytics/BigDL/pull/5836#issuecomment-1254200961
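
The traceback above suggests that the thread setting is rejected once TensorFlow's eager context handle exists. The following is a toy model of that guard in plain Python, not TensorFlow's actual implementation: the class `ToyContext` and its method names are hypothetical, and only the error message is taken from the traceback. It shows why the order of "configure" and "first use" matters.

```python
class ToyContext:
    """Toy stand-in for a lazily initialized runtime context (hypothetical)."""

    def __init__(self):
        self._handle = None          # created on first real use
        self._inter_op_threads = 0   # 0 here means "let the runtime decide"

    def set_inter_op_threads(self, num_threads):
        # Same shape as the guard in the traceback: the setting is frozen
        # as soon as the handle exists.
        if self._handle is not None:
            raise RuntimeError(
                "Inter op parallelism cannot be modified after initialization.")
        self._inter_op_threads = num_threads

    def ensure_initialized(self):
        # Stands in for any real work, such as building a dataset.
        if self._handle is None:
            self._handle = object()


# Order 1: configure first, then do work. This succeeds.
ctx = ToyContext()
ctx.set_inter_op_threads(2)
ctx.ensure_initialized()

# Order 2: do work first, then configure. This fails like the issue above.
late = ToyContext()
late.ensure_initialized()
try:
    late.set_inter_op_threads(2)
    late_error = None
except RuntimeError as exc:
    late_error = str(exc)

print(late_error)
```

In the problematic code, creating the datasets plays the role of `ensure_initialized()`, and the later import of bigdl.nano.tf plays the role of the late `set_inter_op_threads()` call.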

Oscilloscope98 avatar Sep 22 '22 01:09 Oscilloscope98

Created a TensorFlow issue here: https://github.com/tensorflow/tensorflow/issues/57812

yangw1234 avatar Sep 22 '22 21:09 yangw1234

Added this to the known issues here: https://github.com/intel-analytics/BigDL/pull/5923

yangw1234 avatar Sep 22 '22 22:09 yangw1234

Hi @Oscilloscope98, could you help verify that https://github.com/intel-analytics/BigDL/pull/5923/files fixes the problem? I found a way to reset the eager session context.

yangw1234 avatar Sep 25 '22 16:09 yangw1234

@yangw1234 With the changes in #5923 applied, the RuntimeError: Inter op parallelism cannot be modified after initialization. no longer appears. However, the following code now produces a new error. Example code:

import tensorflow as tf
import tensorflow_datasets as tfds

def create_datasets(img_size, batch_size):
    (train_ds, test_ds), info = tfds.load('imagenette/320px-v2',
                                          data_dir='/tmp/data',
                                          split=['train', 'validation'],
                                          with_info=True,
                                          as_supervised=True)
    
    num_classes = info.features['label'].num_classes
    
    def preprocessing(img, label):
        return tf.image.resize(img, (img_size, img_size)), \
               tf.one_hot(label, num_classes)

    train_ds = train_ds.repeat().map(preprocessing).batch(batch_size)
    test_ds = test_ds.map(preprocessing).batch(batch_size)
    return train_ds, test_ds, info

train_ds, test_ds, ds_info = create_datasets(img_size=224, batch_size=32)

from bigdl.nano.tf.keras import Model # <= error occurs here

Error:

Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
    context.remove_function(self.name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
    context().remove_function(name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
    pywrap_tfe.TFE_ContextRemoveFunction(self._handle, name)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_flat_map_read_one_file_20'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
    context.remove_function(self.name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
    context().remove_function(name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
    pywrap_tfe.TFE_ContextRemoveFunction(self._handle, name)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_flat_map_read_one_file_83'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_interleave_classfunctools.partial_189'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_parse_and_decode_218'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_lookup_nest_226'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_preprocessing_309'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_interleave_classfunctools.partial_252'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_parse_and_decode_281'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_lookup_nest_289'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f073d813d40>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_preprocessing_328'.

Although the exceptions above are raised this time, the subsequent training process still runs successfully:

import tensorflow as tf
import tensorflow_datasets as tfds

def create_datasets(img_size, batch_size):
    (train_ds, test_ds), info = tfds.load('imagenette/320px-v2',
                                          data_dir='/tmp/data',
                                          split=['train', 'validation'],
                                          with_info=True,
                                          as_supervised=True)
    
    num_classes = info.features['label'].num_classes
    
    def preprocessing(img, label):
        return tf.image.resize(img, (img_size, img_size)), \
               tf.one_hot(label, num_classes)

    train_ds = train_ds.repeat().map(preprocessing).batch(batch_size)
    test_ds = test_ds.map(preprocessing).batch(batch_size)
    return train_ds, test_ds, info

train_ds, test_ds, ds_info = create_datasets(img_size=224, batch_size=32)

from bigdl.nano.tf.keras import Model # <= Model is imported here, with the exceptions above
# but the following code executes successfully

from tensorflow.keras import layers
from tensorflow.keras.applications import ResNet50

def define_model_inputs_outputs(num_classes, img_size):
    inputs = tf.keras.layers.Input(shape=(img_size, img_size, 3))
    x = tf.cast(inputs, tf.float32)
    x = tf.keras.applications.resnet50.preprocess_input(x)
    backbone = ResNet50(weights='imagenet')
    backbone.trainable = False
    x = backbone(x)
    x = layers.Dense(512, activation='relu')(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)

    return inputs, outputs

inputs, outputs = define_model_inputs_outputs(num_classes=ds_info.features['label'].num_classes, 
                                              img_size=224)


model = Model(inputs=inputs, outputs=outputs)
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])

model.fit(train_ds,
          epochs=1,
          steps_per_epoch=(ds_info.splits['train'].num_examples // 32),
          num_processes=2)

Full running log:

2022-09-26 14:25:49.056577: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-26 14:25:49.056980: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
    context.remove_function(self.name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
    context().remove_function(name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
    pywrap_tfe.TFE_ContextRemoveFunction(self._handle, name)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_flat_map_read_one_file_20'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
    context.remove_function(self.name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
    context().remove_function(name)
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
    pywrap_tfe.TFE_ContextRemoveFunction(self._handle, name)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_flat_map_read_one_file_83'.
2022-09-26 14:25:59.803221: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/keras/engine/functional.py:1410: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  layer_config = serialize_layer_fn(layer)
/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/keras/saving/saved_model/layer_serialization.py:112: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  return generic_utils.serialize_keras_object(obj)
2022-09-26 14:26:11.930268: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-26 14:26:11.933234: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:272] Initialize GrpcChannelCache for job worker -> {0 -> localhost:55938, 1 -> localhost:49198}
2022-09-26 14:26:11.933382: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:427] Started server with target: grpc://localhost:55938
2022-09-26 14:26:11.987982: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-26 14:26:11.990847: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:272] Initialize GrpcChannelCache for job worker -> {0 -> localhost:55938, 1 -> localhost:49198}
2022-09-26 14:26:11.990989: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:427] Started server with target: grpc://localhost:49198
2022-09-26 14:26:22.556611: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
2022-09-26 14:26:22.558011: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
2022-09-26 14:26:22.619641: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-09-26 14:26:22.619763: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
295/295 [==============================] - 325s 1s/step - loss: 0.5200 - accuracy: 0.9636
2022-09-26 14:31:55.091407: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2022-09-26 14:31:55.199643: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/keras/engine/functional.py:1410: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  layer_config = serialize_layer_fn(layer)
/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/keras/engine/functional.py:1410: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  layer_config = serialize_layer_fn(layer)
/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/keras/saving/saved_model/layer_serialization.py:112: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  return generic_utils.serialize_keras_object(obj)
/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/keras/saving/saved_model/layer_serialization.py:112: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  return generic_utils.serialize_keras_object(obj)
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_interleave_classfunctools.partial_189'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_parse_and_decode_218'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_lookup_nest_226'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_preprocessing_309'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_interleave_classfunctools.partial_252'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_parse_and_decode_281'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_lookup_nest_289'.
Exception ignored in: <function _EagerDefinedFunctionDeleter.__del__ at 0x7f4663d05cb0>
Traceback (most recent call last):
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 414, in __del__
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 2584, in remove_function
  File "/home/yuwen/miniconda3/envs/temp2/lib/python3.7/site-packages/tensorflow/python/eager/context.py", line 1287, in remove_function
tensorflow.python.framework.errors_impl.InvalidArgumentError: Tried to remove non-existent function '__inference_Dataset_map_preprocessing_328'.

Oscilloscope98 avatar Sep 26 '22 06:09 Oscilloscope98

@yangw1234 When testing the following code for Embedding and SparseAdam, I also ran into the error below. Example code:

import re
import string
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.layers import TextVectorization

def create_datasets():
    (raw_train_ds, raw_val_ds, raw_test_ds), info = tfds.load(
        "imdb_reviews",
        data_dir="/tmp/data",
        split=['train[:80%]', 'train[80%:]', 'test'],
        as_supervised=True,
        batch_size=32,
        with_info=True
    )

    def custom_standardization(input_data):
        lowercase = tf.strings.lower(input_data)
        stripped_html = tf.strings.regex_replace(lowercase, "<br />", " ")
        return tf.strings.regex_replace(
            stripped_html, f"[{re.escape(string.punctuation)}]", ""
        )

    vectorize_layer = TextVectorization(
        standardize=custom_standardization,
        max_tokens=20000,
        output_mode="int",
        output_sequence_length=500,
    )
    
    text_ds = raw_train_ds.map(lambda x, y: x)
    vectorize_layer.adapt(text_ds)

    def vectorize_text(text, label):
        text = tf.expand_dims(text, -1)
        return vectorize_layer(text), label

    # vectorize the data
    train_ds = raw_train_ds.map(vectorize_text)
    val_ds = raw_val_ds.map(vectorize_text)
    test_ds = raw_test_ds.map(vectorize_text)

    return train_ds, val_ds, test_ds

train_ds, val_ds, test_ds = create_datasets()

inputs = tf.keras.Input(shape=(None,), dtype="int64")

from bigdl.nano.tf.keras.layers import Embedding # import Embedding here
x = Embedding(input_dim=20000, output_dim=128)(inputs)

from tensorflow.keras import layers
from bigdl.nano.tf.keras import Model # import Model here

def make_backbone():
    inputs = tf.keras.Input(shape=(None, 128))
    x = layers.Dropout(0.5)(inputs)
    x = layers.Conv1D(128, 7, padding="valid", activation="relu", strides=3)(x)
    x = layers.Conv1D(128, 7, padding="valid", activation="relu", strides=3)(x)
    x = layers.GlobalMaxPooling1D()(x)
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    predictions = layers.Dense(1, activation="sigmoid", name="predictions")(x)

    model = Model(inputs, predictions)
    return model

from bigdl.nano.tf.optimizers import SparseAdam # import SparseAdam here
predictions = make_backbone()(x)
model = Model(inputs, predictions)

model.compile(loss="binary_crossentropy", optimizer=SparseAdam(), metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=1) # <= error occurs here
model.evaluate(test_ds)

Error:

2022-09-26 15:07:11.929913: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at lookup_table_op.cc:911 : NOT_FOUND: Resource localhost/1576/N10tensorflow6lookup15LookupInterfaceE does not exist.
Traceback (most recent call last):
  File "test/test2.py", line 74, in <module>
    model.fit(train_ds, validation_data=val_ds, epochs=1)
  File "/home/yuwen/BigDL/python/nano/src/bigdl/nano/tf/keras/training_utils.py", line 118, in fit
    return self.fit_old(**fit_kwargs)
  File "/home/yuwen/miniconda3/envs/temp3/lib/python3.7/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/yuwen/miniconda3/envs/temp3/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.NotFoundError:  Resource localhost/1576/N10tensorflow6lookup15LookupInterfaceE does not exist.
         [[{{node text_vectorization/string_lookup/None_Lookup/LookupTableFindV2}}]]
         [[IteratorGetNext]] [Op:__inference_train_function_2897]

which prevents the fit function from running successfully.

Oscilloscope98 avatar Sep 26 '22 07:09 Oscilloscope98

Maybe we should ask users to add import bigdl.nano.tf at the top of their main file.
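
As a rough sketch of what that suggestion would mean for a script, the helper below checks whether bigdl.nano.tf (or one of its submodules) is imported before any other non-import statement in a source file. The function `nano_imported_first` and the sample sources are hypothetical illustrations, not part of BigDL, and whether this import order fully avoids the errors above is an assumption that has not been verified here. The check only parses the text with the standard `ast` module, so it runs without TensorFlow or BigDL installed.

```python
import ast

NANO_TF = "bigdl.nano.tf"


def _is_nano_tf(name):
    # Matches the package itself and any submodule such as bigdl.nano.tf.keras.
    return name == NANO_TF or name.startswith(NANO_TF + ".")


def nano_imported_first(source):
    """Return True if bigdl.nano.tf is imported before any top-level
    statement other than imports (a module docstring is skipped)."""
    tree = ast.parse(source)
    body = tree.body
    if ast.get_docstring(tree) is not None:
        body = body[1:]
    for node in body:
        if isinstance(node, ast.Import):
            if any(_is_nano_tf(alias.name) for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            if node.level == 0 and _is_nano_tf(node.module or ""):
                return True
        else:
            # Some other statement runs before the Nano import.
            return False
    return False


# Layout following the suggestion: Nano is imported first.
GOOD = """\
import bigdl.nano.tf  # assumed workaround: first import in the main file
import tensorflow as tf

ds = tf.data.Dataset.range(3)
from bigdl.nano.tf.keras import Model
"""

# Layout from the issue: TensorFlow is used before Nano is imported.
BAD = """\
import tensorflow as tf

ds = tf.data.Dataset.range(3)
from bigdl.nano.tf.keras import Model
"""

print(nano_imported_first(GOOD))  # True
print(nano_imported_first(BAD))   # False
```

The check is deliberately conservative: it also rejects files that only define functions before the Nano import, even though defining a function does not by itself run any TensorFlow op.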

yangw1234 avatar Sep 26 '22 22:09 yangw1234