
Exception when using the Orca Estimator to train a tensorflow.keras model with XShards of pandas DataFrames

Open dding3 opened this issue 2 years ago • 7 comments

The exception is:

****Usage Error
model_input number does not match data number, got model_input ['dense_input'], data [TensorMeta(dtype: int64, name: list_input_0, shape: ()), TensorMeta(dtype: int64, name: list_input_1, shape: ()), TensorMeta(dtype: int64, name: list_input_2, shape: ()), TensorMeta(dtype: int64, name: list_input_3, shape: ()), TensorMeta(dtype: int64, name: list_input_4, shape: ()), TensorMeta(dtype: float64, name: list_input_5, shape: ()), TensorMeta(dtype: float64, name: list_input_6, shape: ()), TensorMeta(dtype: int64, name: list_input_7, shape: ())]

***Call Stack
Traceback (most recent call last):
  File "/home/ding/proj/spark-dl-master/BigDL/python/orca/example/shard_keras_tutorial.py", line 46, in
    label_cols=['label'],
  File "/home/ding/.local/lib/python3.6/site-packages/bigdl/orca/learn/tf/estimator.py", line 893, in fit
    optimizer=self.optimizer)
  File "/home/ding/.local/lib/python3.6/site-packages/bigdl/orca/tfpark/tf_optimizer.py", line 631, in from_keras
    check_data_compatible(dataset, keras_model, mode="train")
  File "/home/ding/.local/lib/python3.6/site-packages/bigdl/orca/tfpark/tf_dataset.py", line 1324, in check_data_compatible
    _check_compatible(input_names, feature, data_type="model_input")
  File "/home/ding/.local/lib/python3.6/site-packages/bigdl/orca/tfpark/tf_dataset.py", line 1308, in _check_compatible
    invalidInputError(len(nest.flatten(structure)) == len(names), err_msg)
  File "/home/ding/.local/lib/python3.6/site-packages/bigdl/dllib/utils/log4Error.py", line 33, in invalidInputError
    raise RuntimeError(errMsg)
RuntimeError: model_input number does not match data number, got model_input ['dense_input'], data [TensorMeta(dtype: int64, name: list_input_0, shape: ()), TensorMeta(dtype: int64, name: list_input_1, shape: ()), TensorMeta(dtype: int64, name: list_input_2, shape: ()), TensorMeta(dtype: int64, name: list_input_3, shape: ()), TensorMeta(dtype: int64, name: list_input_4, shape: ()), TensorMeta(dtype: float64, name: list_input_5, shape: ()), TensorMeta(dtype: float64, name: list_input_6, shape: ()), TensorMeta(dtype: int64, name: list_input_7, shape: ())]
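
The failing check in the traceback compares the number of flattened data tensors with the number of model inputs: the model has one input (dense_input), while the data arrives as eight scalar tensors, one per feature column. The sketch below mimics that comparison in plain Python. Note that `flatten` and `check_compatible` are hypothetical stand-ins written for illustration, not BigDL's actual functions, and the name "x" for the stacked tensor is an assumption.

```python
def flatten(structure):
    """Simplified stand-in for nest.flatten: return the leaves of a nested structure."""
    if isinstance(structure, dict):
        leaves = []
        for key in sorted(structure):
            leaves.extend(flatten(structure[key]))
        return leaves
    if isinstance(structure, (list, tuple)):
        leaves = []
        for item in structure:
            leaves.extend(flatten(item))
        return leaves
    return [structure]


def check_compatible(names, structure, data_type="model_input"):
    """Hypothetical re-creation of the count check seen in the traceback."""
    flat = flatten(structure)
    if len(flat) != len(names):
        raise RuntimeError(
            f"{data_type} number does not match data number, "
            f"got {data_type} {names}, data {flat}"
        )


model_inputs = ["dense_input"]

# One scalar tensor per feature column, as reported in the error message.
per_column = [f"list_input_{i}" for i in range(8)]

# A single stacked feature tensor (the name "x" is an assumption).
stacked = ["x"]

try:
    check_compatible(model_inputs, per_column)
    mismatch_detected = False
except RuntimeError as err:
    mismatch_detected = True
    print(err)

# One model input against one data tensor: no exception is raised.
check_compatible(model_inputs, stacked)
```

Under these assumptions, eight per-column tensors fail the check against a single model input, while one stacked tensor passes it.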

It can be reproduced with the code below:


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

import bigdl.orca.data.pandas
from bigdl.orca import init_orca_context, stop_orca_context
from bigdl.orca.learn.tf.estimator import Estimator

init_orca_context(cluster_mode="local", cores=4, memory="3g")

path = 'pima-indians-diabetes-test.csv'
data_shard = bigdl.orca.data.pandas.read_csv(path)

model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

est = Estimator.from_keras(keras_model=model)
est.fit(data=data_shard,
        batch_size=16,
        epochs=150,
        feature_cols=['f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8'],
        label_cols=['label'],
        )
results = est.evaluate(data_shard)

The data can be downloaded from Almaren-Gateway:/mnt/md0/home/ding.ding/pima-indians-diabetes-test.csv

dding3, Jul 10 '22 23:07

@sgwhat please take a look

jason-dai, Jul 12 '22 23:07

Sure.

sgwhat, Jul 13 '22 01:07

This is similar to issue #4965.

shanyu-sys, Jul 13 '22 03:07

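Adding a transform function that stacks the feature columns into a single array resolves the error:

import numpy as np

def transform(df):
    # Stack the feature columns into one (n_rows, 8) array so the data
    # matches the model's single input. The '...' stands for the remaining
    # feature columns (f2 through f7).
    result = {
        "x": np.stack([df['f1'].to_numpy(), ..., df['f8'].to_numpy()], axis=1),
        "y": df['label'].to_numpy()}
    return result

# Apply the conversion to every partition of the XShards.
data_shard = data_shard.transform_shard(transform)

est = Estimator.from_keras(keras_model=model)
# feature_cols and label_cols are omitted: the shards now hold "x" and "y".
est.fit(data=data_shard,
        batch_size=16,
        epochs=150)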
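
To see why this workaround satisfies the check, the sketch below runs the same kind of transform on a small in-memory pandas DataFrame instead of an XShards partition. The column names come from the reproduction script above; the toy values and the `feature_cols` variable are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Column names taken from the reproduction script.
feature_cols = ['f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8']


def transform(df):
    # Stack the eight columns side by side: one row per sample, one column per feature.
    return {
        "x": np.stack([df[c].to_numpy() for c in feature_cols], axis=1),
        "y": df['label'].to_numpy(),
    }


# Toy data (made-up values): three rows, eight feature columns, one label column.
toy = pd.DataFrame(
    {c: [float(i), float(i) + 1.0, float(i) + 2.0] for i, c in enumerate(feature_cols)}
)
toy['label'] = [0, 1, 0]

result = transform(toy)
print(result["x"].shape)  # (3, 8)
print(result["y"].shape)  # (3,)
```

Each row of "x" now has shape (8,), which lines up with the input_shape=(8,) declared on the first Dense layer, and the data carries one feature tensor rather than eight.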

sgwhat, Jul 13 '22 03:07

Thank you for the reply. Can we move the conversion logic, data_shard = data_shard.transform_shard(transform), inside estimator.fit? It may not be easy to explain to users why they need to apply the transform before calling fit.
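
One way the conversion could be driven by feature_cols and label_cols is a small factory that builds the transform from the column lists. This is only a sketch: `make_transform` is a hypothetical helper, not part of BigDL, and whether or how Estimator.fit would call something like it internally is an assumption.

```python
import numpy as np
import pandas as pd


def make_transform(feature_cols, label_cols=None):
    """Hypothetical helper: build a DataFrame -> {"x": ..., "y": ...} converter."""

    def transform(df):
        # Features: one (n_rows, n_features) array.
        result = {"x": np.stack([df[c].to_numpy() for c in feature_cols], axis=1)}
        if label_cols:
            labels = [df[c].to_numpy() for c in label_cols]
            # A single label column stays one-dimensional; several are stacked.
            result["y"] = labels[0] if len(labels) == 1 else np.stack(labels, axis=1)
        return result

    return transform


# Toy check on a plain DataFrame (made-up values).
df = pd.DataFrame({"f1": [1.0, 2.0], "f2": [3.0, 4.0], "label": [0, 1]})
converted = make_transform(["f1", "f2"], ["label"])(df)
print(converted["x"].tolist())  # [[1.0, 3.0], [2.0, 4.0]]
print(converted["y"].tolist())  # [0, 1]
```

With XShards, the function returned by such a helper could be passed to transform_shard in the same way as the hand-written transform, so the user would only supply the column names.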

dding3, Jul 13 '22 03:07

For sure, we could do it.

sgwhat, Jul 13 '22 05:07

See https://github.com/intel-analytics/BigDL/issues/4965#issuecomment-1184515330

jason-dai, Jul 14 '22 14:07