Unable to use flow_from_dataframe - y_col must be str,list,tuple

Open obiii opened this issue 5 years ago • 1 comment

Hi,

I am trying to train a multi-task CNN using flow_from_dataframe. The columns in the dataframe already hold str values, but dtypes reports "object" no matter what I use to convert them to strings. It seems pandas uses object even for str now.

The dataframe has these columns:

Image       PFRType  FuelType
image1.jpg  1-3      NG

Image       object
PFRType     object
FuelType    object
dir         object
dtype: object
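For context, pandas has commonly reported columns of Python strings as `object`, so the dtype listing by itself does not mean the values are not strings. A minimal sketch of checking the values directly, assuming a made-up one-row frame shaped like the one above (the frame and the `all_str` name are illustrative, not taken from the original data):

```python
import pandas as pd

# Hypothetical sample row mirroring the columns shown above.
df = pd.DataFrame(
    {"Image": ["image1.jpg"], "PFRType": ["1-3"], "FuelType": ["NG"]}
)

# Inspect the Python type of every value instead of relying on the column dtype.
all_str = {
    col: bool(df[col].map(lambda v: isinstance(v, str)).all())
    for col in df.columns
}
print(all_str)
```

If every entry is `True`, the column values are already plain strings and no further conversion should be needed on the pandas side.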

And I get this error:

If class_mode="sparse", y_col="['PFRType', 'FuelType']" column values must be strings.

Here is the code for the generator:

from keras.preprocessing.image import ImageDataGenerator

# trainLabel is the dataframe shown above
trainGen = ImageDataGenerator()
trainGenDf = trainGen.flow_from_dataframe(trainLabel,
                                         directory='../MTLData/train/',
                                         x_col="Image", y_col=['PFRType', 'FuelType'],
                                         class_mode='sparse',
                                         target_size=(224, 224),
                                         batch_size=32)

I am using Keras version 2.3.1. Can someone please help?
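One plausible reading of the error is that the string check is applied to `df[y_col]` as a whole. The sketch below is an assumption about the shape of that validation, not code copied from keras-preprocessing, and `values_are_str` is a hypothetical helper name. It shows how such a check could pass for a single column name yet fail for a list of column names, even when every cell is a string:

```python
import pandas as pd

# Hypothetical one-row frame with the two label columns from the question.
df = pd.DataFrame({"PFRType": ["1-3"], "FuelType": ["NG"]})


def values_are_str(frame, y_col):
    # Assumed shape of the check; the real library code may differ.
    return all(frame[y_col].apply(lambda x: isinstance(x, str)))


# A single column name selects a Series, so the lambda sees each cell value.
single = values_are_str(df, "PFRType")

# A list of names selects a DataFrame, so the lambda sees each column
# as a whole Series, which is never a str.
multi = values_are_str(df, ["PFRType", "FuelType"])

print(single, multi)
```

Under that assumption, converting the cells to strings cannot help, because the failure comes from passing several column names, not from the cell types.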

obiii, Mar 11 '20 10:03

I know this is a very old question on a defunct message board, but since it still shows up in search results (and I was having a similar issue), here is the solution I found: first combine the multiple label columns into a single new dataframe column whose values are themselves lists or tuples.

dataframe['combined_classes'] = dataframe[['PFRType', 'FuelType']].apply(lambda x: x.tolist(), axis=1)
trainGen = ImageDataGenerator()
trainGenDf = trainGen.flow_from_dataframe(dataframe,
                                         directory = '../MTLData/train/',
                                         x_col = "Image",
                                         y_col='combined_classes',
                                         class_mode='categorical',  # list-valued labels; 'sparse' requires plain strings
                                         target_size=(224,224),
                                         batch_size=32)
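To make the column-combining step concrete, here is a minimal pandas-only sketch. The rows are invented for illustration (only `image1.jpg`, `1-3`, and `NG` appear in the question), and it assumes a pandas version in which a row-wise `apply` that returns lists yields a Series of lists:

```python
import pandas as pd

# Hypothetical rows; the second row is invented for illustration.
dataframe = pd.DataFrame(
    {
        "Image": ["image1.jpg", "image2.jpg"],
        "PFRType": ["1-3", "4-6"],
        "FuelType": ["NG", "Coal"],
    }
)

# Double brackets select a sub-frame; axis=1 calls the lambda once per row,
# so each row's two labels end up in one list.
dataframe["combined_classes"] = dataframe[["PFRType", "FuelType"]].apply(
    lambda row: row.tolist(), axis=1
)

combined = dataframe["combined_classes"].tolist()
print(combined)  # [['1-3', 'NG'], ['4-6', 'Coal']]
```

Each cell of `combined_classes` is now a list, so `y_col` can be a single column name. Which `class_mode` accepts list-valued labels is worth verifying against the installed Keras version.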

I'm sure you're not still working on this, but wanted to share my workaround anyways in case anyone else was looking for the answer like I was.

HanClinto, Jul 20 '22 15:07