Please explain label_mode='int' in keras.utils.image_dataset_from_directory

This is a question rather than a bug report. I train my model on images loaded into a dataset with keras.utils.image_dataset_from_directory:

train_ds, val_ds = keras.utils.image_dataset_from_directory(
    directory=ds_path,
    labels='inferred',
    label_mode='int',
    color_mode = 'rgb',
    batch_size=BS,
    image_size=(IMG_SIZE, IMG_SIZE),
    interpolation = 'bilinear',
    shuffle=True,
    seed=314,
    validation_split = 0.2,
    subset = 'both'
    )

My data directory contains folders named 0, 1, 2, ... 10, which are class numbers, not class names. The training process is simple:

keras.backend.clear_session()
# load base model
base_model = keras.applications.EfficientNetV2L(
    include_top=False,
    weights="imagenet",
    input_tensor=None,
    input_shape=INPUT_SHAPE,
    pooling=None,
    classes=1000,
    classifier_activation="softmax",
    include_preprocessing=True,
)
# freeze layers
base_model.trainable = False
# add head
inputs = keras.Input(shape=INPUT_SHAPE)
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(CLASS_CNT)(x)
model = keras.Model(inputs, outputs)
# best epoch saving
ModelCheckpoint = keras.callbacks.ModelCheckpoint(
    filepath=colab_path,
    monitor="val_accuracy",
    verbose=0,
    save_best_only=True,
    save_weights_only=False,
    mode="auto",
    save_freq="epoch",
    initial_value_threshold=None,
)
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=lr),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy(), "accuracy"]
)
head_train_hist = model.fit(
    train_ds,
    epochs=5,
    validation_data=val_ds,
    callbacks=[ModelCheckpoint]
)

Both metrics report the same values during training.

Manual evaluation on the test dataset is consistent with the training results (sorry for the lame code below). test_dir has the same structure as the training dataset: folders named 0, 1, ... 10 with images in them.

best_model_path = colab_path
model = tf.keras.models.load_model(best_model_path)
# create d-set
test_ds = keras.utils.image_dataset_from_directory(
    directory=test_dir, 
    labels='inferred',
    label_mode='int',
    color_mode = 'rgb',
    batch_size=BS,
    image_size=(IMG_SIZE, IMG_SIZE),
    interpolation = 'bilinear',
    shuffle=True,
    seed=314,
    )

y_pred = []
y_true = []

for image_batch, label_batch in test_ds:
  for lbl in label_batch:
    y_true.append(lbl)
  preds = model.predict(image_batch)
  for lbl in preds:
    y_pred.append(np.argmax(lbl, axis = - 1))
tf.math.confusion_matrix(y_true, y_pred, num_classes= CLASS_CNT)

The question is: what exactly does np.argmax(lbl, axis = - 1) return when I get, for instance, 5? Is it the folder name, or is it the index of the class in the alphanumerically ordered class labels converted to strings, that is '0', '1', '10', '2', ... '9'?

In the first case, when np.argmax(lbl, axis = - 1) returns 5, I take the image to belong to the class stored in folder 5. In the second case, when it returns 5, I take the image to belong to the class stored in folder 4 (position 5 in the sorted list '0', '1', '10', '2', '3', '4', ...).

In the first case, the 3rd row of tf.math.confusion_matrix(y_true, y_pred) corresponds to images in folder 2. In the second case, the 3rd row corresponds to images in folder 10.
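If the second case holds (labels follow the string-sorted folder names), the confusion matrix can be permuted back into numeric folder order. The following is a minimal pure-Python sketch under that assumption; `reorder_confusion_matrix` is a hypothetical helper, and the toy matrix stands in for the result of tf.math.confusion_matrix converted to nested lists.

```python
def reorder_confusion_matrix(cm, class_names):
    """Permute rows and columns so that position k corresponds to folder k."""
    # order[k] is the label index assigned to the folder named str(k).
    order = [class_names.index(str(k)) for k in range(len(class_names))]
    return [[cm[r][c] for c in order] for r in order]

# Assumption: 11 folders named "0" .. "10", labels assigned in string-sorted order.
class_names = sorted(str(i) for i in range(11))
print(class_names[2])  # folder behind the 3rd row in label-index order

# Toy matrix: a perfect classifier with (label index + 1) samples per class.
n = len(class_names)
cm = [[(i + 1) if i == j else 0 for j in range(n)] for i in range(n)]

cm_by_folder = reorder_confusion_matrix(cm, class_names)
print(cm_by_folder[10][10])  # count for folder 10, taken from label index 2
```

With a real matrix, the same permutation could be applied to a NumPy array by indexing rows and columns with `order`.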

May 26 '24 10:05 satyrmipt

I asked this question because when I evaluate the model on a single image like this:

    img = Image.open(img_name) # from PIL import Image
    img = img.resize((IMG_SIZE, IMG_SIZE), resample=PIL.Image.BILINEAR) # import PIL
    img=img.convert('RGB')
    img = np.expand_dims(np.asarray(img), axis=0)  # model was trained on batches so add one dimension for batch
    logits_array=tl_head_model.predict(img, verbose=0)[0]
    y_pred= np.argmax(logits_array)

my predictions are correct for images in folders 0 and 1 and shifted for all other folders: expected 3, got 4; expected 4, got 5; expected 2, got 10.
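A shift of this kind is what string-sorted folder names would produce. As a rough sketch, assuming the 11 folders "0" .. "10" and labels assigned in the sorted order of those names, the mapping between folders and label indices can be tabulated and used to decode a prediction. `decode_prediction` is a hypothetical helper with a pure-Python argmax, and `example_logits` is made-up data.

```python
# Assumption: 11 class folders named "0" .. "10", as described in the question.
folder_names = [str(i) for i in range(11)]

# Sorting the names as strings puts "10" right after "1".
class_names = sorted(folder_names)
folder_to_index = {name: idx for idx, name in enumerate(class_names)}

def decode_prediction(logits):
    """Return the folder name for the largest logit (pure-Python argmax)."""
    pred_index = max(range(len(logits)), key=lambda i: logits[i])
    return class_names[pred_index]

for folder in ["0", "1", "3", "4", "10"]:
    print(f"folder {folder} -> label index {folder_to_index[folder]}")

# A fake logit vector whose largest entry is at index 4.
example_logits = [0.0] * 11
example_logits[4] = 9.0
print(decode_prediction(example_logits))
```

Under this assumption, folders 0 and 1 keep their own number as label index, while folder 3 gets index 4 and folder 4 gets index 5, so the raw argmax has to be looked up in `class_names` before comparing it with a folder name.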

May 26 '24 10:05 satyrmipt

Hi @satyrmipt -

In keras.utils.image_dataset_from_directory, label_mode is a string describing how the labels are encoded. label_mode='int' means the labels are encoded as integers (for example, for a sparse categorical crossentropy loss). label_mode='categorical' means the labels are encoded as categorical (one-hot) vectors (for example, for a categorical crossentropy loss). See the keras.utils.image_dataset_from_directory documentation for more details.
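To make the two encodings concrete, here is a small pure-Python illustration of what a single label looks like in each mode. This is not Keras code: `encode_int` and `encode_categorical` are hypothetical helpers, and the 11 classes are taken from the question.

```python
def encode_int(class_index):
    """label_mode='int': the label is the class index itself."""
    return class_index

def encode_categorical(class_index, num_classes):
    """label_mode='categorical': the label is a one-hot vector."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]

num_classes = 11  # assumption: folders 0 .. 10 as in the question
int_label = encode_int(2)
onehot_label = encode_categorical(2, num_classes)

print(int_label)
print(onehot_label)
```

In both modes the encoded value refers to a class index, not to the folder name, so the question of which folder an index belongs to is the same for either encoding.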

Your folder names are treated as class names. So np.argmax(lbl, axis = - 1) returns the index of the class in the alphanumerically ordered class labels converted to strings; for example, it returns index 2 for class 10 (the folder named 10). This is because keras.utils.image_dataset_from_directory sorts labels according to the alphanumeric order of the image file paths.
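One possible way to avoid the mismatch is to fix the class order explicitly. The sketch below only builds the list; `numeric_class_names` is a hypothetical helper, and it assumes every folder name is a plain integer string.

```python
def numeric_class_names(folder_names):
    """Sort folder names by their integer value instead of as strings."""
    return sorted(folder_names, key=int)

# Folder names in the default string-sorted order.
folder_names = ["0", "1", "10", "2", "3", "4", "5", "6", "7", "8", "9"]

class_names = numeric_class_names(folder_names)
print(class_names)

# With this order, label index i corresponds to the folder named str(i).
index_matches_folder = all(
    class_names[i] == str(i) for i in range(len(class_names))
)
print(index_matches_folder)
```

Such a list could then be passed as the class_names argument of keras.utils.image_dataset_from_directory, which is documented as controlling the order of the classes when labels='inferred'. It would need to be used for the training, validation and test datasets alike, and a model trained with the default order would have to be retrained. Alternatively, the existing model can be kept and each predicted index looked up in the string-sorted list of folder names.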

May 29 '24 07:05 mehtamansi29

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

Jun 19 '24 01:06 github-actions[bot]

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

Jul 04 '24 01:07 github-actions[bot]
