TSFD Could you please help analyze what went wrong with the reproduction process

@Mr-TalhaIlyas Hello! Thank you for open-sourcing the code for such exciting work. I ran into some issues while reproducing your TSFD work. The evaluation metrics I obtained from my reproduction are as follows:

Metric	Value
loss	0.0427
clf_out_loss	0.012
seg_out_loss	0.1062
inst_out_loss	-0.0975
clf_out_accuracy	0.5502
seg_out_mean_iou	0.261
inst_out_mean_iou	0.3711

When using the weight file model.h5 that you provided, the results are as follows:

Metric	Value
loss	-0.0267
clf_out_loss	0.0015
seg_out_loss	0.0774
inst_out_loss	-0.1086
clf_out_accuracy	0.9345
seg_out_mean_iou	0.3388
inst_out_mean_iou	0.4078

We have carefully checked every step of the code and configured it according to the suggestions in your paper, but the results are still disappointing, as shown in the tables above. We noticed that the loss had not fully converged. The number of epochs is currently set to 150. Is this result caused by training for too few epochs? If so, how many epochs would be appropriate? If not, do you have any other suggestions?

I would be extremely grateful if you could provide some suggestions to help me reproduce the results successfully. Thank you!

May 22 '23 03:05 uiloatoat

Hi @uiloatoat, I checked again, but everything on my side is working fine. I can think of a few things:

How did you prepare the dataset? Did you use the same repo that I used? What about the labels: did you swap the channels so that BG is the first channel?
When did you download the data?
I just checked the original repository of the dataset; it has been updated 7 times as of now. I downloaded the data when v1 was the latest version, and now v7 is the latest. Kindly have a look at Fig. 7 in both papers: the distribution of the dataset has changed greatly. So the problem might be due to the changed distribution.
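
The two checks above (background channel order and class distribution) can be sketched roughly as follows. This is only an illustrative NumPy sketch, not code from the TSFD or preprocessing repos: the function names `background_first` and `class_pixel_fractions` are hypothetical, and it assumes the masks are stored channels-last with the background in the last channel.

```python
import numpy as np

def background_first(masks: np.ndarray) -> np.ndarray:
    """Move the last channel (assumed here to be background) to index 0."""
    return np.concatenate([masks[..., -1:], masks[..., :-1]], axis=-1)

def class_pixel_fractions(masks: np.ndarray) -> np.ndarray:
    """Share of labelled (non-zero) pixels that falls in each channel."""
    counts = (masks > 0).reshape(-1, masks.shape[-1]).sum(axis=0)
    total = counts.sum()
    if total == 0:
        # No labelled pixels at all: return zeros rather than dividing by zero.
        return np.zeros(masks.shape[-1])
    return counts / total

# Toy array standing in for a real mask file (hypothetical data).
demo = np.zeros((1, 2, 2, 3))
demo[..., -1] = 1       # last channel plays the role of background
demo[0, 0, 0, 0] = 5    # one labelled pixel in the first class channel
swapped = background_first(demo)
print(class_pixel_fractions(swapped))
```

Running `class_pixel_fractions` on masks from two dataset versions and comparing the outputs is one quick way to see whether the class distribution has shifted.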

May 23 '23 03:05 Mr-TalhaIlyas

Dear @Mr-TalhaIlyas: Thank you for the reminder! I would not have noticed the dataset version change without it. I downloaded the data two weeks ago. The difference in dataset versions did indeed lead to a noticeable disparity in the data distribution. I have repeated the replication using your PanNuke preprocessing repo for data preparation and splitting, without modifying any code. I then trained TSFD-net. The only modification I made was in loss.py, because the configuration of the loss function in your repo differs from the one presented in your paper. The modified loss.py is as follows:

def FocalLoss(y_true, y_pred):    

    alpha = 0.3
    gamma = 5
    ...

    return focal_loss

'''
Seg losses
'''
def SEG_Loss(y_true, y_pred):

  loss = FocalTverskyLoss(y_true, y_pred, smooth=1e-6) + [0.4 * Weighted_BCEnDice_loss(y_true, y_pred)]

  return tf.math.reduce_mean(loss)
  
def INST_Loss(y_true, y_pred):

  loss = FocalTverskyLoss(y_true, y_pred, smooth=1e-6) + [0.4 * Combo_loss(y_true, y_pred)]

  return tf.math.reduce_mean(loss)


def FocalTverskyLoss(y_true, y_pred, smooth=1e-6):
        
        if y_pred.shape[-1] <= 1:
            alpha = 0.3
            beta = 0.7
            gamma = 5 #5.
            y_pred = tf.keras.activations.sigmoid(y_pred)

        elif y_pred.shape[-1] >= 2:
            alpha = 0.3
            beta = 0.7
            gamma = 3 #3.
            y_pred = tf.keras.activations.softmax(y_pred, axis=-1)
            y_true = K.squeeze(y_true, 3)
            y_true = tf.cast(y_true, "int32")
            y_true = tf.one_hot(y_true, num_class, axis=-1)
        
        ...
        
        return FocalTversky


def Combo_loss(y_true, y_pred, smooth=1):
 
 e = K.epsilon()
 if y_pred.shape[-1] <= 1:
   ALPHA = 0.4    # < 0.5 penalises FP more, > 0.5 penalises FN more
   CE_RATIO = 0.7 # weighted contribution of modified CE loss compared to Dice loss
   y_pred = tf.keras.activations.sigmoid(y_pred)
 elif y_pred.shape[-1] >= 2:
   ALPHA = 0.3    # < 0.5 penalises FP more, > 0.5 penalises FN more
   CE_RATIO = 0.7 # weighted contribution of modified CE loss compared to Dice loss
   y_pred = tf.keras.activations.softmax(y_pred, axis=-1)
   y_true = K.squeeze(y_true, 3)
   y_true = tf.cast(y_true, "int32")
   y_true = tf.one_hot(y_true, num_class, axis=-1)
   
 ...
 
 return combo
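
For readers following along, the focal Tversky term whose body is elided above can be illustrated with a minimal NumPy sketch. This is not the repo's implementation: it assumes the common convention in which `alpha` weights false positives, `beta` weights false negatives, and the focal exponent is applied as `(1 - TI) ** gamma`; the function name `focal_tversky` and its flattened, already-activated inputs are assumptions made for illustration only.

```python
import numpy as np

def focal_tversky(y_true, y_prob, alpha=0.3, beta=0.7, gamma=3.0, smooth=1e-6):
    """Focal Tversky loss on flattened probabilities (illustrative convention)."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_prob = np.asarray(y_prob, dtype=np.float64).ravel()
    tp = np.sum(y_true * y_prob)          # soft true positives
    fp = np.sum((1.0 - y_true) * y_prob)  # soft false positives
    fn = np.sum(y_true * (1.0 - y_prob))  # soft false negatives
    tversky = (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)
    # gamma > 1 shrinks the loss for examples that are already well segmented.
    return (1.0 - tversky) ** gamma

labels = [1, 0, 1, 0]
uncertain = [0.5, 0.5, 0.5, 0.5]
print(focal_tversky(labels, uncertain, gamma=3.0))
print(focal_tversky(labels, uncertain, gamma=5.0))
```

Under this convention the same prediction yields a smaller loss value with gamma = 5 than with gamma = 3, so the two branches in the modified loss.py are not on the same scale when they are summed with the 0.4-weighted second term.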

The results after training are as follows:

Nuclei Type	PQ
Neoplastic	0.5051
Inflammatory	0.4023
Connective	0.3721
Dead	0.0995
Non-Neoplastic	0.5047

Tissue Type	mPQ	bPQ
Adrenal_gland	0.4659	0.6454
Bile-duct	0.4846	0.6219
Bladder	0.5171	0.6569
Breast	0.4664	0.5989
Cervix	0.4856	0.6288
Colon	0.3847	0.4969
Esophagus	0.5133	0.6035
HeadNeck	0.4639	0.5774
Kidney	0.4524	0.6243
Liver	0.4949	0.6541
Lung	0.3334	0.5298
Ovarian	0.4864	0.6156
Pancreatic	0.4808	0.5943
Prostate	0.4709	0.6243
Skin	0.331	0.569
Stomach	0.3299	0.5853
Testis	0.4562	0.5997
Thyroid	0.4048	0.6224
Uterus	0.4375	0.5905
Average	0.4453	0.602

There is still a gap between these results and those of your implementation. Could you please help check whether this loss configuration is reasonable? Are there any other factors that could be affecting the results?

Your work has inspired us a lot. We look forward to your guidance. Thank you again!

May 25 '23 08:05 uiloatoat
