TSFD
Could you please help analyze what went wrong with the reproduction process
@Mr-TalhaIlyas Hello! Thank you for open-sourcing such exciting work. I ran into some issues while replicating TSFD. The evaluation metrics I obtained from my replication are as follows:
| Metric | Value |
|---|---|
| loss | 0.0427 |
| clf_out_loss | 0.012 |
| seg_out_loss | 0.1062 |
| inst_out_loss | -0.0975 |
| clf_out_accuracy | 0.5502 |
| seg_out_mean_iou | 0.261 |
| inst_out_mean_iou | 0.3711 |
When using the weight file model.h5 that you provided, the results are as follows:
| Metric | Value |
|---|---|
| loss | -0.0267 |
| clf_out_loss | 0.0015 |
| seg_out_loss | 0.0774 |
| inst_out_loss | -0.1086 |
| clf_out_accuracy | 0.9345 |
| seg_out_mean_iou | 0.3388 |
| inst_out_mean_iou | 0.4078 |
We have carefully checked every step of the code and configured it according to the suggestions in your paper, but the results are still disappointing, as shown in the tables above. We noticed that the loss did not fully converge. The number of epochs is currently set to 150. Is this result caused by too few epochs? If so, how many epochs would be appropriate? If not, do you have any other suggestions?
I would be extremely grateful if you could offer some suggestions to help me reproduce the results successfully. Thank you!
Hi @uiloatoat, I checked again, but everything is working fine on my side. I can think of a few things:
- How did you prepare the dataset? Did you use the same repo that I used? What about the labels: did you swap the channels so that BG (background) is the first channel?
- When did you download the data?
- I just checked the original repository of the dataset, and it has been updated 7 times so far. I downloaded the data when v1 was the latest version; now v7 is the latest. Kindly have a look at Fig. 7 in both papers: the distribution of the dataset has changed greatly. So the problem might be due to the changed distribution.
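The channel swap mentioned in the first point can be sketched as follows. This is only an illustration, not the code from the preprocessing repo: it assumes the masks are a NumPy array of shape `(N, H, W, C)` with the background stored in the last channel, and `background_first` is a hypothetical helper name.

```python
import numpy as np

def background_first(masks):
    """Move the last channel (assumed to be background) to index 0.

    masks: array of shape (N, H, W, C). The remaining channels keep
    their relative order and shift one position to the right.
    """
    return np.concatenate([masks[..., -1:], masks[..., :-1]], axis=-1)

# Tiny example: one 2x2 image with 6 channels, background in channel 5.
masks = np.zeros((1, 2, 2, 6), dtype=np.uint8)
masks[..., 5] = 1          # every pixel is background
masks[0, 0, 0, 5] = 0
masks[0, 0, 0, 0] = 1      # except one pixel of the first nucleus class

swapped = background_first(masks)
```

After the swap, `swapped[..., 0]` holds the background and the original channel 0 sits at index 1, so a quick check of `swapped[..., 0].sum()` against the expected number of background pixels can confirm the ordering.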
Dear @Mr-TalhaIlyas:
Thank you for the reminder! I would not have noticed the dataset version change without it. I downloaded the data two weeks ago. The difference between dataset versions did indeed lead to a noticeable disparity in the data distribution.
I have reattempted the replication using your PanNuke preprocessing repo for data preparation and splitting, without modifying any code. I then trained TSFD-net. The only modification I made was in loss.py, because the configuration of the loss function in your repo differs from the one presented in your paper. The modified loss.py is as follows:
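One way to quantify such a disparity is to compare per-class frequencies between two dataset versions. The sketch below is a rough illustration only: it assumes each version can be reduced to a flat list of integer class ids (for example one id per annotated nucleus), and the helper names and toy values are hypothetical, not taken from the dataset.

```python
from collections import Counter

def class_fractions(label_ids, num_classes):
    """Return the fraction of labels belonging to each class id."""
    counts = Counter(label_ids)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * num_classes
    return [counts.get(c, 0) / total for c in range(num_classes)]

def max_fraction_shift(old_fractions, new_fractions):
    """Largest absolute change in any class fraction between versions."""
    return max(abs(a - b) for a, b in zip(old_fractions, new_fractions))

# Toy labels standing in for two dataset versions (made-up values).
old_version = [0, 0, 1, 2]
new_version = [0, 1, 1, 1]

old_frac = class_fractions(old_version, num_classes=3)
new_frac = class_fractions(new_version, num_classes=3)
shift = max_fraction_shift(old_frac, new_frac)
```

A large `shift` for any class would suggest that metrics obtained on the two versions are not directly comparable.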
def FocalLoss(y_true, y_pred):
    alpha = 0.3
    gamma = 5
    ...
    return focal_loss

'''
Seg losses
'''
def SEG_Loss(y_true, y_pred):
    loss = FocalTverskyLoss(y_true, y_pred, smooth=1e-6) + [0.4 * Weighted_BCEnDice_loss(y_true, y_pred)]
    return tf.math.reduce_mean(loss)

def INST_Loss(y_true, y_pred):
    loss = FocalTverskyLoss(y_true, y_pred, smooth=1e-6) + [0.4 * Combo_loss(y_true, y_pred)]
    return tf.math.reduce_mean(loss)

def FocalTverskyLoss(y_true, y_pred, smooth=1e-6):
    if y_pred.shape[-1] <= 1:
        alpha = 0.3
        beta = 0.7
        gamma = 5 #5.
        y_pred = tf.keras.activations.sigmoid(y_pred)
    elif y_pred.shape[-1] >= 2:
        alpha = 0.3
        beta = 0.7
        gamma = 3 #3.
        y_pred = tf.keras.activations.softmax(y_pred, axis=-1)
        y_true = K.squeeze(y_true, 3)
        y_true = tf.cast(y_true, "int32")
        y_true = tf.one_hot(y_true, num_class, axis=-1)
    ...
    return FocalTversky

def Combo_loss(y_true, y_pred, smooth=1):
    e = K.epsilon()
    if y_pred.shape[-1] <= 1:
        ALPHA = 0.4 # < 0.5 penalises FP more, > 0.5 penalises FN more
        CE_RATIO = 0.7 # weighted contribution of modified CE loss compared to Dice loss
        y_pred = tf.keras.activations.sigmoid(y_pred)
    elif y_pred.shape[-1] >= 2:
        ALPHA = 0.3 # < 0.5 penalises FP more, > 0.5 penalises FN more
        CE_RATIO = 0.7 # weighted contribution of modified CE loss compared to Dice loss
        y_pred = tf.keras.activations.softmax(y_pred, axis=-1)
        y_true = K.squeeze(y_true, 3)
        y_true = tf.cast(y_true, "int32")
        y_true = tf.one_hot(y_true, num_class, axis=-1)
    ...
    return combo
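To make the roles of `alpha`, `beta` and `gamma` concrete, here is a minimal pure-Python sketch of one common Focal Tversky formulation. It does not reproduce the elided body of loss.py: weighting false positives by `alpha` and false negatives by `beta`, and using `gamma` as a direct exponent, are assumptions, and conventions differ between implementations (some use `1/gamma` as the exponent).

```python
def tversky_index(y_true, y_pred, alpha=0.3, beta=0.7, smooth=1e-6):
    """Tversky index for flat lists of targets (0/1) and probabilities.

    Assumption: alpha weights false positives, beta weights false negatives.
    """
    tp = sum(t * p for t, p in zip(y_true, y_pred))
    fp = sum((1 - t) * p for t, p in zip(y_true, y_pred))
    fn = sum(t * (1 - p) for t, p in zip(y_true, y_pred))
    return (tp + smooth) / (tp + alpha * fp + beta * fn + smooth)

def focal_tversky(y_true, y_pred, alpha=0.3, beta=0.7, gamma=3.0, smooth=1e-6):
    """Focal Tversky loss as (1 - TI) ** gamma (one of several conventions)."""
    ti = tversky_index(y_true, y_pred, alpha, beta, smooth)
    return (1.0 - ti) ** gamma

# One true pixel is missed: TP = 1, FN = 1, FP = 0.
target = [1, 1, 0, 0]
prediction = [1, 0, 0, 0]

ti = tversky_index(target, prediction)
loss_gamma_1 = focal_tversky(target, prediction, gamma=1.0)
loss_gamma_3 = focal_tversky(target, prediction, gamma=3.0)
loss_perfect = focal_tversky(target, target, gamma=3.0)
```

Under this convention, because `1 - TI` lies between 0 and 1, a larger `gamma` shrinks the loss for the same prediction, so the choice of exponent changes how strongly each term contributes once it is combined with the `0.4`-weighted second loss.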
The results after training are as follows:
| Nuclei Type | PQ |
|---|---|
| Neoplastic | 0.5051 |
| Inflammatory | 0.4023 |
| Connective | 0.3721 |
| Dead | 0.0995 |
| Non-Neoplastic | 0.5047 |

| Tissue Type | mPQ | bPQ |
|---|---|---|
| Adrenal_gland | 0.4659 | 0.6454 |
| Bile-duct | 0.4846 | 0.6219 |
| Bladder | 0.5171 | 0.6569 |
| Breast | 0.4664 | 0.5989 |
| Cervix | 0.4856 | 0.6288 |
| Colon | 0.3847 | 0.4969 |
| Esophagus | 0.5133 | 0.6035 |
| HeadNeck | 0.4639 | 0.5774 |
| Kidney | 0.4524 | 0.6243 |
| Liver | 0.4949 | 0.6541 |
| Lung | 0.3334 | 0.5298 |
| Ovarian | 0.4864 | 0.6156 |
| Pancreatic | 0.4808 | 0.5943 |
| Prostate | 0.4709 | 0.6243 |
| Skin | 0.331 | 0.569 |
| Stomach | 0.3299 | 0.5853 |
| Testis | 0.4562 | 0.5997 |
| Thyroid | 0.4048 | 0.6224 |
| Uterus | 0.4375 | 0.5905 |
| Average | 0.4453 | 0.602 |
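For readers comparing these numbers, the sketch below shows the standard panoptic quality (PQ) definition on toy data. It is not the evaluation script used to produce the tables above: representing each instance as a set of pixel coordinates and the helper names are assumptions made for illustration. On that reading, bPQ would treat all nuclei as a single class and mPQ would average PQ over the nuclei classes.

```python
def iou(a, b):
    """Intersection over union of two instances given as pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def panoptic_quality(gt_instances, pred_instances, iou_threshold=0.5):
    """PQ = sum of matched IoUs / (TP + 0.5 * FP + 0.5 * FN).

    A ground-truth and a predicted instance match when IoU > 0.5,
    which makes the matching unique.
    """
    matched_pred = set()
    iou_sum = 0.0
    tp = 0
    for gt in gt_instances:
        for j, pred in enumerate(pred_instances):
            if j in matched_pred:
                continue
            score = iou(gt, pred)
            if score > iou_threshold:
                matched_pred.add(j)
                iou_sum += score
                tp += 1
                break
    fp = len(pred_instances) - tp
    fn = len(gt_instances) - tp
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0

# Two ground-truth nuclei, two predictions; only the first pair overlaps.
gt = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6)}]
pred = [{(0, 0), (0, 1), (1, 0)}, {(9, 9)}]

pq = panoptic_quality(gt, pred)  # IoU 0.75 over (1 + 0.5 + 0.5) = 0.375
```

Because unmatched predictions and unmatched ground-truth instances both enter the denominator, rare classes with few detections (such as Dead in the first table) can pull PQ down sharply.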
There is still a gap between these results and those of your implementation. Could you please check whether this loss configuration is reasonable? Are there any other factors that could be affecting the results?
Your work inspires us a lot. Looking forward to your guidance. Thank you again!