
c-index value on validation set is very high

genhao3 opened this issue 2 years ago

In the BRCA dataset, the c-index of the training set is around 0.67, while the c-index of the validation set reaches 0.82+. Is this normal?

genhao3 avatar Jul 08 '22 09:07 genhao3

[screenshot attached, 2022-07-14]

genhao3 avatar Jul 14 '22 09:07 genhao3

Hi @genhao3,

To help you debug, in your implementation, what is the AMIL benchmark for survival analysis for TCGA-BRCA? What kind of extracted patch embeddings did you use? Attached below are results for TCGA-BRCA and TCGA-UCEC respectively for Patch-GCN. Is it high for all splits?

Regarding the validation loss function being unstable, I did observe some issues as well. Survival analysis is a very difficult problem that is prone to issues such as 1) high censorship and 2) differing survival and censorship distributions between the training and validation splits. For cancer types such as UCEC, which has lower censorship, I do observe that the validation c-index increases over training epochs, but not for BRCA, which is more difficult.

[TCGA-BRCA results image]

[TCGA-UCEC results image]
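
For intuition, here is roughly how the censored c-index behind these numbers is computed (a minimal sketch using scikit-survival, not the exact evaluation code in this repo). Note that censorship=1 means the event was not observed, so the event indicator passed to the metric is its complement:

    import numpy as np
    from sksurv.metrics import concordance_index_censored

    # Toy validation set: model risk scores, survival times (days),
    # and censorship flags (censorship=1 -> event NOT observed).
    risk = np.array([0.9, 0.1, 0.5, 0.7])
    survival_time = np.array([200.0, 1500.0, 800.0, 400.0])
    censorship = np.array([0, 1, 1, 0])

    # The metric expects an *event* indicator, i.e. 1 - censorship.
    event = (1 - censorship).astype(bool)
    cindex = concordance_index_censored(event, survival_time, risk)[0]
    print(cindex)  # fraction of comparable pairs the model ranks correctly

    # Under heavy censorship, few pairs are comparable, so the validation
    # c-index is estimated from a small effective sample and is noisy.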

Richarizardd avatar Aug 22 '22 04:08 Richarizardd


Hello, I ran the LUAD dataset and the validation c-index is still too high. Have you solved this problem? [results screenshot]

Raymvp avatar Jun 18 '23 08:06 Raymvp

Hi, sorry to bother you, but I have a question. Does censorship=1 mean that the patient's survival time is fully observed? And in TCGA data, how is the censorship status of each patient determined?

Luxiaowen45 avatar Jun 22 '23 10:06 Luxiaowen45

Hi @Luxiaowen45 - censorship=1 means the event was not observed.
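
For TCGA, censorship is usually derived from the clinical table: a patient counts as an observed event only if vital_status is "Dead" with a recorded days_to_death; everyone else is censored at last follow-up. A sketch under that convention (hypothetical file path; column names follow the GDC clinical export and may differ in your table):

    import pandas as pd

    clinical = pd.read_csv("tcga_brca_clinical.csv")  # hypothetical path

    # Event observed only if the patient died and a death time is recorded.
    dead = clinical["vital_status"].eq("Dead") & clinical["days_to_death"].notna()

    # censorship = 1 -> event NOT observed (alive at last follow-up).
    clinical["censorship"] = (~dead).astype(int)

    # Survival time: time of death if observed, otherwise last follow-up.
    clinical["survival_days"] = clinical["days_to_death"].where(
        dead, clinical["days_to_last_follow_up"]
    )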

Regarding the high validation c-index, let me know if my above explanation makes sense:

> Regarding the validation loss function being unstable, I did observe some issues as well. Survival analysis is a very difficult problem that is prone to issues such as 1) high censorship and 2) differing survival and censorship distributions between the training and validation splits. For cancer types such as UCEC, which has lower censorship, I do observe that the validation c-index increases over training epochs, but not for BRCA, which is more difficult.

Richarizardd avatar Jun 22 '23 22:06 Richarizardd

Hi @Richarizardd, I'm using the splits that you provided in your files, but I got a much higher value on the validation set, much higher than the 0.585 reported in the paper. According to your explanation, if we are using the same split, the c-index should be the same, right? Could there be a bug in the code causing this? Currently, I don't have the capacity to debug it.

Raymvp avatar Jun 24 '23 10:06 Raymvp

You mentioned that updating torch_geometric would improve classification performance. So, could the excessively high c-index on our validation set also be caused by the torch_geometric package?

Raymvp avatar Jul 26 '23 14:07 Raymvp

I met the same problem: the val c-index was sometimes up to 0.80+. That's impossible. Could your team give some reasons, my dear Harvard professor?

ljhOfGithub avatar Aug 11 '23 01:08 ljhOfGithub

Hi @ljhOfGithub @Raymvp - after some careful debugging, could you try removing this line in the validation and summary script, to see if the results change?

https://github.com/mahmoodlab/Patch-GCN/blob/00207f03e541f1b26ef2bf6103502b30d9422700/utils/core_utils.py#L333

Before I released my code on GitHub, I ran it on a machine with less GPU memory and found an issue where very large bag sizes caused the evaluation to fail (this setup was not used for the results presented in the paper, otherwise I would have had the same issues with high c-index). I think this line may be causing the code to report high c-indices, as cases with a very large bag size would be given a "high" risk, i.e. a risk of 0.
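
To make the failure mode concrete, here is a self-contained toy sketch (hypothetical numbers, simplified loop). The risk array is preallocated with zeros, and since the model's risks are negative in this pipeline (which is why a risk of 0 reads as "high risk" above), a skipped slide keeps the default 0.0, the maximal value:

    import numpy as np

    # Hypothetical bag sizes (patches per slide) and model risks.
    # Risks in this pipeline are negative, so 0.0 is effectively
    # the highest possible risk.
    bag_sizes = [4_000, 150_000, 9_000, 120_000]
    model_risk = [-2.1, -0.4, -1.7, -0.9]

    all_risk_scores = np.zeros(len(bag_sizes))  # preallocated with zeros

    for idx, n_patches in enumerate(bag_sizes):
        if n_patches > 100_000:
            continue  # skipped slide keeps risk 0.0 = spurious "high risk"
        all_risk_scores[idx] = model_risk[idx]

    print(all_risk_scores)  # [-2.1  0.  -1.7  0. ]

So instead of being excluded from the metric, the skipped slides enter the c-index with an artificially extreme risk, which distorts the ranking.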

Richarizardd avatar Aug 11 '23 03:08 Richarizardd

> Hi @ljhOfGithub @Raymvp - after some careful debugging, could you try removing this line in the validation and summary script, to see if the results change?
> https://github.com/mahmoodlab/Patch-GCN/blob/00207f03e541f1b26ef2bf6103502b30d9422700/utils/core_utils.py#L333

It works! I used the following code:

    if isinstance(data_WSI, torch_geometric.data.Batch):
        # The original bag-size check skipped very large slides:
        # if data_WSI.x.shape[0] > 100_000:
        #     continue
        pass

And I got the following results: [results screenshot] Thank you very much!

ljhOfGithub avatar Aug 14 '23 01:08 ljhOfGithub


It seems like you have successfully run the code; however, I ran into insufficient GPU memory. I am using a single 4090 card; may I ask which graphics card you used to run it? Do you know how to run on multiple graphics cards in parallel to avoid running out of memory?

Raymvp avatar Sep 25 '23 14:09 Raymvp


I used the A800.
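
If you are on a smaller card, one generic workaround (not specific to this repo) is to run evaluation under mixed precision with gradients disabled. A minimal sketch with a dummy model standing in for Patch-GCN:

    import torch
    import torch.nn as nn

    # Dummy stand-ins: in practice, the model is Patch-GCN and the input
    # is a large bag of patch embeddings from one WSI graph.
    model = nn.Linear(1024, 4).cuda().eval()
    features = torch.randn(120_000, 1024)

    # no_grad() avoids storing activations for backprop; float16
    # autocast roughly halves activation memory in the forward pass.
    with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
        out = model(features.cuda())
    print(out.shape)  # torch.Size([120000, 4])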

ljhOfGithub avatar Sep 26 '23 02:09 ljhOfGithub