dsmil-wsi

How to deal with multi-label problem?

Open LITTLEKKKK opened this issue 4 years ago • 4 comments

Because of tumor heterogeneity, some cancers show several different tumor types within a single slide. Does this code handle the multi-label problem? How can a multi-label problem be handled with MIL?

LITTLEKKKK, Feb 01 '21 07:02

The code works with multi-class labels. The labels need to be encoded as binary vectors with one entry per class. For example, [0, 0, 1], [0, 1, 0], and [1, 0, 0] each encode one of three classes. The max-pooling branch pools the instances separately for each entry of the class vector, the attention weights are computed separately for each class, and the resulting bag representation has as many entries as there are classes. This bag representation is then projected by a 1D convolution. Please check the example for the TCGA lung cancer dataset.
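To make the per-class flow described above concrete, here is a minimal NumPy sketch. It is not the repository's code: the function name `aggregate_bag`, the weight matrices, the dot-product attention with square-root scaling, and the flattened linear map standing in for the 1D convolution are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_bag(feats, w_inst, w_q, w_out):
    """Hypothetical per-class bag aggregation.

    feats: (N, D) instance features for one slide (bag)
    w_inst: (D, C) instance classifier weights
    w_q:   (D, Q) query projection weights
    w_out: (C, C * D) final projection (stand-in for the 1D convolution)
    """
    inst_scores = feats @ w_inst             # (N, C) one score per instance and class
    max_scores = inst_scores.max(axis=0)     # (C,) max-pooling, one value per class
    critical = inst_scores.argmax(axis=0)    # (C,) top-scoring instance for each class

    queries = feats @ w_q                    # (N, Q)
    q_crit = queries[critical]               # (C, Q) query of each class's top instance
    # Attention of every instance to each class's top instance, normalized per class.
    attn = softmax(queries @ q_crit.T / np.sqrt(queries.shape[1]), axis=0)  # (N, C)

    bag = attn.T @ feats                     # (C, D) one bag embedding per class
    bag_scores = w_out @ bag.reshape(-1)     # (C,) one bag-level score per class
    return max_scores, attn, bag, bag_scores

# Toy example with random data: 5 patches, 4 features, 3 classes.
rng = np.random.default_rng(0)
N, D, C, Q = 5, 4, 3, 2
feats = rng.normal(size=(N, D))
max_scores, attn, bag, bag_scores = aggregate_bag(
    feats,
    rng.normal(size=(D, C)),
    rng.normal(size=(D, Q)),
    rng.normal(size=(C, C * D)),
)
print(bag.shape, bag_scores.shape)
```

Each column of `attn` sums to 1, so every class gets its own weighting of the patches, and both `max_scores` and `bag_scores` have one entry per class, matching the length of the label vector.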

binli123, Feb 01 '21 21:02

Thanks for your answer. I still have some questions. A slide can contain several different types of patches, and we choose the highest-rank type as the slide-level label. How does the encoding you describe ([0,0,1], [0,1,0], [1,0,0]) work in that case? I still don't understand it. Could you explain in detail? Thanks.

LITTLEKKKK, Feb 04 '21 08:02

For an example of three subtypes of cancer, the labels should be prepared as:
[1, 0, 0] -- if the slide contains subtype 1
[0, 1, 0] -- if the slide contains subtype 2
[0, 0, 1] -- if the slide contains subtype 3
[1, 1, 0] -- if the slide contains both subtype 1 and subtype 2
...
[0, 0, 0] -- healthy slide

It might still work if the slide is labeled only according to the highest-rank type. For example, if subtype 1 ranks higher than subtype 2, a slide that contains both subtype 1 and subtype 2 is also labeled [1, 0, 0] (not [1, 1, 0]).
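The two labeling schemes discussed here can be sketched in plain Python. The helper names `multi_hot` and `highest_rank_only` are hypothetical, and class indices are assumed to be 0-based, so subtype 1 maps to index 0.

```python
def multi_hot(subtypes_present, num_classes):
    """Return a binary vector with a 1 for every subtype found in the slide."""
    label = [0] * num_classes
    for s in subtypes_present:
        if not 0 <= s < num_classes:
            raise ValueError(f"class index {s} out of range")
        label[s] = 1
    return label

def highest_rank_only(subtypes_present, num_classes, rank_order):
    """Keep only the highest-rank subtype present.

    rank_order lists class indices from highest to lowest rank.
    """
    present = set(subtypes_present)
    for s in rank_order:
        if s in present:
            return multi_hot([s], num_classes)
    return [0] * num_classes  # no subtype present: healthy slide

print(multi_hot([0, 1], 3))                     # [1, 1, 0]
print(multi_hot([], 3))                         # [0, 0, 0]
print(highest_rank_only([0, 1], 3, [0, 1, 2]))  # [1, 0, 0]
```

With multi-hot targets, a slide can activate several class entries at once, whereas the highest-rank scheme always yields at most one active entry.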

binli123, Feb 04 '21 15:02

Thanks a lot. : )

LITTLEKKKK, Feb 05 '21 07:02