dsmil-wsi
TCGA Dataset Training and Testing Distributions
Hi, could you please share with me the distribution of slides used for training and testing in the TCGA dataset, along with their respective labels?
I noticed that it's mentioned here: "We randomly split the WSIs into 840 training slides and 210 testing slides (4 low-quality corrupted slides are discarded)". However, when I examined the TEST_ID.csv file from this link, I found 214 testing slides rather than 210. Could you clarify which slides were discarded, and which slides were used for training? Thank you!
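For anyone who wants to reproduce the count, here is a minimal sketch. It assumes TEST_ID.csv holds one slide ID per row in the first column and has no header row; neither is confirmed in this thread, and the function name `count_slide_ids` is hypothetical. Adjust `has_header` if the file turns out to have a header.

```python
import csv
import io


def count_slide_ids(csv_file, has_header=False):
    """Count non-empty entries in the first column of an open CSV file.

    Returns (total_rows_with_an_id, number_of_unique_ids).
    Assumption: the slide ID lives in the first column.
    """
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the header row if there is one
    ids = [row[0].strip() for row in reader if row and row[0].strip()]
    return len(ids), len(set(ids))


# Tiny in-memory sample with made-up IDs, just to show the behavior.
sample = io.StringIO("slide_a\nslide_b\nslide_b\n\nslide_c\n")
total, unique = count_slide_ids(sample)
print(total, unique)

# On the real file (path is an assumption), one would run:
# with open("TEST_ID.csv", newline="") as f:
#     total, unique = count_slide_ids(f)
#     print(total, unique, total - 210)  # difference from the stated 210
```

Comparing the total against the unique count also shows whether any slide ID is listed more than once.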
@bryanwong17, I went through this. See the results of my investigation in my README file for downloading TCGA.