Question
Hi,
May I ask for some advice? When I see that the model selected by PureCN is off (e.g., lower purity than expected, or mutations expected to be clonal are called subclonal), or when working with tricky samples such as FFPE or low-coverage ones, what are the most important parameters in the output to consider when selecting a model that fits better? GoF?
Thank you.
Here are some things I look at first: https://github.com/lima1/PureCN/issues/230#issuecomment-1079140850
With a bit of experience, a quick look at the BAF plot shows you whether the fit looks good.
Thank you for your advice and quick response. I had actually looked at this comment before, but when I have many samples, going through the BAF plots for several models per sample is quite time-consuming, especially since I am not very experienced. Could I perhaps look at the BAFs from the RDS file somehow?
It really depends on what you need. We sequence thousands of samples each year and don't go through them.
Some cancer types, such as lung, are more difficult than others because of high amounts of sub-clonal alterations, but PureCN should pick the correct solution in the vast majority of cases (>90%). When purity is low, it is biased towards low ploidy; with high noise, towards high ploidy. For mutation analysis, you need to double-check, since it is highly sensitive to wrong purity/ploidy. There is no way around it, unfortunately (and it is what everybody using these algorithms, ABSOLUTE, ASCAT, ..., does).
Mutation analysis is tricky in low-purity and/or noisy samples, so you could filter for purity > 35-40% and log-ratio standard deviation < 0.25 or so.
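As a rough sketch of that filter, assuming you have already collected purity and log-ratio standard deviation per sample into a list of records: the field names `purity` and `log_ratio_sd`, the helper `passes_qc`, and the example values are made up for illustration and are not PureCN output names. Purity is assumed to be a fraction (0.35 = 35%).

```python
# Thresholds from the suggestion above; treat them as starting points only.
MIN_PURITY = 0.35         # purity as a fraction, i.e. 35%
MAX_LOG_RATIO_SD = 0.25   # log-ratio standard deviation


def passes_qc(sample, min_purity=MIN_PURITY, max_log_ratio_sd=MAX_LOG_RATIO_SD):
    """Return True if a sample looks suitable for mutation analysis.

    `sample` is a dict with hypothetical keys "purity" and "log_ratio_sd".
    """
    return (sample["purity"] > min_purity
            and sample["log_ratio_sd"] < max_log_ratio_sd)


# Made-up example records, not real results.
samples = [
    {"id": "S1", "purity": 0.62, "log_ratio_sd": 0.18},
    {"id": "S2", "purity": 0.28, "log_ratio_sd": 0.15},  # purity too low
    {"id": "S3", "purity": 0.55, "log_ratio_sd": 0.41},  # too noisy
]

kept = [s["id"] for s in samples if passes_qc(s)]
flagged = [s["id"] for s in samples if not passes_qc(s)]

print("keep for mutation analysis:", kept)
print("review manually or exclude:", flagged)
```

Samples that fail the filter are not necessarily wrong; they are just the ones where the selected purity/ploidy is worth checking by hand before trusting the mutation calls.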
Thanks. I am working with colorectal cancer and using PureCN for mutation analysis. Very often the log-ratio standard deviation is > 0.4, especially in FFPE samples. It is true that in more than 80% of cases it picks the right solution; the problem is the remaining 10-20%. How exactly does PureCN rank the models in the end?