DiaNN
Running DiaNN sample by sample
Hello,
I am working with a large cohort of DIA samples and would like to run DiaNN sample by sample if possible. I was wondering how match between runs works when running DiaNN sample by sample. Also, how does DiaNN handle protein inference when samples are processed individually? Is it possible that the protein groups will change between samples?
Thanks
Hi Lindsey,
Yes, it's possible, but please see:
- Details in the docs on incremental processing
- How the quantms pipeline is implemented (we recently preprinted, and there's a thread on their GitHub about it)
In general, the only scenario in which I think this can be really useful is when a company needs to handle incrementally incoming data from large patient cohorts. In all other cases, it's probably easier to just process everything in one go. Also, the time-consuming library-free step can be done on a subset of the samples to create a library, and this empirical library can then be used on the whole dataset without MBR.
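The subset-library workflow can be sketched roughly as below. This is only an illustration, not something taken from the DIA-NN docs: the helper names, the `diann` executable name, and all file paths are hypothetical, and the flags (`--f`, `--fasta`, `--fasta-search`, `--predictor`, `--gen-spec-lib`, `--out-lib`, `--lib`, `--out`) should be checked against the command-line reference of the installed version. MBR is assumed to stay off simply because `--reanalyse` is not passed.

```python
import random


def pick_subset(raw_files, n, seed=0):
    """Reproducibly pick n runs to build the empirical library from."""
    files = sorted(raw_files)
    if n >= len(files):
        return files
    return sorted(random.Random(seed).sample(files, n))


def library_command(subset, fasta, out_lib, exe="diann"):
    """Stage 1: library-free search on the subset, writing a spectral library.

    The flag set is an assumption; verify it against your DIA-NN version.
    """
    cmd = [exe]
    for path in subset:
        cmd += ["--f", path]
    cmd += ["--fasta", fasta, "--fasta-search", "--predictor",
            "--gen-spec-lib", "--out-lib", out_lib]
    return cmd


def analysis_command(all_files, empirical_lib, out_report, exe="diann"):
    """Stage 2: analyse every run with the empirical library.

    --reanalyse (MBR) is deliberately not added here.
    """
    cmd = [exe]
    for path in all_files:
        cmd += ["--f", path]
    cmd += ["--lib", empirical_lib, "--out", out_report]
    return cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Which file to hand to stage 2 as the empirical library depends on what stage 1 writes in your DIA-NN version, so that path is left to the caller.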
Protein groups are only inferred by DIA-NN when it aggregates all .quant files together. Alternatively (a good option, actually), you can create them when generating an empirical library from a subset of the data. You then need to replace every column in the resulting library that contains protein sequence IDs with the contents of the Protein.Group column of the respective report.
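A minimal sketch of that column replacement, assuming both the report and the library have been read as tab-separated text into lists of dicts (for example with `csv.DictReader(f, delimiter="\t")`). `Protein.Group` and `Stripped.Sequence` are report columns; the library column names `PeptideSequence` and `ProteinId`, and both helper names, are assumptions for illustration, so check them against the header of your own library file.

```python
def protein_group_by_peptide(report_rows,
                             peptide_col="Stripped.Sequence",
                             group_col="Protein.Group"):
    """Build a peptide -> protein group map from report rows."""
    mapping = {}
    for row in report_rows:
        peptide, group = row[peptide_col], row[group_col]
        # Refuse to continue if one peptide is assigned to two groups.
        if mapping.setdefault(peptide, group) != group:
            raise ValueError(f"{peptide} maps to more than one protein group")
    return mapping


def relabel_library(lib_rows, mapping,
                    peptide_col="PeptideSequence",   # assumed library column
                    protein_cols=("ProteinId",)):    # assumed library column(s)
    """Return copies of the library rows with protein columns overwritten.

    Rows whose peptide is absent from the report are returned unchanged.
    """
    relabelled = []
    for row in lib_rows:
        new_row = dict(row)
        group = mapping.get(row[peptide_col])
        if group is not None:
            for col in protein_cols:
                if col in new_row:
                    new_row[col] = group
        relabelled.append(new_row)
    return relabelled
```

With the library relabelled this way, every sample searched against it reports the same group for a given peptide, because the grouping is fixed in the library rather than inferred per run.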
Best, Vadim
Hello @vdemichev, would that imply that the Dia-NN protein inference is not deterministic? If I run Dia-NN on two separate samples containing the same peptide sequence, will it always map the peptide to the same protein?
Thanks